Architecture
============

This document provides a comprehensive overview of SATHI's system architecture, data models, and design patterns.

.. contents:: Table of Contents
   :local:
   :depth: 3

System Overview
---------------

SATHI (Self-reported Assessment and Tracking for Health Insights) is a Django-based web application designed to collect, manage, and analyze Patient-Reported Outcome Measures (PROMs). The system follows a modular architecture with clear separation of concerns across multiple Django apps.

**Core Capabilities:**

- Multi-institutional patient management with encrypted PHI
- Flexible PROM questionnaire creation and administration
- Real-time construct score calculation with clinical significance detection
- Population-level aggregation and comparison analytics
- Multi-language support with translatable content
- Role-based access control (RBAC) with institution-level data isolation

**Architecture Principles:**

1. **Security First**: All PHI encrypted at rest, institution-based data isolation
2. **Modularity**: Separate apps for distinct functional domains
3. **Extensibility**: Pluggable AI services, customizable scoring equations
4. **Performance**: Lazy loading, database indexing, query optimization
5. **Internationalization**: Multi-language support via django-parler

Django Apps Structure
---------------------

SATHI is organized into four primary Django applications, each with specific responsibilities:

patientapp
~~~~~~~~~~

**Purpose**: Patient and clinical data management

**Responsibilities:**

- Patient demographic information (encrypted)
- Institution management and access control
- Diagnosis and treatment tracking
- Clinical timeline management

**Key Models**: ``Patient``, ``Institution``, ``Diagnosis``, ``Treatment``

promapp
~~~~~~~

**Purpose**: PROM questionnaire and scoring engine

**Responsibilities:**

- Questionnaire structure and item management
- Response scale definitions (Likert, Range)
- Construct score calculation
- Composite scoring
- Multi-language translations
- AI-augmented services (TTS, STT)

**Key Models**: ``Questionnaire``, ``Item``, ``ConstructScale``, ``QuestionnaireSubmission``

providerapp
~~~~~~~~~~~

**Purpose**: Healthcare provider management

**Responsibilities:**

- Provider account management
- Provider type classification
- Institution assignment
- Account expiry tracking

**Key Models**: ``Provider``, ``ProviderType``

chaviprom
~~~~~~~~~

**Purpose**: Core project configuration and cross-cutting concerns

**Responsibilities:**

- Django settings and configuration
- URL routing
- Authentication and security middleware
- Context processors
- Signal handlers

Technology Stack
----------------

**Backend Framework:**

- **Django 5.x**: Web framework
- **Python 3.13**: Programming language
- **PostgreSQL**: Primary database (supports encrypted fields)

**Security & Authentication:**

- **django-two-factor-auth**: 2FA implementation
- **django-ratelimit**: Rate limiting for security
- **django-recaptcha**: Bot protection
- **secured-fields**: Field-level encryption for PHI

**Frontend:**

- **HTMX**: Dynamic content loading without JavaScript frameworks
- **TailwindCSS**: Utility-first CSS framework
- **django-cotton**: Component-based templating
- **Plotly/Bokeh**: Interactive data visualization

**Internationalization:**

- **django-parler**: Multi-language model translations
- **gettext**: String translations

**Performance:**

- **Memcached**: Caching layer (production)
- **Gunicorn**: WSGI HTTP server
- **WhiteNoise**: Static file serving

**Development Tools:**

- **Sphinx**: Documentation generation
- **django-debug-toolbar**: Development debugging

Data Model Architecture
-----------------------

The data model is organized into three primary domains: **Clinical Data**, **PROM Structure**, and **PROM Responses**. The code blocks in this section are abbreviated field summaries rather than complete model definitions.

Clinical Data Domain
~~~~~~~~~~~~~~~~~~~~

Institution
^^^^^^^^^^^

**Purpose**: Multi-tenancy and data isolation

.. code-block:: python

    class Institution(models.Model):
        id = UUIDField(primary_key=True)
        name = CharField(max_length=255)
        created_date = DateTimeField(auto_now_add=True)
        modified_date = DateTimeField(auto_now=True)

**Relationships:**

- One-to-Many with ``Patient``
- One-to-Many with ``Provider``

**Security**: All patient and provider data is filtered by institution to ensure data isolation.

Patient
^^^^^^^

**Purpose**: Store patient demographic and clinical information

.. code-block:: python

    class Patient(models.Model):
        id = UUIDField(primary_key=True)
        institution = ForeignKey(Institution)
        user = OneToOneField(User)  # Django auth user
        name = EncryptedCharField(max_length=255, searchable=True)
        patient_id = EncryptedCharField(max_length=255, searchable=True)
        date_of_registration = EncryptedDateField(searchable=True)
        age = PositiveIntegerField(db_index=True)
        gender = CharField(choices=GenderChoices, db_index=True)
        preferred_language = CharField(choices=LANGUAGES)

**Encryption**: ``name``, ``patient_id``, and dates are encrypted using ``secured_fields``

**Indexes**: Composite indexes on ``(institution, gender)`` and ``(institution, age)`` for filtering

**Relationships:**

- Belongs to ``Institution``
- Linked to Django ``User`` (1:1)
- Has many ``Diagnosis`` (1:N)
- Has many ``PatientQuestionnaire`` (1:N)

Diagnosis
^^^^^^^^^

**Purpose**: Track patient diagnoses with ICD-11 coding

.. code-block:: python

    class Diagnosis(models.Model):
        id = UUIDField(primary_key=True)
        patient = ForeignKey(Patient)
        diagnosis = ForeignKey(DiagnosisList)
        date_of_diagnosis = EncryptedDateField(searchable=True)

**Relationships:**

- Belongs to ``Patient``
- References ``DiagnosisList`` (standardized diagnosis catalog)
- Has many ``Treatment`` (1:N)

Treatment
^^^^^^^^^

**Purpose**: Track treatment episodes with dates and intent

.. code-block:: python

    class Treatment(models.Model):
        id = UUIDField(primary_key=True)
        diagnosis = ForeignKey(Diagnosis)
        treatment_type = ManyToManyField(TreatmentType)
        treatment_intent = CharField(choices=TreatmentIntentChoices)
        date_of_start_of_treatment = EncryptedDateField()
        currently_ongoing_treatment = BooleanField(default=False)
        date_of_end_of_treatment = EncryptedDateField()

**Validation**: A custom ``clean()`` method ensures that:

- The start date is not in the future
- The end date is not before the start date
- Ongoing treatments have no end date
- Completed treatments have an end date

**Relationships:**

- Belongs to ``Diagnosis``
- Has many ``TreatmentType`` (M:N)

PROM Structure Domain
~~~~~~~~~~~~~~~~~~~~~

ConstructScale
^^^^^^^^^^^^^^

**Purpose**: Define latent trait measurement with scoring equation

.. code-block:: python

    class ConstructScale(models.Model):
        id = UUIDField(primary_key=True)
        name = CharField(max_length=255)  # e.g., "Physical Function"
        instrument_name = CharField()  # e.g., "EORTC QLQ-C30"
        instrument_version = CharField()
        scale_equation = CharField(max_length=1025)  # e.g., "{q1} + {q2} + {q3}"
        minimum_number_of_items = IntegerField()
        scale_better_score_direction = CharField(choices=DirectionChoices)
        scale_threshold_score = DecimalField()
        scale_minimum_clinical_important_difference = DecimalField()
        scale_normative_score_mean = DecimalField()
        scale_normative_score_standard_deviation = DecimalField()

**Equation Validation**: Uses the Lark parser to validate mathematical equations

**Clinical Parameters:**

- **Threshold Score**: Clinical significance cutoff
- **MCID**: Minimum clinically important difference
- **Normative Score**: Population reference values (mean ± SD)

**Relationships:**

- Has many ``Item`` (M:N)
- Referenced by ``CompositeConstructScaleScoring``

Item
^^^^

**Purpose**: Individual questions in a questionnaire

.. code-block:: python

    class Item(TranslatableModel):
        id = UUIDField(primary_key=True)
        construct_scale = ManyToManyField(ConstructScale)
        translations = TranslatedFields(
            name=CharField(max_length=500),  # Question text
            media=FileField(upload_to='item_media/'),  # Audio/video/image
        )
        abbreviated_item_id = CharField(unique=True)
        item_number = IntegerField()  # For equation references {q1}, {q2}
        response_type = CharField(choices=ResponseTypeChoices)
        likert_response = ForeignKey(LikertScale, null=True)
        range_response = ForeignKey(RangeScale, null=True)

        # Clinical parameters (optional, item-level)
        item_better_score_direction = CharField()
        item_threshold_score = DecimalField()
        item_minimum_clinical_important_difference = DecimalField()

**Translation**: Question text and media are translatable via django-parler

**Response Types:**

- **Text**: Free-form text input
- **Number**: Numeric input
- **Likert**: Ordered categorical scale
- **Range**: Numeric slider (e.g., 0-10)
- **Media**: Audio/video recording by patient

**Validation**: Ensures the response type matches the selected scale (Likert or Range)

LikertScale & LikertScaleResponseOption
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Purpose**: Define ordered categorical response scales

.. code-block:: python

    class LikertScale(models.Model):
        id = UUIDField(primary_key=True)
        likert_scale_name = CharField()

    class LikertScaleResponseOption(TranslatableModel):
        id = UUIDField(primary_key=True)
        likert_scale = ForeignKey(LikertScale)
        option_order = IntegerField()
        translations = TranslatedFields(
            option_text=CharField(),  # e.g., "Not at all", "A little"
            option_media=FileField(),
        )
        option_emoji = CharField()
        option_value = DecimalField()  # Numeric value for scoring

**Features:**

- Viridis color palette for visual representation
- Automatic text color selection based on background luminance
- Translatable option text and media

RangeScale
^^^^^^^^^^

**Purpose**: Define numeric slider scales

.. code-block:: python

    class RangeScale(TranslatableModel):
        id = UUIDField(primary_key=True)
        range_scale_name = CharField()
        max_value = DecimalField()
        min_value = DecimalField()
        increment = DecimalField()
        translations = TranslatedFields(
            min_value_text=CharField(),  # e.g., "No Pain"
            max_value_text=CharField(),  # e.g., "Worst Pain"
        )

**Validation**: Ensures ``min_value < max_value`` and that ``max_value - min_value`` is divisible by ``increment``

Questionnaire
^^^^^^^^^^^^^

**Purpose**: Collection of items presented to patients

.. code-block:: python

    class Questionnaire(TranslatableModel):
        id = UUIDField(primary_key=True)
        translations = TranslatedFields(
            name=CharField(),
            description=TextField(),
        )
        questionnaire_order = IntegerField()
        questionnaire_answer_interval = DurationField()
        questionnaire_redirect = ForeignKey('self', null=True)

**Relationships:**

- Has many ``QuestionnaireItem`` (M:N through table with order)
- Can redirect to another ``Questionnaire`` after completion

QuestionnaireItem
^^^^^^^^^^^^^^^^^

**Purpose**: Junction table linking questionnaires to items with ordering

.. code-block:: python

    class QuestionnaireItem(models.Model):
        questionnaire = ForeignKey(Questionnaire)
        item = ForeignKey(Item)
        item_order = IntegerField()

**Features**: Allows the same item to appear in multiple questionnaires with different ordering

PROM Response Domain
~~~~~~~~~~~~~~~~~~~~

PatientQuestionnaire
^^^^^^^^^^^^^^^^^^^^

**Purpose**: Assignment of questionnaire to patient

.. code-block:: python

    class PatientQuestionnaire(models.Model):
        id = UUIDField(primary_key=True)
        patient = ForeignKey(Patient)
        questionnaire = ForeignKey(Questionnaire)
        date_assigned = DateTimeField(auto_now_add=True)
        is_active = BooleanField(default=True)

**Relationships:**

- Links ``Patient`` to ``Questionnaire``
- Has many ``QuestionnaireSubmission`` (1:N)

QuestionnaireSubmission
^^^^^^^^^^^^^^^^^^^^^^^

**Purpose**: Single completion of a questionnaire by a patient

.. code-block:: python

    class QuestionnaireSubmission(models.Model):
        id = UUIDField(primary_key=True)
        patient_questionnaire = ForeignKey(PatientQuestionnaire)
        patient = ForeignKey(Patient, db_index=True)
        submission_date = DateTimeField(auto_now_add=True, db_index=True)
        is_complete = BooleanField(default=False)

**Indexes**: Composite index on ``(patient, submission_date)`` for timeline queries

**Relationships:**

- Belongs to ``PatientQuestionnaire``
- Has many ``QuestionnaireItemResponse`` (1:N)
- Has many ``QuestionnaireConstructScore`` (1:N)

QuestionnaireItemResponse
^^^^^^^^^^^^^^^^^^^^^^^^^

**Purpose**: Patient's answer to a single item

.. code-block:: python

    class QuestionnaireItemResponse(models.Model):
        id = UUIDField(primary_key=True)
        questionnaire_submission = ForeignKey(QuestionnaireSubmission)
        questionnaire_item = ForeignKey(QuestionnaireItem)
        response_text = TextField(null=True)
        response_number = DecimalField(null=True)
        response_likert = ForeignKey(LikertScaleResponseOption, null=True)
        response_range = DecimalField(null=True)
        response_media = FileField(null=True)
        response_date = DateTimeField(auto_now_add=True)

**Polymorphic Storage**: Each response type is stored in its own nullable field

QuestionnaireConstructScore
^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Purpose**: Calculated construct score from item responses

.. code-block:: python

    class QuestionnaireConstructScore(models.Model):
        id = UUIDField(primary_key=True)
        questionnaire_submission = ForeignKey(QuestionnaireSubmission)
        construct = ForeignKey(ConstructScale)
        score = DecimalField()
        items_answered = IntegerField()
        total_items = IntegerField()
        calculation_date = DateTimeField(auto_now_add=True)

**Calculation**: Scores are computed by applying the construct's equation to the item responses

**Indexes**: Optimized for time-series queries on ``(construct, questionnaire_submission__patient)``

Composite & Advanced Models
~~~~~~~~~~~~~~~~~~~~~~~~~~~

CompositeConstructScaleScoring
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Purpose**: Combine multiple construct scores into higher-level metrics

.. code-block:: python

    class CompositeConstructScaleScoring(models.Model):
        id = UUIDField(primary_key=True)
        composite_construct_scale_name = CharField()
        construct_scales = ManyToManyField(ConstructScale)
        scoring_type = CharField(choices=ScoringTypeChoices)
        # Average, Sum, Median, Mode, Min, Max

**Use Case**: Create summary scores like FACT TOI (Trial Outcome Index)

AIAPIConfiguration
^^^^^^^^^^^^^^^^^^

**Purpose**: Configure AI-augmented services (TTS, STT, etc.)

.. code-block:: python

    class AIAPIConfiguration(models.Model):
        ai_provider = CharField()
        ai_capability = CharField(choices=AICapabilitiesChoices)
        utility_function_path = CharField()  # Python import path
        api_url = CharField()
        api_key_environment_variable_name = CharField()

**Capabilities**: Text-to-Speech, Speech-to-Text, Image/Video Generation

**Validation**: Ensures the utility function exists and is callable, and that the API key is present in the environment

Data Flow Architecture
----------------------

Questionnaire Creation Flow
~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. **Define Construct Scale** → Set name, equation, clinical parameters
2. **Create Response Scales** → Define Likert or Range scales (if needed)
3. **Create Items** → Write questions, assign response types and scales
4. **Build Questionnaire** → Select items, set order, configure redirects
5. **Assign to Patients** → Create ``PatientQuestionnaire`` records

Patient Response Flow
~~~~~~~~~~~~~~~~~~~~~

1. **Patient Login** → 2FA authentication
2. **View Assigned Questionnaires** → Filtered by ``PatientQuestionnaire.is_active``
3. **Answer Questions** → Create ``QuestionnaireItemResponse`` records
4. **Submit Questionnaire** → Mark ``QuestionnaireSubmission.is_complete = True``
5. **Calculate Scores** → Generate ``QuestionnaireConstructScore`` records
6. **Trigger Signals** → Cache invalidation, audit logging

Score Calculation Flow
~~~~~~~~~~~~~~~~~~~~~~

1. **Fetch Item Responses** → Get all responses for submission
2. **Group by Construct** → Organize responses by construct scale
3. **Validate Minimum Items** → Check ``minimum_number_of_items`` met
4. **Parse Equation** → Use Lark parser to build AST
5. **Transform & Evaluate** → Substitute item values, compute result
6. **Store Score** → Save to ``QuestionnaireConstructScore``
7. **Check Clinical Significance** → Compare to threshold, MCID, normative values

Aggregation & Analytics Flow
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. **Filter Patients** → Apply demographic filters (gender, diagnosis, treatment, age)
2. **Fetch Historical Scores** → Bulk query for all patients' construct scores
3. **Calculate Time Intervals** → Relative to start date (registration, diagnosis, treatment)
4. **Aggregate by Interval** → Compute median/mean with IQR/CI for each timepoint
5. **Generate Plots** → Create Bokeh/Plotly visualizations with population comparison
6. **Cache Results** → Store aggregated data in Memcached (1 hour TTL)

Security Architecture
---------------------

Data Encryption
~~~~~~~~~~~~~~~

**Field-Level Encryption** (via ``secured_fields``):

- Patient name, patient_id
- All clinical dates (diagnosis, treatment, registration)
- Searchable encryption allows filtering without decryption

**Encryption Keys**: Stored in environment variables, never in code

Institution-Based Isolation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Access Control Pattern**:

.. code-block:: python

    from django.http import Http404

    # Patient and get_user_institutions are project-level names;
    # their imports are omitted here.
    def get_accessible_patient_or_404(user, patient_pk):
        # Get user's institutions
        user_institutions = get_user_institutions(user)

        # Filter patient by institution
        patient = Patient.objects.filter(
            pk=patient_pk,
            institution__in=user_institutions
        ).first()

        # A patient outside the user's institutions is reported as missing
        if not patient:
            raise Http404

        return patient

**Enforcement**: All views use institution filtering, preventing cross-institution data access

Role-Based Access Control
~~~~~~~~~~~~~~~~~~~~~~~~~

**Django Permissions**:

- ``patientapp.view_patient``
- ``patientapp.add_patient``
- ``promapp.view_questionnaire``
- ``promapp.add_constructscale``

**Permission Checks**: ``@permission_required`` decorators on all views

Authentication & 2FA
~~~~~~~~~~~~~~~~~~~~

**Multi-Factor Authentication**:

- TOTP (Time-based One-Time Password)
- Backup codes for account recovery
- Session binding to prevent token replay
- Rate limiting on login attempts

**Session Security**:

- CSRF protection
- Secure cookies (HttpOnly, Secure, SameSite)
- IP and User-Agent binding

Performance Optimization
------------------------

Database Indexing
~~~~~~~~~~~~~~~~~

**Strategic Indexes**:

- Composite indexes on frequently filtered combinations
- Foreign key indexes on all relationships
- Date indexes for timeline queries

**Example**:

.. code-block:: python

    # In Patient
    class Meta:
        indexes = [
            models.Index(fields=['institution', 'gender']),
        ]

    # In QuestionnaireSubmission
    class Meta:
        indexes = [
            models.Index(fields=['patient', 'submission_date']),
        ]

Query Optimization
~~~~~~~~~~~~~~~~~~

**Techniques**:

- ``select_related()`` for forward foreign keys
- ``prefetch_related()`` for reverse foreign keys and M2M
- Bulk operations to reduce database round-trips
- Deferred field loading with ``only()`` and ``defer()``

**Example**:

.. code-block:: python

    patients = Patient.objects.select_related(
        'institution'
    ).prefetch_related(
        'diagnosis_set__diagnosis',
        'diagnosis_set__treatment_set__treatment_type'
    )

Lazy Loading with HTMX
~~~~~~~~~~~~~~~~~~~~~~

**Plot Generation**:

- Main page loads without plots (1-2s)
- Plots loaded on-demand via HTMX ``hx-trigger="revealed"``
- 85% faster initial page load

**Benefits**:

- Progressive enhancement
- Reduced server load
- Better user experience

Caching Strategy
~~~~~~~~~~~~~~~~

**Cache Layers**:

1. **Patient-Specific Data** (5 min TTL)

   - Individual scores
   - Item responses

2. **Aggregation Data** (1 hour TTL)

   - Population statistics
   - Shared across patients with same filters

3. **Static Content** (Indefinite)

   - Questionnaire structure
   - Response scales

**Cache Invalidation**: ``post_save`` signal handlers on ``QuestionnaireSubmission``

Extensibility Points
--------------------

Custom Scoring Equations
~~~~~~~~~~~~~~~~~~~~~~~~

**Equation Parser**: Lark-based grammar supports:

- Basic arithmetic: ``+``, ``-``, ``*``, ``/``, ``^``
- Functions: ``sqrt()``, ``abs()``, ``min()``, ``max()``
- Item references: ``{q1}``, ``{q2}``, etc.
- Conditional logic (future enhancement)

**Example Equations**:

- EORTC: ``(1 - ({q1} + {q2}) / 8) * 100``
- Average: ``({q1} + {q2} + {q3}) / 3``
- Weighted: ``{q1} * 0.5 + {q2} * 0.3 + {q3} * 0.2``

Pluggable AI Services
~~~~~~~~~~~~~~~~~~~~~

**Architecture**: Function-based plugins via ``AIAPIConfiguration``

**Integration Pattern**:

1. Create utility function in ``promapp/ai_utils/``
2. Register in ``AIAPIConfiguration`` with import path
3. System dynamically loads and calls function

**Supported Services**:

- Text-to-Speech (TTS)
- Speech-to-Text (STT)
- Image/Video generation

Multi-Language Support
~~~~~~~~~~~~~~~~~~~~~~

**Translation Layers**:

1. **Model Translations** (django-parler)

   - Questionnaire names/descriptions
   - Item text and media
   - Response option text

2. **UI Translations** (gettext)

   - Interface strings
   - Help text
   - Error messages

**Language Selection**: Per-patient preference stored in ``Patient.preferred_language``

Design Patterns
---------------

Repository Pattern
~~~~~~~~~~~~~~~~~~

**Utility Functions**: Centralized data access in ``patientapp/utils.py`` and ``promapp/utils.py``

**Benefits**:

- Consistent query patterns
- Easier testing and mocking
- Performance optimization in one place

Signal-Based Architecture
~~~~~~~~~~~~~~~~~~~~~~~~~

**Django Signals**: Used for cross-cutting concerns

**Examples**:

- Cache invalidation on data changes
- Audit logging for security events
- User language preference updates

**Pattern**:

.. code-block:: python

    from django.core.cache import cache
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    # QuestionnaireSubmission is a project model; its import is omitted here.
    @receiver(post_save, sender=QuestionnaireSubmission)
    def invalidate_cache(sender, instance, **kwargs):
        # Drop the cached scores for the patient whose submission changed
        cache_key = f"patient_{instance.patient_id}_scores"
        cache.delete(cache_key)

Component-Based Templates
~~~~~~~~~~~~~~~~~~~~~~~~~

**Django Cotton**: Reusable UI components

**Benefits**:

- Consistent UI across application
- Easier maintenance
- Reduced code duplication

**Example Components**:

- ``c-action_buttons``
- ``c-card``
- ``c-dropdown``

Future Enhancements
-------------------

**Planned Features**:

1. **Adaptive Testing** (CAT)

   - IRT-based item selection
   - Reduced patient burden

2. **Advanced Analytics**

   - Machine learning predictions
   - Trend analysis
   - Risk stratification

3. **Mobile App**

   - Native iOS/Android apps
   - Offline data collection
   - Push notifications

4. **API Layer**

   - RESTful API for external integrations
   - FHIR compliance
   - Webhook support

**Technical Debt**:

- Migrate to async views for better concurrency
- Implement GraphQL for flexible data queries
- Add comprehensive integration tests
- Enhance caching with Redis Cluster

Conclusion
----------

SATHI's architecture balances security, performance, and extensibility through:

- **Modular Design**: Clear separation of concerns across Django apps
- **Robust Data Model**: Comprehensive PROM structure with clinical parameters
- **Security-First**: Encryption, isolation, and RBAC at every layer
- **Performance**: Strategic indexing, caching, and lazy loading
- **Extensibility**: Pluggable AI services, custom equations, multi-language support

This architecture supports the core mission of collecting and analyzing patient-reported outcomes while maintaining the highest standards of data security and clinical utility.
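
As a closing illustration of the score calculation flow described earlier (validate minimum items, substitute item values, evaluate the equation), the following is a minimal sketch and not SATHI's implementation. SATHI parses equations with Lark; to stay dependency-free, this sketch swaps in Python's standard ``ast`` module with a whitelist of operations. The function name ``calculate_construct_score``, the ``{item_number: value}`` response mapping, and the choice to return ``None`` whenever a referenced item is unanswered are assumptions made for illustration.

.. code-block:: python

    import ast
    import math
    import operator
    import re

    # Whitelisted operations, mirroring the operators and functions the
    # equation grammar is documented to support.
    _BINARY_OPS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
    }
    _UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
    _FUNCTIONS = {"sqrt": math.sqrt, "abs": abs, "min": min, "max": max}
    _PLACEHOLDER = re.compile(r"\{q(\d+)\}")


    def _evaluate(node, values):
        """Recursively evaluate a whitelisted subset of expression nodes."""
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return float(node.value)
        if isinstance(node, ast.Name) and node.id in values:
            return values[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            left = _evaluate(node.left, values)
            right = _evaluate(node.right, values)
            return _BINARY_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_evaluate(node.operand, values))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            args = [_evaluate(arg, values) for arg in node.args]
            return _FUNCTIONS[node.func.id](*args)
        raise ValueError(f"Unsupported expression element: {ast.dump(node)}")


    def calculate_construct_score(equation, responses, minimum_number_of_items):
        """Return the score as a float, or None if it cannot be computed.

        ``responses`` maps item numbers to numeric values (None = unanswered).
        """
        referenced = {int(n) for n in _PLACEHOLDER.findall(equation)}
        answered = {n for n in referenced if responses.get(n) is not None}
        # Assumption: a score needs the minimum item count AND every
        # item the equation references.
        if len(answered) < minimum_number_of_items or answered != referenced:
            return None
        values = {f"q{n}": float(responses[n]) for n in referenced}
        # Turn {q1} into the name q1 and the documented ^ into Python's **.
        expression = _PLACEHOLDER.sub(r"q\1", equation).replace("^", "**")
        return _evaluate(ast.parse(expression, mode="eval").body, values)

For example, ``calculate_construct_score("(1 - ({q1} + {q2}) / 8) * 100", {1: 1, 2: 3}, 2)`` evaluates the EORTC-style equation from the examples above. Evaluating a whitelisted tree rather than calling ``eval()`` keeps user-entered equations from executing arbitrary code, which is the same concern a dedicated grammar addresses.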