Data Models
Core Data Model
This ER diagram shows the relationships between core entities in the NeuraScale system.
Entity Details
Core Entities:
| Entity | Description | Primary Storage |
|---|---|---|
| User | System users (researchers, clinicians) | PostgreSQL |
| Device | Neural recording devices | PostgreSQL |
| Session | Recording sessions | PostgreSQL |
| Recording | Continuous data from a device | PostgreSQL + Bigtable |
| DataChunk | Compressed time-series segments | Bigtable |
| Sample | Individual data points | Bigtable |
| Feature | Extracted features from recordings | BigQuery |
| Classification | ML model predictions | BigQuery |
| Event | Session events and markers | PostgreSQL |
| DeviceStatus | Real-time device metrics | Redis + PostgreSQL |
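As a quick reference, the table above can be expressed as a lookup structure. This is a minimal sketch: the names `PRIMARY_STORAGE` and `entities_in` are hypothetical and not part of any NeuraScale API. Only the entity-to-backend pairs come from the table.

```python
# Entity -> primary storage backends, transcribed from the table above.
# The dictionary and helper names are illustrative assumptions.
PRIMARY_STORAGE = {
    "User": ("PostgreSQL",),
    "Device": ("PostgreSQL",),
    "Session": ("PostgreSQL",),
    "Recording": ("PostgreSQL", "Bigtable"),
    "DataChunk": ("Bigtable",),
    "Sample": ("Bigtable",),
    "Feature": ("BigQuery",),
    "Classification": ("BigQuery",),
    "Event": ("PostgreSQL",),
    "DeviceStatus": ("Redis", "PostgreSQL"),
}


def entities_in(backend: str) -> list[str]:
    """Return the entities whose primary storage includes `backend`."""
    return [name for name, stores in PRIMARY_STORAGE.items() if backend in stores]


print(entities_in("Bigtable"))
```

A lookup like this makes it easy to see which entities a backend outage or migration would affect.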
Data Volume Estimates:
- Users: ~1,000
- Devices: ~10,000
- Sessions: ~100,000/month
- Recordings: ~1M/month
- Samples: ~100 billion/month (at 250 Hz)
- Features: ~10M/month
- Classifications: ~10M/month
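The estimates above imply some useful per-unit averages. The following back-of-envelope sketch derives them, assuming "100B" means 100 billion samples and that each sample is stored as a 4-byte Float32 (as listed under Data Characteristics). The variable names are illustrative only.

```python
# Monthly estimates taken from the list above.
SESSIONS_PER_MONTH = 100_000
RECORDINGS_PER_MONTH = 1_000_000
SAMPLES_PER_MONTH = 100_000_000_000  # assumption: "100B" = 100 billion samples
SAMPLE_RATE_HZ = 250
BYTES_PER_SAMPLE = 4  # Float32

# Average number of recordings in one session.
recordings_per_session = RECORDINGS_PER_MONTH / SESSIONS_PER_MONTH

# Average number of samples in one recording.
samples_per_recording = SAMPLES_PER_MONTH / RECORDINGS_PER_MONTH

# Seconds of single-channel signal that one recording represents at 250 Hz.
channel_seconds_per_recording = samples_per_recording / SAMPLE_RATE_HZ

# Uncompressed sample volume per month, in gigabytes (10**9 bytes).
raw_gb_per_month = SAMPLES_PER_MONTH * BYTES_PER_SAMPLE / 1e9

print(recordings_per_session, samples_per_recording,
      channel_seconds_per_recording, raw_gb_per_month)
```

Under these assumptions an average recording holds about 400 channel-seconds of signal, and raw sample data grows by roughly 400 GB per month before compression.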
Clinical Data Model
This ER diagram shows the HIPAA-compliant clinical data model for patient management and medical records.
Clinical Entities
Clinical Data Entities:
| Entity | Description | Encryption |
|---|---|---|
| Patient | De-identified patient records | PII encrypted |
| Clinician | Licensed healthcare providers | Credentials encrypted |
| Consent | Patient consent records | Digitally signed |
| ClinicalSession | Medical recording sessions | Audit trail |
| ClinicalRecording | Medical-grade neural recordings | Encrypted at rest |
| ClinicalAnnotation | Clinical observations and notes | Encrypted |
| Biomarker | Extracted clinical metrics | Reference ranges |
| Report | Clinical reports and findings | Encrypted, versioned |
| ReportDistribution | Report access tracking | Audit log |
Clinical Workflows:
- Patient registration with consent
- Pre-session impedance checks
- Recording with clinical annotations
- Automated biomarker extraction
- Report generation and distribution
- Follow-up scheduling
Time-Series Data Model
This section describes the data model used to store and retrieve high-frequency neural signals.
Storage Strategy
Time-Series Storage Strategy:
| Storage Tier | Retention | Access Latency | Relative Cost | Use Case |
|---|---|---|---|---|
| Hot | 24 hours | < 1 ms | $$$ | Real-time streaming |
| Warm | 30 days | < 10 ms | $$ | Recent analysis |
| Cold | Unlimited | < 1 s | $ | Long-term archive |
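The retention windows in the table can be read as a simple age-based routing rule. The sketch below shows one way to express it. The function name `storage_tier` and the choice to treat the boundaries as inclusive are assumptions, not documented behavior.

```python
from datetime import timedelta

# Retention windows from the storage tier table.
HOT_RETENTION = timedelta(hours=24)
WARM_RETENTION = timedelta(days=30)


def storage_tier(age: timedelta) -> str:
    """Pick the tier expected to hold data of the given age.

    Boundary handling (inclusive upper bounds) is an assumption.
    """
    if age < timedelta(0):
        raise ValueError("age must be non-negative")
    if age <= HOT_RETENTION:
        return "hot"
    if age <= WARM_RETENTION:
        return "warm"
    return "cold"


print(storage_tier(timedelta(hours=1)))   # hot
print(storage_tier(timedelta(days=7)))    # warm
print(storage_tier(timedelta(days=365)))  # cold
```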
Data Characteristics:
- Sample rate: 250-1000 Hz
- Channels: 1-256
- Data type: Float32 (4 bytes)
- Raw throughput: Up to 1 MB/s per device
- Compression ratio: 3:1 to 5:1
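The throughput figure follows directly from the other characteristics: channels × sample rate × bytes per sample. The short calculation below checks the upper bound and applies the stated compression ratios. The function name is illustrative.

```python
BYTES_PER_SAMPLE = 4  # Float32


def raw_throughput_bytes_per_s(channels: int, sample_rate_hz: int) -> int:
    """Uncompressed data rate for one device, in bytes per second."""
    return channels * sample_rate_hz * BYTES_PER_SAMPLE


# Upper end of the stated ranges: 256 channels at 1000 Hz.
peak = raw_throughput_bytes_per_s(256, 1000)

# Lower end: a single channel at 250 Hz.
minimum = raw_throughput_bytes_per_s(1, 250)

# Effective rate after compression at the stated 3:1 and 5:1 ratios.
peak_at_3_to_1 = peak / 3
peak_at_5_to_1 = peak / 5

print(peak, minimum, peak_at_3_to_1, peak_at_5_to_1)
```

At the top of the range this gives 1,024,000 bytes/s, which matches the "up to 1 MB/s per device" figure. With 5:1 compression the same stream needs about 205 KB/s.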
Optimization Techniques:
- Delta encoding for sequential samples
- Chunking for parallel processing
- Column-oriented storage for channels
- Bloom filters for quick existence checks
- Pre-computed aggregations
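To illustrate the first two techniques, here is a minimal sketch of delta encoding and fixed-size chunking. It assumes samples are integer ADC counts so that deltas round-trip exactly. Float32 values would first need quantization or a float-aware scheme. The function names are hypothetical and do not reflect the actual NeuraScale implementation.

```python
from itertools import accumulate


def delta_encode(samples: list[int]) -> list[int]:
    """Store the first sample, then each sample's difference from its predecessor."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original samples with a running sum over the deltas."""
    return list(accumulate(deltas))


def chunk(samples: list[int], size: int) -> list[list[int]]:
    """Split samples into consecutive chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


signal = [1000, 1002, 1001, 1005, 1005, 1003]
encoded = delta_encode(signal)
print(encoded)                          # [1000, 2, -1, 4, 0, -2]
print(delta_decode(encoded) == signal)  # True
print(chunk(signal, 4))                 # [[1000, 1002, 1001, 1005], [1005, 1003]]
```

Neighboring neural samples tend to be close in value, so the deltas are small numbers that a general-purpose compressor can pack more tightly than the raw values. Chunks can then be encoded and processed independently.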