Data Models
Core Data Model
This ER diagram shows the relationships between core entities in the NeuraScale system.
Entity Details
Core Entities:
Entity | Description | Primary Storage |
---|---|---|
User | System users (researchers, clinicians) | PostgreSQL |
Device | Neural recording devices | PostgreSQL |
Session | Recording sessions | PostgreSQL |
Recording | Continuous data from a device | PostgreSQL + Bigtable |
DataChunk | Compressed time-series segments | Bigtable |
Sample | Individual data points | Bigtable |
Feature | Extracted features from recordings | BigQuery |
Classification | ML model predictions | BigQuery |
Event | Session events and markers | PostgreSQL |
DeviceStatus | Real-time device metrics | Redis + PostgreSQL |
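The storage assignments in the table above can also be written as a lookup, which is handy when checking which entities a given backend has to serve. This is a minimal sketch that only restates the table; the names `PRIMARY_STORAGE` and `entities_in` are hypothetical and not part of NeuraScale.

```python
# Entity -> primary storage backends, copied from the table above.
# PRIMARY_STORAGE and entities_in are illustrative names, not a NeuraScale API.
PRIMARY_STORAGE: dict[str, tuple[str, ...]] = {
    "User": ("PostgreSQL",),
    "Device": ("PostgreSQL",),
    "Session": ("PostgreSQL",),
    "Recording": ("PostgreSQL", "Bigtable"),
    "DataChunk": ("Bigtable",),
    "Sample": ("Bigtable",),
    "Feature": ("BigQuery",),
    "Classification": ("BigQuery",),
    "Event": ("PostgreSQL",),
    "DeviceStatus": ("Redis", "PostgreSQL"),
}


def entities_in(backend: str) -> list[str]:
    """Return the entities stored in `backend`, in table order."""
    return [name for name, backends in PRIMARY_STORAGE.items() if backend in backends]


if __name__ == "__main__":
    for backend in ("PostgreSQL", "Bigtable", "BigQuery", "Redis"):
        print(backend, "->", ", ".join(entities_in(backend)))
```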
Data Volume Estimates:
- Users: ~1,000
- Devices: ~10,000
- Sessions: ~100,000/month
- Recordings: ~1M/month
- Samples: ~100B/month (at 250 Hz)
- Features: ~10M/month
- Classifications: ~10M/month
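As a rough sizing check, the sample estimate can be turned into a monthly storage figure. The sketch below assumes each counted sample is a single Float32 value (4 bytes) and applies the 3:1 to 5:1 compression range given in the time-series section. It ignores indexes, metadata, and replication, so treat the result as an order-of-magnitude figure only.

```python
SAMPLES_PER_MONTH = 100_000_000_000  # ~100B/month, from the estimates above
BYTES_PER_SAMPLE = 4                 # assumption: one Float32 value per sample


def monthly_storage_gb(compression_ratio: float) -> float:
    """Approximate sample storage per month in GB (10^9 bytes)."""
    raw_bytes = SAMPLES_PER_MONTH * BYTES_PER_SAMPLE
    return raw_bytes / compression_ratio / 1e9


if __name__ == "__main__":
    print(f"raw:      {monthly_storage_gb(1):.0f} GB")
    print(f"3:1 comp: {monthly_storage_gb(3):.0f} GB")
    print(f"5:1 comp: {monthly_storage_gb(5):.0f} GB")
```

Under these assumptions, raw sample data comes to about 400 GB per month, or roughly 80 to 133 GB after compression.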
Clinical Data Model
This ER diagram shows the HIPAA-compliant clinical data model for patient management and medical records.
Clinical Entities
Clinical Data Entities:
Entity | Description | Encryption |
---|---|---|
Patient | De-identified patient records | PII encrypted |
Clinician | Licensed healthcare providers | Credentials encrypted |
Consent | Patient consent records | Digitally signed |
ClinicalSession | Medical recording sessions | Audit trail |
ClinicalRecording | Medical-grade neural recordings | Encrypted at rest |
ClinicalAnnotation | Clinical observations and notes | Encrypted |
Biomarker | Extracted clinical metrics | Reference ranges |
Report | Clinical reports and findings | Encrypted, versioned |
ReportDistribution | Report access tracking | Audit log |
Clinical Workflows:
- Patient registration with consent
- Pre-session impedance checks
- Recording with clinical annotations
- Automated biomarker extraction
- Report generation and distribution
- Follow-up scheduling
Time-Series Data Model
This section details the optimized data model for high-frequency neural signal storage and retrieval.
Storage Strategy
Time-Series Storage Strategy:
Storage Tier | Retention | Access Speed | Cost | Use Case |
---|---|---|---|---|
Hot | 24 hours | < 1 ms | $$$ | Real-time streaming |
Warm | 30 days | < 10 ms | $$ | Recent analysis |
Cold | Unlimited | < 1 s | $ | Long-term archive |
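The retention windows in the table can be read as an age-based placement rule. The sketch below assumes data moves between tiers purely by age, with inclusive boundaries at 24 hours and 30 days. The function name `storage_tier` is hypothetical, and the actual migration policy may use other signals.

```python
from datetime import timedelta

HOT_RETENTION = timedelta(hours=24)   # from the table: Hot tier
WARM_RETENTION = timedelta(days=30)   # from the table: Warm tier


def storage_tier(age: timedelta) -> str:
    """Pick a tier for data of the given age (assumed age-only policy)."""
    if age < timedelta(0):
        raise ValueError("age must be non-negative")
    if age <= HOT_RETENTION:
        return "hot"
    if age <= WARM_RETENTION:
        return "warm"
    return "cold"  # unlimited retention


if __name__ == "__main__":
    for age in (timedelta(minutes=5), timedelta(days=3), timedelta(days=90)):
        print(age, "->", storage_tier(age))
```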
Data Characteristics:
- Sample rate: 250-1000 Hz
- Channels: 1-256
- Data type: Float32 (4 bytes)
- Raw throughput: Up to 1 MB/s per device
- Compression ratio: 3:1 to 5:1
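The throughput figure follows from the other characteristics: bytes per second equals sample rate times channel count times bytes per sample. The sketch below works through the two extremes of the stated ranges. The function names are illustrative only.

```python
BYTES_PER_SAMPLE = 4  # Float32, as listed above


def raw_throughput_bytes_per_s(sample_rate_hz: int, channels: int) -> int:
    """Uncompressed bytes per second produced by one device."""
    return sample_rate_hz * channels * BYTES_PER_SAMPLE


def compressed_throughput_bytes_per_s(
    sample_rate_hz: int, channels: int, compression_ratio: float
) -> float:
    """Bytes per second after applying a compression ratio such as 3.0 or 5.0."""
    return raw_throughput_bytes_per_s(sample_rate_hz, channels) / compression_ratio


if __name__ == "__main__":
    peak = raw_throughput_bytes_per_s(1000, 256)  # highest rate, most channels
    floor = raw_throughput_bytes_per_s(250, 1)    # lowest rate, one channel
    print(f"peak raw:  {peak} B/s")
    print(f"floor raw: {floor} B/s")
    print(f"peak at 5:1: {compressed_throughput_bytes_per_s(1000, 256, 5.0)} B/s")
```

At 1000 Hz with 256 channels this gives 1,024,000 bytes per second, which matches the "up to 1 MB/s per device" figure.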
Optimization Techniques:
- Delta encoding for sequential samples
- Chunking for parallel processing
- Column-oriented storage for channels
- Bloom filters for quick existence checks
- Pre-computed aggregations
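To illustrate the first two techniques, here is a minimal sketch of delta encoding applied per chunk. It assumes samples have already been quantized to integers (for example, raw ADC counts), since differences of Float32 values do not round-trip exactly. The function names are hypothetical and do not describe the NeuraScale encoder.

```python
from itertools import accumulate


def delta_encode(samples: list[int]) -> list[int]:
    """Store the first value, then each sample's difference from its predecessor."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Invert delta_encode with a running sum."""
    return list(accumulate(deltas))


def chunked(samples: list[int], size: int) -> list[list[int]]:
    """Split samples into fixed-size chunks; the last chunk may be shorter."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


if __name__ == "__main__":
    signal = [1000, 1002, 1001, 1005, 1004, 1006]
    # Encoding each chunk separately keeps chunks independently decodable.
    encoded = [delta_encode(c) for c in chunked(signal, 4)]
    print(encoded)
    restored = [v for c in encoded for v in delta_decode(c)]
    assert restored == signal
```

Because neighboring neural samples tend to be close in value, the deltas are small numbers that a general-purpose compressor can pack more tightly than the raw values.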