From c90216d37f0a701a2f516c20abf72e68c18a4cc3 Mon Sep 17 00:00:00 2001 From: Jade Ellis Date: Thu, 19 Jun 2025 23:46:49 +0100 Subject: [PATCH] docs: Generated database docs --- docs/DATABASE_MERMAID_DIAGRAMS.md | 350 +++++++++++++++++++++ docs/DATABASE_RELATIONSHIPS.md | 354 ++++++++++++++++++++++ docs/DATABASE_SCHEMA.md | 485 ++++++++++++++++++++++++++++++ 3 files changed, 1189 insertions(+) create mode 100644 docs/DATABASE_MERMAID_DIAGRAMS.md create mode 100644 docs/DATABASE_RELATIONSHIPS.md create mode 100644 docs/DATABASE_SCHEMA.md diff --git a/docs/DATABASE_MERMAID_DIAGRAMS.md b/docs/DATABASE_MERMAID_DIAGRAMS.md new file mode 100644 index 00000000..53969429 --- /dev/null +++ b/docs/DATABASE_MERMAID_DIAGRAMS.md @@ -0,0 +1,350 @@ +# Continuwuity Database Mermaid Diagrams + +This document contains visual representations of the Continuwuity database schema using Mermaid diagrams. +
+## 1. Core Event Storage Architecture + +```mermaid +graph TD + A[Matrix Event ID<br/>48 bytes] --> B[eventid_shorteventid] + B --> C[Short Event ID<br/>8 bytes] + C --> D[shorteventid_eventid] + D --> A + + A --> E[eventid_pduid] + E --> F[PDU ID<br/>16 bytes] + F --> G[pduid_pdu<br/>Main Event Storage] + + A --> H[eventid_outlierpdu<br/>Outlier Events] + + C --> I[shorteventid_authchain<br/>Authorization Chains] + C --> J[shorteventid_shortstatehash<br/>Event → State Mapping] + + G -.->|Shared Cache| H + + style G fill:#e1f5fe + style H fill:#e1f5fe + style A fill:#fff3e0 + style C fill:#f3e5f5 +``` +
+## 2. Room State Management System + +```mermaid +graph TD + A[Room State Key] --> B[statekey_shortstatekey] + B --> C[Short State Key<br/>8 bytes] + C --> D[shortstatekey_statekey] + D --> A + + E[Full State Hash] --> F[statehash_shortstatehash] + F --> G[Short State Hash<br/>8 bytes] + + G --> H[shortstatehash_statediff<br/>State Differences] + G --> I[roomid_shortstatehash<br/>Current Room State] + G --> J[roomsynctoken_shortstatehash<br/>Sync Token Mapping] + + K[Room ID] --> I + L[Sync Token] --> J + + style G fill:#e8f5e8 + style I fill:#fff3e0 + style J fill:#f0f4ff +``` +
+## 3. User Authentication and Identity Flow + +```mermaid +graph TD + A[User ID] --> B[userid_password<br/>Password Hashes] + + A --> C[userid_displayname] + A --> D[userid_avatarurl] + D --> E[userid_blurhash] + A --> F[useridprofilekey_value<br/>Custom Profile] + + G[Access Token] --> H[token_userdeviceid] + H --> I[User + Device ID] + I --> J[userdeviceid_token] + J --> G + + I --> K[userdeviceid_metadata<br/>Device Info] + I --> L[userdevicesessionid_uiaainfo<br/>Auth Sessions] + I --> M[userdevicetxnid_response<br/>Transaction Cache] + + N[OpenID Token] --> O[openidtoken_expiresatuserid] + P[Login Token] --> Q[logintoken_expiresatuserid] + + style H fill:#e1f5fe + style J fill:#e1f5fe + style B fill:#ffebee +``` +
+## 4. Room Membership Bidirectional System + +```mermaid +graph TD + A[Room ID + User ID] --> B[roomuserid_joined<br/>Room → User View] + C[User ID + Room ID] --> D[userroomid_joined<br/>User → Room View] + + B -.->|Bidirectional| D + + A --> E[roomuserid_invitecount] + C --> F[userroomid_invitestate] + E -.->|Related| F + + A --> G[roomuserid_leftcount] + C --> H[userroomid_leftstate] + G -.->|Related| H + + A --> I[roomuserid_knockedcount] + C --> J[userroomid_knockedstate] + I -.->|Related| J + + K[Room ID] --> L[roomid_joinedcount<br/>Total Joined] + K --> M[roomid_invitedcount<br/>Total Invited] + + N[Historical] --> O[roomuseroncejoinedids<br/>Ever Joined Tracking] + + style B fill:#e8f5e8 + style D fill:#e8f5e8 + style L fill:#fff3e0 + style M fill:#fff3e0 +``` +
+## 5. Cryptography and Key Management Chain + +```mermaid +graph TD + A[User ID] --> B[userid_devicelistversion<br/>Device List Changes] + + A --> C[userid_masterkeyid<br/>Master Signing Key] + A --> D[userid_selfsigningkeyid<br/>Self Signing Key] + A --> E[userid_usersigningkeyid<br/>User Signing Key] + + F[Key ID] --> G[keyid_key<br/>Actual Keys] + + C --> G + D --> G + E --> G + + H[Key Change ID] --> I[keychangeid_userid<br/>Change Notifications] + + J[One-Time Key ID] --> K[onetimekeyid_onetimekeys<br/>OTK Storage] + A --> L[userid_lastonetimekeyupdate<br/>Last OTK Update] + + M[Backup ID] --> N[backupid_algorithm<br/>Backup Algorithm] + M --> O[backupid_etag<br/>Backup Versioning] + P[Backup Key ID] --> Q[backupkeyid_backup<br/>Backed Up Keys] + + style G fill:#e1f5fe + style I fill:#fff3e0 + style K fill:#f3e5f5 + style Q fill:#e8f5e8 +``` +
+## 6. Federation and Server Communication + +```mermaid +graph TD + A[Server Name] --> B[servername_destination<br/>Cached Destinations] + A --> C[servername_override<br/>Cached Overrides] + A --> D[server_signingkeys<br/>Federation Keys] + A --> E[servername_educount<br/>EDU Counters] + + F[Server + Event] --> G[servernameevent_data<br/>Server Events] + H[Server Current] --> I[servercurrentevent_data<br/>Current State] + + J[Room ID] --> K[roomserverids<br/>Room → Servers] + L[Server Name] --> M[serverroomids<br/>Server → Rooms] + + K -.->|Bidirectional| M + + N[Room ID] --> O[roomid_inviteviaservers<br/>Invitation Routing] + + style B fill:#e1f5fe + style C fill:#e1f5fe + style K fill:#e8f5e8 + style M fill:#e8f5e8 +``` +
+## 7. Push Notifications and Read Tracking + +```mermaid +graph TD + A[Sender Key] --> B[senderkey_pusher<br/>Push Endpoints] + C[Push Key] --> D[pushkey_deviceid<br/>Device Mapping] + + B -.->|Related| D + + E[Read Receipt ID] --> F[readreceiptid_readreceipt<br/>Public Receipts] + + G[Room + User] --> H[roomuserid_privateread<br/>Private Read Markers] + G --> I[roomuserid_lastprivatereadupdate<br/>Update Timestamps] + + J[User + Room] --> K[userroomid_highlightcount<br/>Mention Count] + J --> L[userroomid_notificationcount<br/>Notification Count] + + style F fill:#e8f5e8 + style H fill:#f3e5f5 + style K fill:#fff3e0 + style L fill:#fff3e0 +``` +
+## 8. Media and Content Management + +```mermaid +graph TD + A[Media ID] --> B[mediaid_file<br/>File Metadata] + A --> C[mediaid_user<br/>Uploader Tracking] + + B -.->|Related| C + + D[URL] --> E[url_previews<br/>Preview Cache] + + F[User ID] --> G[userfilterid_filter<br/>Sync Filters] + H[Lazy Load] --> I[lazyloadedids<br/>Member Event Tracking] + + style B fill:#e1f5fe + style C fill:#e1f5fe + style E fill:#f0f4ff +``` +
+## 9. Account Data and Presence System + +```mermaid +graph TD + A[Room + User + Type] --> B[roomusertype_roomuserdataid<br/>Account Data Index] + B --> C[Room User Data ID] + C --> D[roomuserdataid_accountdata<br/>Actual Account Data] + + E[User ID] --> F[userid_presenceid<br/>Presence Mapping] + F --> G[Presence ID] + G --> H[presenceid_presence<br/>Presence Data] + + I[To-Device ID] --> J[todeviceid_events<br/>Device Messages] + + style D fill:#e8f5e8 + style H fill:#f3e5f5 + style J fill:#fff3e0 +``` +
+## 10. Global Configuration and Access Control + +```mermaid +graph TD + A[Global Config] --> B[global<br/>Server Settings] + + C[Room Categories] --> D[publicroomids<br/>Public Rooms] + C --> E[bannedroomids<br/>Banned Rooms] + C --> F[disabledroomids<br/>Disabled Rooms] + + G[App Service ID] --> H[id_appserviceregistrations<br/>Application Services] + + I[Token Management] --> J[tokenids<br/>Token Allocation] + + K[Relations] --> L[tofrom_relation<br/>Event Relations] + K --> M[threadid_userids<br/>Thread Participants] + K --> N[referencedevents<br/>Referenced Events] + K --> O[softfailedeventids<br/>Failed Events] + + style B fill:#e1f5fe + style D fill:#e8f5e8 + style E fill:#ffebee + style F fill:#ffebee +``` +
+## 11. Complete System Overview + +```mermaid +graph TB + subgraph "Identity Management" + UI[User Identity] + UA[User Auth] + UD[User Devices] + UP[User Profile] + end + + subgraph "Event Storage" + ES[Event Storage] + EID[Event ID Mapping] + EO[Outlier Events] + end + + subgraph "Room Management" + RS[Room State] + RM[Room Membership] + RMeta[Room Metadata] + end + + subgraph "Cryptography" + DK[Device Keys] + CS[Cross Signing] + KB[Key Backups] + end + + subgraph "Federation" + FS[Federation Servers] + FK[Federation Keys] + FE[Federation Events] + end + + subgraph "Communication" + PUSH[Push Notifications] + RT[Read Tracking] + DM[Device Messages] + end + + subgraph "Content" + MC[Media Content] + UP2[URL Previews] + AD[Account Data] + end + + UI --> UA + UA --> UD + UI --> UP + + ES --> EID + ES --> EO + EID --> RS + + RS --> RM + RM --> RMeta + + UD --> DK + DK --> CS + CS --> KB + + RM --> FS + FS --> FK + FK --> FE + + UD --> PUSH + RM --> RT + UD --> DM + + UI --> MC + MC --> UP2 + UI --> AD + + style UI fill:#e8f5e8 + style ES fill:#e1f5fe + style RS fill:#f3e5f5 + style DK fill:#fff3e0 + style FS fill:#f0f4ff +``` +
+## Diagram Legend + +- **Blue boxes** (`#e1f5fe`): Core storage tables +- **Green boxes** (`#e8f5e8`): Membership and relationship tables +- **Purple boxes** (`#f3e5f5`): ID mapping and compression tables +- **Orange boxes** (`#fff3e0`): Count and metadata tables +- **Light blue boxes** (`#f0f4ff`): Sync and federation tables +- **Red boxes** (`#ffebee`): Access control and security tables +- **Solid arrows**: Direct relationships +- **Dotted arrows**: Bidirectional or related tables +- **Shared Cache notation**: Tables that share memory pools + +These diagrams show how Continuwuity's 89 database tables interconnect to provide a complete Matrix homeserver implementation with optimized storage patterns and efficient relationship
management. diff --git a/docs/DATABASE_RELATIONSHIPS.md b/docs/DATABASE_RELATIONSHIPS.md new file mode 100644 index 00000000..ab0ed1b5 --- /dev/null +++ b/docs/DATABASE_RELATIONSHIPS.md @@ -0,0 +1,354 @@ +# Continuwuity Database Column Relationships + +This document analyzes how the 89 database columns in Continuwuity relate to each other, showing the data flow and dependencies between tables. + +## Core Identity Mapping System + +### Event ID Management + +The system uses a sophisticated event ID mapping system to optimize storage: + +``` +eventid_shorteventid ←→ shorteventid_eventid + ↓ ↓ + eventid_pduid shorteventid_authchain + ↓ ↓ + pduid_pdu shorteventid_shortstatehash + ↓ +eventid_outlierpdu +``` + +**Relationships:** + +- `eventid_shorteventid` + `shorteventid_eventid`: Bidirectional mapping between full Matrix event IDs (48 bytes) and compact short IDs (8 bytes) +- `eventid_pduid`: Maps event IDs to PDU IDs (16-byte internal identifiers) +- `pduid_pdu`: Main event storage using PDU IDs as keys +- `eventid_outlierpdu`: Stores events not yet part of the timeline (outliers) +- `shorteventid_authchain`: Authorization chains using short event IDs +- `shorteventid_shortstatehash`: Links events to room state using short IDs + +### Room ID Management + +Similar optimization for room identifiers: + +``` +roomid_shortroomid + ↓ + (used in keys for room-related tables) +``` + +### State Management + +Complex state tracking with compression: + +``` +statekey_shortstatekey ←→ shortstatekey_statekey + ↓ +statehash_shortstatehash + ↓ +shortstatehash_statediff + ↓ +roomid_shortstatehash +``` + +**Relationships:** + +- `statekey_shortstatekey` + `shortstatekey_statekey`: Bidirectional mapping for state keys +- `statehash_shortstatehash`: Maps full state hashes to 8-byte compressed versions +- `shortstatehash_statediff`: Stores state differences between versions +- `roomid_shortstatehash`: Current state hash for each room + +## User and Authentication Flow + +### User 
Authentication Chain + +``` +userid_password → token_userdeviceid ←→ userdeviceid_token + ↓ + userdeviceid_metadata + ↓ + userdevicesessionid_uiaainfo +``` + +**Relationships:** + +- `userid_password`: Stores user password hashes +- `token_userdeviceid` + `userdeviceid_token`: Bidirectional mapping between access tokens and devices +- `userdeviceid_metadata`: Device information (name, type, etc.) +- `userdevicesessionid_uiaainfo`: User-Interactive Authentication session data + +### User Profile Data + +``` +userid_displayname +userid_avatarurl → userid_blurhash +useridprofilekey_value +``` + +**Relationships:** + +- Profile data is stored separately per attribute +- `userid_blurhash` complements `userid_avatarurl` for progressive loading + +### Token Management + +``` +openidtoken_expiresatuserid +logintoken_expiresatuserid +tokenids +``` + +**Relationships:** + +- Separate token types have separate expiration tracking +- `tokenids` manages token ID allocation + +## Room Membership System + +### Membership State Tracking + +``` +roomuserid_joined ←→ userroomid_joined +roomuserid_invitecount ←→ userroomid_invitestate +roomuserid_leftcount ←→ userroomid_leftstate +roomuserid_knockedcount ←→ userroomid_knockedstate +``` + +**Relationships:** + +- Bidirectional indexes: room→user and user→room perspectives +- Count tables track membership transitions +- State tables store membership event data + +### Room Counts and Metadata + +``` +roomid_joinedcount ← roomuserid_joined +roomid_invitedcount ← roomuserid_invitecount +roomuseroncejoinedids (historical tracking) +``` + +**Relationships:** + +- Count tables are derived from individual membership records +- Historical tracking for users who ever joined + +### Federation Integration + +``` +roomserverids ←→ serverroomids +roomid_inviteviaservers +``` + +**Relationships:** + +- Bidirectional tracking of which servers participate in which rooms +- Via servers for invitation routing + +## Cryptography and Security + +### 
Device Key Management + +``` +userid_devicelistversion + ↓ +keyid_key ← userid_masterkeyid + ↓ userid_selfsigningkeyid +keychangeid_userid ← userid_usersigningkeyid + ↓ +onetimekeyid_onetimekeys + ↓ +userid_lastonetimekeyupdate +``` + +**Relationships:** + +- Device list versions track changes requiring key updates +- Different key types stored separately with references from user records +- Key changes trigger notifications +- One-time keys managed with update timestamps + +### Key Backup System + +``` +backupid_algorithm +backupid_etag → backupkeyid_backup +``` + +**Relationships:** + +- Backup metadata (algorithm, versioning) linked to actual backed-up keys + +## Push Notifications and Read Tracking + +### Push Infrastructure + +``` +senderkey_pusher ←→ pushkey_deviceid +``` + +**Relationships:** + +- Bidirectional mapping between push keys and devices + +### Read Receipt System + +``` +readreceiptid_readreceipt +roomuserid_privateread +roomuserid_lastprivatereadupdate +userroomid_highlightcount +userroomid_notificationcount +``` + +**Relationships:** + +- Public read receipts vs private read markers +- Highlight/notification counts per user-room pair +- Update tracking for private reads + +## Media and Content + +### Media Storage + +``` +mediaid_file ←→ mediaid_user +url_previews +``` + +**Relationships:** + +- File metadata linked to uploader tracking +- URL previews cached separately + +## Sync and Timeline + +### Sync Token Management + +``` +roomsynctoken_shortstatehash +lazyloadedids +``` + +**Relationships:** + +- Sync tokens map to room state for efficient delta computation +- Lazy loading tracking for member events + +### Event Relations + +``` +tofrom_relation +threadid_userids +referencedevents +softfailedeventids +``` + +**Relationships:** + +- Event relations track replies, edits, reactions +- Thread participant tracking +- Referenced events and soft failures + +## Federation and Server Management + +### Server Discovery and Communication + +``` 
+servername_destination (cached) +servername_override (cached) +server_signingkeys +servername_educount +servercurrentevent_data +servernameevent_data +``` + +**Relationships:** + +- Destination resolution with caching +- Server signing keys for federation +- EDU (Ephemeral Data Unit) counting +- Current and historical server events + +## Account Data and Presence + +### Account Data Storage + +``` +roomusertype_roomuserdataid → roomuserdataid_accountdata +userid_presenceid → presenceid_presence +``` + +**Relationships:** + +- Account data indexed by room+user+type, pointing to actual data +- Presence data separated from user records with ID mapping + +## Global Configuration + +### Application Services + +``` +id_appserviceregistrations +``` + +### Global Settings + +``` +global +publicroomids +bannedroomids +disabledroomids +``` + +**Relationships:** + +- Global server configuration +- Room access control lists + +## Performance Optimizations + +### Shared Cache Relationships + +- `eventid_outlierpdu` and `pduid_pdu` share cache because they both store PDU data +- Related tables are grouped for memory efficiency + +### Transaction Management + +``` +userdevicetxnid_response +todeviceid_events +``` + +**Relationships:** + +- Transaction ID response caching +- To-device message queuing + +## Data Flow Examples + +### Sending a Message + +1. `pduid_pdu` ← stores the PDU +2. `eventid_pduid` ← maps event ID to PDU ID +3. `eventid_shorteventid` ← creates short ID mapping +4. `shorteventid_shortstatehash` ← links to room state +5. `userroomid_notificationcount` ← updates notification counts +6. `readreceiptid_readreceipt` ← processes read receipts + +### User Login + +1. `userid_password` ← validates credentials +2. `userdeviceid_token` ← creates device token +3. `token_userdeviceid` ← creates reverse mapping +4. `userdeviceid_metadata` ← stores device info + +### Room Join + +1. `roomuserid_joined` ← records membership +2. 
`userroomid_joined` ← creates reverse index +3. `roomid_joinedcount` ← updates room count +4. `roomuseroncejoinedids` ← historical tracking +5. `roomserverids` ← federation tracking + +This relational structure allows Continuwuity to efficiently handle Matrix protocol operations while maintaining data consistency and enabling fast lookups from multiple perspectives. diff --git a/docs/DATABASE_SCHEMA.md b/docs/DATABASE_SCHEMA.md new file mode 100644 index 00000000..f7391c8f --- /dev/null +++ b/docs/DATABASE_SCHEMA.md @@ -0,0 +1,485 @@ +# Continuwuity Database Schema Documentation + +Continuwuity is a Matrix protocol implementation using RocksDB as its storage backend. The database is organized into column families (called "Maps" in the codebase), each serving specific purposes in the Matrix homeserver functionality. + +| Table Name | Access Pattern | Key Size | Value Size | Description | +|------------|---------------|----------|------------|-------------| +| `alias_roomid` | RANDOM_SMALL | - | - | Maps room alias to room ID | +| `alias_userid` | RANDOM_SMALL | - | - | Maps room alias to user ID | +| `aliasid_alias` | RANDOM_SMALL | - | - | Maps alias ID to alias string | +| `backupid_algorithm` | RANDOM_SMALL | - | - | Key backup algorithms | +| `backupid_etag` | RANDOM_SMALL | - | - | Key backup ETags | +| `backupkeyid_backup` | RANDOM_SMALL | - | - | Backed up keys | +| `bannedroomids` | RANDOM_SMALL | - | - | Set of banned room IDs | +| `disabledroomids` | RANDOM_SMALL | - | - | Set of disabled room IDs | +| `eventid_outlierpdu` | RANDOM | 48 bytes | 1488 bytes | Outlier PDUs (shared cache with pduid_pdu) | +| `eventid_pduid` | RANDOM | 48 bytes | 16 bytes | Event ID to PDU ID mapping | +| `eventid_shorteventid` | RANDOM | 48 bytes | 8 bytes | Event ID to short event ID | +| `global` | RANDOM_SMALL | - | - | Global server configuration | +| `id_appserviceregistrations` | RANDOM_SMALL | - | - | Application service registrations | +| `keychangeid_userid` | RANDOM 
| - | - | Key change notifications | +| `keyid_key` | RANDOM_SMALL | - | - | Cryptographic keys | +| `lazyloadedids` | RANDOM_SMALL | - | - | Lazy-loaded member events | +| `mediaid_file` | RANDOM_SMALL | - | - | Media file metadata | +| `mediaid_user` | RANDOM_SMALL | - | - | Media uploader tracking | +| `onetimekeyid_onetimekeys` | RANDOM_SMALL | - | - | One-time keys | +| `pduid_pdu` | SEQUENTIAL | 16 bytes | 1520 bytes | Main PDU storage (shared cache with eventid_outlierpdu) | +| `publicroomids` | RANDOM_SMALL | - | - | Public room IDs | +| `pushkey_deviceid` | RANDOM_SMALL | - | - | Push key to device mapping | +| `presenceid_presence` | SEQUENTIAL_SMALL | - | - | User presence data | +| `readreceiptid_readreceipt` | RANDOM | - | - | Read receipts | +| `referencedevents` | RANDOM | - | - | Referenced events | +| `roomid_invitedcount` | RANDOM_SMALL | - | - | Room invited user count | +| `roomid_inviteviaservers` | RANDOM_SMALL | - | - | Room invite via servers | +| `roomid_joinedcount` | RANDOM_SMALL | - | - | Room joined user count | +| `roomid_pduleaves` | RANDOM_SMALL | - | - | PDU leaves per room | +| `roomid_shortroomid` | RANDOM_SMALL | - | 8 bytes | Room ID to short room ID | +| `roomid_shortstatehash` | RANDOM_SMALL | - | 8 bytes | Room ID to state hash | +| `roomserverids` | RANDOM_SMALL | - | - | Server IDs per room | +| `roomsynctoken_shortstatehash` | SEQUENTIAL | - | 8 bytes | Sync token to state hash (special compression) | +| `roomuserdataid_accountdata` | RANDOM_SMALL | - | - | Room account data | +| `roomuserid_invitecount` | RANDOM_SMALL | - | 8 bytes | Room-user invite count | +| `roomuserid_joined` | RANDOM_SMALL | - | - | Room-user joined status | +| `roomuserid_lastprivatereadupdate` | RANDOM_SMALL | - | - | Last private read update | +| `roomuserid_leftcount` | RANDOM | - | 8 bytes | Room-user leave count | +| `roomuserid_knockedcount` | RANDOM_SMALL | - | 8 bytes | Room-user knock count | +| `roomuserid_privateread` | RANDOM_SMALL | - 
| - | Private read markers | +| `roomuseroncejoinedids` | RANDOM | - | - | Users who ever joined | +| `roomusertype_roomuserdataid` | RANDOM_SMALL | - | - | Account data type mapping | +| `senderkey_pusher` | RANDOM_SMALL | - | - | Push notification senders | +| `server_signingkeys` | RANDOM | - | - | Server signing keys | +| `servercurrentevent_data` | RANDOM_SMALL | - | - | Current server events | +| `servername_destination` | RANDOM_SMALL_CACHE | - | - | Server destinations (cached) | +| `servername_educount` | RANDOM_SMALL | - | - | EDU counters | +| `servername_override` | RANDOM_SMALL_CACHE | - | - | Server name overrides (cached) | +| `servernameevent_data` | RANDOM | - | 128 bytes | Server event data | +| `serverroomids` | RANDOM_SMALL | - | - | Rooms per server | +| `shorteventid_authchain` | SEQUENTIAL | 8 bytes | - | Event authorization chains | +| `shorteventid_eventid` | SEQUENTIAL_SMALL | 8 bytes | 48 bytes | Short event ID to event ID | +| `shorteventid_shortstatehash` | SEQUENTIAL | 8 bytes | 8 bytes | Event to state hash mapping | +| `shortstatehash_statediff` | SEQUENTIAL_SMALL | 8 bytes | - | State differences | +| `shortstatekey_statekey` | RANDOM_SMALL | 8 bytes | 1016 bytes | Short state key to state key | +| `softfailedeventids` | RANDOM_SMALL | 48 bytes | - | Soft-failed events | +| `statehash_shortstatehash` | RANDOM | - | 8 bytes | State hash to short hash | +| `statekey_shortstatekey` | RANDOM | 1016 bytes | 8 bytes | State key to short key | +| `threadid_userids` | SEQUENTIAL_SMALL | - | - | Thread participants | +| `todeviceid_events` | RANDOM | - | - | To-device messages | +| `tofrom_relation` | RANDOM_SMALL | 8 bytes | 8 bytes | Event relations | +| `token_userdeviceid` | RANDOM_SMALL | - | - | Token to device mapping | +| `tokenids` | RANDOM | - | - | Token ID management | +| `url_previews` | RANDOM | - | - | URL preview cache | +| `userdeviceid_metadata` | RANDOM_SMALL | - | - | Device metadata | +| `userdeviceid_token` | 
RANDOM_SMALL | - | - | Device tokens | +| `userdevicesessionid_uiaainfo` | RANDOM_SMALL | - | - | UIAA session info | +| `userdevicetxnid_response` | RANDOM_SMALL | - | - | Transaction responses | +| `userfilterid_filter` | RANDOM_SMALL | - | - | User sync filters | +| `userid_avatarurl` | RANDOM_SMALL | - | - | User avatar URLs | +| `userid_blurhash` | RANDOM_SMALL | - | - | Avatar blurhashes | +| `userid_devicelistversion` | RANDOM_SMALL | - | - | Device list versions | +| `userid_displayname` | RANDOM_SMALL | - | - | User display names | +| `userid_lastonetimekeyupdate` | RANDOM_SMALL | - | - | Last OTK update time | +| `userid_masterkeyid` | RANDOM_SMALL | - | - | Master signing keys | +| `userid_password` | RANDOM | - | - | Password hashes | +| `userid_presenceid` | RANDOM_SMALL | - | - | User presence mapping | +| `userid_selfsigningkeyid` | RANDOM_SMALL | - | - | Self-signing keys | +| `userid_usersigningkeyid` | RANDOM_SMALL | - | - | User-signing keys | +| `useridprofilekey_value` | RANDOM_SMALL | - | - | Custom profile fields | +| `openidtoken_expiresatuserid` | RANDOM_SMALL | - | - | OpenID tokens | +| `logintoken_expiresatuserid` | RANDOM_SMALL | - | - | Login tokens | +| `userroomid_highlightcount` | RANDOM | - | - | Highlight counts | +| `userroomid_invitestate` | RANDOM_SMALL | - | - | User invite states | +| `userroomid_joined` | RANDOM | - | - | User joined rooms | +| `userroomid_leftstate` | RANDOM | - | - | User leave states | +| `userroomid_knockedstate` | RANDOM_SMALL | - | - | User knock states | +| `userroomid_notificationcount` | RANDOM | - | - | Notification counts | + +## Access Pattern Definitions + +### RANDOM + +- Large datasets with random updates across keyspace +- Compaction priority: OldestSmallestSeqFirst +- Write buffer: 32MB +- Cache shards: 128 +- Compression: Zstd level -3 +- Bottommost compression: level 2 + +### SEQUENTIAL + +- Large datasets with append-heavy updates +- Compaction priority: OldestLargestSeqFirst +- Write 
buffer: 64MB +- Level size: 32MB +- File size: 2MB +- Compression: Zstd level -2 + +### RANDOM_SMALL + +- Small datasets with random updates +- Compaction style: Universal +- Write buffer: 16MB +- Level size: 512KB +- File size: 128KB +- Block size: 512 bytes +- Compression: Zstd level -4 + +### SEQUENTIAL_SMALL + +- Small datasets with sequential updates +- Compaction style: Universal +- Write buffer: 16MB +- Level size: 1MB +- File size: 512KB +- Compression: Zstd level -4 + +### RANDOM_SMALL_CACHE + +- Small persistent caches with TTL +- Compaction style: FIFO +- Size limit: 64MB +- TTL: 14 days +- Unique cache allocation + +## Special Configurations + +### Shared Cache Tables + +- `eventid_outlierpdu` and `pduid_pdu` share cache pool +- Optimizes memory usage for related event data + +### High-Performance Tables + +- `roomsynctoken_shortstatehash`: Special compression settings for sync performance +- `pduid_pdu`: Large block size (2KB) for efficient event storage +- `eventid_outlierpdu`: Optimized for outlier PDU handling + +### Cache-Only Tables + +- `servername_destination`: FIFO cache for server resolution +- `servername_override`: FIFO cache for server overrides + +## Data Types and Sizes + +### Event IDs + +- Full event IDs: 48 bytes (Matrix event ID format) +- Short event IDs: 8 bytes (internal optimization) + +### Room IDs + +- Full room IDs: Variable length Matrix room ID +- Short room IDs: 8 bytes (internal optimization) + +### PDU Data + +- PDU ID: 16 bytes +- PDU content: ~1520 bytes average +- Outlier PDUs: ~1488 bytes average + +### State Data + +- State keys: Up to 1016 bytes +- Short state keys: 8 bytes +- State hashes: 8 bytes (shortened) + +This technical reference shows how Continuwuity optimizes storage for different types of Matrix data, using appropriate RocksDB configurations for each access pattern. 
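
The short-identifier sizes listed above can be illustrated with a small sketch. It is not Continuwuity's implementation: two in-memory dictionaries stand in for the `eventid_shorteventid` and `shorteventid_eventid` column families, and the `ShortIdAllocator` class, the counter-based allocation, and the big-endian encoding are all assumptions made for illustration.

```python
import struct


class ShortIdAllocator:
    """Toy bidirectional mapping between full event IDs and 8-byte short IDs.

    Hypothetical helper: plain dicts stand in for the two column families.
    """

    def __init__(self):
        self.eventid_shorteventid = {}  # full event ID -> short ID
        self.shorteventid_eventid = {}  # short ID -> full event ID
        self._next = 1  # assumption: short IDs come from a growing counter

    def get_or_create(self, event_id):
        """Return the existing short ID for event_id, allocating one if needed."""
        key = event_id.encode("utf-8")
        short = self.eventid_shorteventid.get(key)
        if short is None:
            # Assumed encoding: unsigned 64-bit big-endian, so that byte-wise
            # key ordering matches numeric ordering.
            short = struct.pack(">Q", self._next)
            self._next += 1
            self.eventid_shorteventid[key] = short
            self.shorteventid_eventid[short] = key
        return short

    def resolve(self, short):
        """Map a short ID back to the full event ID."""
        return self.shorteventid_eventid[short].decode("utf-8")


alloc = ShortIdAllocator()
event_id = "$" + "A" * 43  # placeholder event ID, not a real one
short = alloc.get_or_create(event_id)
print(len(event_id.encode("utf-8")), "bytes ->", len(short), "bytes")
```

Tables such as `shorteventid_authchain` and `shorteventid_shortstatehash` can then key on the 8-byte value instead of repeating the full identifier in every entry.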
+ +## Database Architecture + +## Column Families (Maps) + +### Room Management + +#### Room Aliases + +- **`alias_roomid`**: Maps room alias to room ID +- **`alias_userid`**: Maps room alias to user ID (for alias management) +- **`aliasid_alias`**: Maps alias ID to actual alias string + +#### Room Metadata + +- **`roomid_shortroomid`**: Maps room ID to short room ID (8-byte identifier) +- **`roomid_shortstatehash`**: Maps room ID to current state hash +- **`roomid_pduleaves`**: Tracks PDU leaves for each room +- **`roomid_invitedcount`**: Count of invited users per room +- **`roomid_joinedcount`**: Count of joined users per room +- **`roomid_inviteviaservers`**: Via servers for room invites +- **`publicroomids`**: Set of public room IDs +- **`bannedroomids`**: Set of banned room IDs +- **`disabledroomids`**: Set of disabled room IDs + +#### Room State + +- **`shortstatehash_statediff`**: State differences between state hashes +- **`statehash_shortstatehash`**: Maps full state hash to short state hash (8-byte) +- **`statekey_shortstatekey`**: Maps state key to short state key (8-byte) +- **`shortstatekey_statekey`**: Reverse mapping from short state key to full state key +- **`roomsynctoken_shortstatehash`**: Maps room sync tokens to state hashes + +### Events and Timeline + +#### Event Storage + +- **`eventid_pduid`**: Maps event ID to PDU ID (16-byte identifier) +- **`eventid_shorteventid`**: Maps event ID to short event ID (8-byte) +- **`eventid_outlierpdu`**: Stores outlier PDUs (events not yet in timeline) +- **`pduid_pdu`**: Main PDU storage (PDU ID to PDU data) +- **`shorteventid_eventid`**: Reverse mapping from short event ID to full event ID +- **`shorteventid_authchain`**: Authorization chains for events +- **`shorteventid_shortstatehash`**: Maps events to their state hashes + +#### Event Relationships + +- **`tofrom_relation`**: Event relations (replies, edits, reactions) +- **`threadid_userids`**: Thread participants tracking +- **`referencedevents`**: 
Referenced events tracking +- **`softfailedeventids`**: Events that soft-failed state resolution + +### User Management + +#### User Identity + +- **`userid_displayname`**: User display names +- **`userid_avatarurl`**: User avatar URLs +- **`userid_blurhash`**: Avatar blurhash values +- **`userid_password`**: Password hashes +- **`useridprofilekey_value`**: Custom profile fields + +#### User Devices and Sessions + +- **`userdeviceid_metadata`**: Device metadata (name, type, etc.) +- **`userdeviceid_token`**: Device access tokens +- **`token_userdeviceid`**: Reverse token to device mapping +- **`userdevicesessionid_uiaainfo`**: User-Interactive Auth session data +- **`userdevicetxnid_response`**: Transaction ID to response caching + +#### User Preferences + +- **`userfilterid_filter`**: Sync filter definitions +- **`lazyloadedids`**: Lazy-loaded member event tracking + +### Cryptography and Security + +#### Device Keys + +- **`keyid_key`**: Cryptographic keys storage +- **`userid_devicelistversion`**: Device list versions for users +- **`userid_lastonetimekeyupdate`**: Last one-time key update timestamps +- **`onetimekeyid_onetimekeys`**: One-time keys storage + +#### Cross-Signing + +- **`userid_masterkeyid`**: Master signing keys +- **`userid_selfsigningkeyid`**: Self-signing keys +- **`userid_usersigningkeyid`**: User-signing keys +- **`keychangeid_userid`**: Key change notifications + +#### Key Backups + +- **`backupid_algorithm`**: Backup algorithm information +- **`backupid_etag`**: Backup ETags for versioning +- **`backupkeyid_backup`**: Backed up keys + +### Room Membership + +#### Membership States + +- **`roomuserid_joined`**: Current joined room members +- **`roomuserid_invitecount`**: Invite counts per room-user +- **`roomuserid_leftcount`**: Leave counts per room-user +- **`roomuserid_knockedcount`**: Knock counts per room-user +- **`roomuseroncejoinedids`**: Users who have ever joined rooms + +#### Membership Events + +- **`userroomid_joined`**: User's 
joined rooms +- **`userroomid_invitestate`**: Invite state events +- **`userroomid_leftstate`**: Leave state events +- **`userroomid_knockedstate`**: Knock state events + +### Push Notifications and Read Receipts + +#### Push Infrastructure + +- **`senderkey_pusher`**: Push notification endpoints +- **`pushkey_deviceid`**: Push key to device mappings + +#### Read Tracking + +- **`readreceiptid_readreceipt`**: Read receipt storage +- **`roomuserid_privateread`**: Private read markers +- **`roomuserid_lastprivatereadupdate`**: Last private read updates +- **`userroomid_highlightcount`**: Highlight/mention counts +- **`userroomid_notificationcount`**: Notification counts per room + +### Media and Content + +#### Media Storage + +- **`mediaid_file`**: Media file metadata +- **`mediaid_user`**: Media uploader tracking +- **`url_previews`**: URL preview cache + +### Federation and Server-to-Server + +#### Server Management + +- **`server_signingkeys`**: Server signing keys +- **`servername_destination`**: Server destination resolution +- **`servername_educount`**: Ephemeral Data Unit counters +- **`servername_override`**: Server name overrides for federation +- **`servernameevent_data`**: Server event data +- **`roomserverids`**: Servers participating in rooms +- **`serverroomids`**: Rooms per server +- **`servercurrentevent_data`**: Current server event state + +### Application Services + +- **`id_appserviceregistrations`**: Application service registrations + +### Account Data and Presence + +#### Account Data + +- **`roomuserdataid_accountdata`**: Room-specific account data +- **`roomusertype_roomuserdataid`**: Account data type mappings + +#### Presence + +- **`presenceid_presence`**: User presence information +- **`userid_presenceid`**: User to presence ID mapping + +### To-Device Messages + +- **`todeviceid_events`**: Direct device-to-device messages + +### Authentication Tokens + +- **`openidtoken_expiresatuserid`**: OpenID Connect tokens +- 
**`logintoken_expiresatuserid`**: Login tokens +- **`tokenids`**: Token ID management + +### Global Configuration + +- **`global`**: Global server settings and state + +## Key Design Patterns + +### Short Identifiers + +Many tables use "short" versions of identifiers (8-byte integers) to reduce storage overhead: + +- `shortroomid` for room IDs +- `shorteventid` for event IDs +- `shortstatekey` for state keys +- `shortstatehash` for state hashes + +### Composite Keys + +Key naming follows a pattern of `{primary}_{secondary}` to create efficient lookups: + +- `roomuserid_*` for room-user relationships +- `userroomid_*` for user-room relationships +- `eventid_*` for event-related data + +### Performance Optimizations + +- **Cache sharing**: Related tables share cache pools (e.g., `eventid_outlierpdu` and `pduid_pdu`) +- **Access patterns**: Tables are optimized for their specific usage (RANDOM vs SEQUENTIAL) +- **Compression**: Different compression levels based on data characteristics +- **Block sizes**: Tuned based on expected key/value sizes + +## Storage Efficiency + +The schema is designed for efficiency in a Matrix homeserver context: + +- Large event data uses sequential storage patterns +- Lookup tables use random access patterns +- Small metadata uses compressed storage +- Caching is strategically shared between related data + +This design allows Continuwuity to efficiently handle the complex relationships and high-volume data typical in Matrix federation while maintaining good performance characteristics for both reads and writes. 
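
As a rough illustration of the short-identifier pattern described under Key Design Patterns, the sketch below interns a long event ID into an 8-byte value and keeps the reverse mapping alongside it. It is a minimal model, not Continuwuity's code: the two dictionaries merely stand in for `eventid_shorteventid` and `shorteventid_eventid`, and the `ShortIdInterner` class, the incrementing counter, and the big-endian encoding are all assumptions made for the example.

```python
import struct


class ShortIdInterner:
    """Toy bidirectional map between long identifiers and 8-byte short IDs."""

    def __init__(self) -> None:
        # Stand-ins for the eventid_shorteventid / shorteventid_eventid tables.
        self.id_to_short: dict[bytes, bytes] = {}
        self.short_to_id: dict[bytes, bytes] = {}
        # Hypothetical allocation scheme: a simple incrementing counter.
        self._next = 1

    def get_or_create(self, long_id: str) -> bytes:
        """Return the existing short ID for long_id, allocating one if needed."""
        key = long_id.encode("utf-8")
        short = self.id_to_short.get(key)
        if short is None:
            # Assumed encoding: unsigned 64-bit big-endian, so byte order
            # matches numeric order in a sorted key-value store.
            short = struct.pack(">Q", self._next)
            self._next += 1
            self.id_to_short[key] = short
            self.short_to_id[short] = key
        return short

    def resolve(self, short: bytes) -> str:
        """Map a short ID back to the original identifier."""
        return self.short_to_id[short].decode("utf-8")


interner = ShortIdInterner()
event_id = "$Rqnc-F-dvnEYJTyHq_iKxU2bZ1CI92-kuZq3a5lr5Zg"
short = interner.get_or_create(event_id)
print(len(event_id.encode("utf-8")), "bytes ->", len(short), "bytes")
```

Tables such as `shorteventid_authchain` or `shorteventid_shortstatehash` can then store the 8-byte value instead of repeating the full identifier in every key, which is where the storage saving comes from.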
+ +## Column Relationships and Data Flow + +### Core Event Storage Chain + +The heart of the Matrix homeserver is event storage, which uses several interconnected tables: + +- `eventid_shorteventid` ↔ `shorteventid_eventid`: Bidirectional mapping for event ID compression (48 bytes → 8 bytes) +- `eventid_pduid`: Maps Matrix event IDs to internal PDU IDs (16 bytes) +- `pduid_pdu`: Main event storage using PDU IDs as keys (shares cache with `eventid_outlierpdu`) +- `eventid_outlierpdu`: Stores events not yet integrated into the timeline +- `shorteventid_authchain`: Authorization chains using compressed event IDs +- `shorteventid_shortstatehash`: Links events to room state snapshots + +### Room State Management + +Room state is tracked through multiple interconnected tables: + +- `statekey_shortstatekey` ↔ `shortstatekey_statekey`: Bidirectional state key compression +- `statehash_shortstatehash`: Compresses state hashes from full size to 8 bytes +- `shortstatehash_statediff`: Stores incremental state changes +- `roomid_shortstatehash`: Current state hash for each room +- `roomsynctoken_shortstatehash`: Maps sync tokens to state for efficient delta sync + +### User Identity and Authentication + +User management involves several related tables: + +- `userid_password` → authentication base +- `token_userdeviceid` ↔ `userdeviceid_token`: Bidirectional token↔device mapping +- `userdeviceid_metadata`: Device information storage +- `userid_displayname`, `userid_avatarurl`, `userid_blurhash`: Profile data +- `openidtoken_expiresatuserid`, `logintoken_expiresatuserid`: Token management + +### Room Membership Tracking + +Membership uses bidirectional indexes for efficient queries: + +- `roomuserid_joined` ↔ `userroomid_joined`: Current membership from both perspectives +- `roomuserid_invitecount` ↔ `userroomid_invitestate`: Invitation tracking +- `roomuserid_leftcount` ↔ `userroomid_leftstate`: Leave event tracking +- `roomid_joinedcount`, `roomid_invitedcount`: Aggregate room 
statistics +- `roomuseroncejoinedids`: Historical membership tracking + +### Cryptography and Security Chain + +End-to-end encryption involves coordinated key management: + +- `userid_devicelistversion`: Tracks when device lists change +- `keyid_key`: Stores actual cryptographic keys +- `userid_masterkeyid`, `userid_selfsigningkeyid`, `userid_usersigningkeyid`: Cross-signing keys +- `onetimekeyid_onetimekeys` → `userid_lastonetimekeyupdate`: One-time key lifecycle +- `keychangeid_userid`: Key change notifications +- `backupid_algorithm`, `backupid_etag` → `backupkeyid_backup`: Key backup system + +### Federation and Server Communication + +Server-to-server communication requires coordinated tracking: + +- `roomserverids` ↔ `serverroomids`: Bidirectional room↔server participation +- `servername_destination`, `servername_override`: Server resolution (both cached) +- `server_signingkeys`: Federation authentication +- `servername_educount`: Ephemeral data unit tracking +- `servernameevent_data`, `servercurrentevent_data`: Server event state + +### Read Tracking and Notifications + +Message read tracking involves multiple coordinated updates: + +- `readreceiptid_readreceipt`: Public read receipts +- `roomuserid_privateread`, `roomuserid_lastprivatereadupdate`: Private read markers +- `userroomid_highlightcount`, `userroomid_notificationcount`: Per-room notification counts +- `senderkey_pusher` ↔ `pushkey_deviceid`: Push notification routing + +### Account Data and Preferences + +User preferences and account data use a two-level structure: + +- `roomusertype_roomuserdataid` → `roomuserdataid_accountdata`: Type index points to actual data +- `userid_presenceid` → `presenceid_presence`: Presence data separation +- `userfilterid_filter`: Sync filter definitions +- `lazyloadedids`: Lazy loading state tracking + +This interconnected design allows Continuwuity to efficiently handle Matrix protocol operations while maintaining data consistency and enabling fast lookups from 
multiple perspectives.
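
The bidirectional membership indexes above (`roomuserid_joined` ↔ `userroomid_joined`) can be sketched as two ordered tables that are written together and read by prefix scan. This is a simplified model under stated assumptions: `SortedTable` and `MembershipIndex` are hypothetical names, the in-memory sorted list stands in for an ordered key-value store, and the `0xFF` separator between key parts is an assumption chosen because that byte never occurs in valid UTF-8.

```python
import bisect

SEP = b"\xff"  # assumed separator between the parts of a composite key


def composite(*parts: str) -> bytes:
    """Join identifier parts into one composite key."""
    return SEP.join(p.encode("utf-8") for p in parts)


class SortedTable:
    """In-memory stand-in for one ordered key-value table."""

    def __init__(self) -> None:
        self._keys: list[bytes] = []
        self._values: dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes = b"") -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def delete(self, key: bytes) -> None:
        if key in self._values:
            del self._values[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def scan_prefix(self, prefix: bytes):
        """Yield every key starting with prefix, in sorted order."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i]
            i += 1


class MembershipIndex:
    """Keeps both directions of the joined-membership index in step."""

    def __init__(self) -> None:
        self.roomuserid_joined = SortedTable()
        self.userroomid_joined = SortedTable()

    def join(self, room_id: str, user_id: str) -> None:
        self.roomuserid_joined.put(composite(room_id, user_id))
        self.userroomid_joined.put(composite(user_id, room_id))

    def leave(self, room_id: str, user_id: str) -> None:
        self.roomuserid_joined.delete(composite(room_id, user_id))
        self.userroomid_joined.delete(composite(user_id, room_id))

    def members_of(self, room_id: str) -> list[str]:
        prefix = room_id.encode("utf-8") + SEP
        return [k[len(prefix):].decode("utf-8")
                for k in self.roomuserid_joined.scan_prefix(prefix)]

    def rooms_of(self, user_id: str) -> list[str]:
        prefix = user_id.encode("utf-8") + SEP
        return [k[len(prefix):].decode("utf-8")
                for k in self.userroomid_joined.scan_prefix(prefix)]
```

Writing each membership change to both tables costs an extra write, but it lets "who is in this room" and "which rooms is this user in" each be answered with a single prefix scan rather than a full table walk.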