NoteLett Cosmos Data Operations
Date: May 5, 2026
Product ID: notelett
Source of truth: backend/src/lib/cosmos-init.ts
Scope
This document records the production data layout, index expectations, retention policy, and backup/restore approach for NoteLett Cosmos DB. It complements docs/COSMOS_QUERY_REVIEW.md, which focuses on query shape and scope isolation.
Container Inventory
Every production document must include productId: "notelett" and the appropriate user/workspace ownership field. The backend registers these containers through the common-platform @bytelyst/cosmos helpers.
| Container | Partition key | Primary owner/scope | Main documents |
|---|---|---|---|
| notes | /workspaceId | user + workspace | Note body/status/tags/links/embedding metadata |
| workspaces | /userId | user | Workspace metadata |
| note_relationships | /workspaceId | user + workspace | Typed note links |
| note_tasks | /workspaceId | user + workspace | Extracted or user-created tasks |
| note_artifacts | /workspaceId | user + workspace | Blob/artifact metadata |
| note_agent_actions | /workspaceId | user + workspace | MCP/agent audit and review records |
| saved_views | /userId | user | Saved search/filter views |
| note_prompts | /userId | user or __builtin__ | Prompt templates |
| note_prompt_schedules | /userId | user | Scheduled prompt actions |
| note_prompt_webhooks | /userId | user | Prompt webhook triggers |
| note_shares | /workspaceId | user + workspace | Expiring public share links |
| note_versions | /workspaceId | user + workspace | Note version history |
| note_intake_rules | /userId | user | URL intake routing rules |
| note_intake_jobs | /userId | user | Intake job lifecycle state |
| note_collaborators | /sharedWithUserId | shared-with user | Direct note collaborators |
| palace_wings | /userId | user | Palace wing taxonomy |
| palace_rooms | /userId | user | Palace rooms |
| palace_memories | /userId | user | Palace memories |
| palace_tunnels | /userId | user | Palace tunnels |
| palace_kg | /userId | user | Knowledge graph triples |
| palace_diaries | /userId | user | Palace diary entries |
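
The ownership rule above can be checked before a write. The following is a minimal sketch, not the code in backend/src/lib/cosmos-init.ts: the `PARTITION_KEYS` map and the `validateDocument` helper are hypothetical names, and the map mirrors only a few rows of the table.

```typescript
// Hypothetical registry mirroring a few rows of the container table.
// The real list lives in backend/src/lib/cosmos-init.ts and may be shaped differently.
const PARTITION_KEYS = new Map<string, string>([
  ["notes", "/workspaceId"],
  ["workspaces", "/userId"],
  ["note_shares", "/workspaceId"],
  ["note_collaborators", "/sharedWithUserId"],
  ["palace_kg", "/userId"],
]);

const PRODUCT_ID = "notelett";

// Returns a list of problems; an empty list means the document passes the check.
function validateDocument(
  container: string,
  doc: Record<string, unknown>,
): string[] {
  const pkPath = PARTITION_KEYS.get(container);
  if (pkPath === undefined) {
    return [`unknown container: ${container}`];
  }

  const problems: string[] = [];
  if (doc["productId"] !== PRODUCT_ID) {
    problems.push(`productId must be "${PRODUCT_ID}"`);
  }

  // "/workspaceId" -> "workspaceId"
  const pkField = pkPath.slice(1);
  const pkValue = doc[pkField];
  if (typeof pkValue !== "string" || pkValue.length === 0) {
    problems.push(`missing partition key field: ${pkField}`);
  }
  return problems;
}
```

This only covers the partition key field. Containers scoped as "user + workspace" would also need a `userId` check, which is omitted here for brevity.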
Index Expectations
The current registration code provides partition keys only. Unless the production account overrides indexing policies out-of-band, Cosmos default automatic indexing applies.
Keep these expectations:
- Point reads and workspace/user scoped list reads should rely on partition key plus id, workspaceId, userId, and productId.
- Common sort fields such as updatedAt, createdAt, state, status, priority, and expiresAt should remain indexed.
- Do not exclude encrypted text fields from indexing until query coverage proves they are never filtered or sorted. Encrypted values are not useful for semantic search, but repository code still needs predictable reads during migration periods.
- Public share token lookup is intentionally bounded but cross-partition; if volume grows, add a user- or token-partitioned projection before relaxing query limits.
- Global dashboard/search reads over notes are acceptable for release 1 but should move to a user-partitioned search projection or external search service if high traffic appears.
Recommended future explicit composite indexes, if query diagnostics show RU pressure:
| Container | Candidate composite index | Reason |
|---|---|---|
| notes | /productId, /userId, /workspaceId, /updatedAt DESC | Workspace note lists and exports |
| note_agent_actions | /productId, /userId, /state, /updatedAt DESC | Approval queue |
| note_intake_jobs | /productId, /userId, /status, /startedAt DESC | Active intake polling |
| note_versions | /productId, /workspaceId, /noteId, /createdAt DESC | Version history |
| note_shares | /productId, /shareToken | Public share token lookup |
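
If one of these candidates is adopted, it has to be expressed in the container's indexing policy. The sketch below turns a candidate string from the table into the `compositeIndexes` shape used by Cosmos DB indexing policies. `parseCompositeIndex` and `buildIndexingPolicy` are hypothetical helpers, and the included/excluded paths repeat the Cosmos default policy rather than anything read from this repo.

```typescript
type Order = "ascending" | "descending";

interface CompositePath {
  path: string;
  order: Order;
}

// Parses a candidate such as "/productId, /userId, /updatedAt DESC".
// Paths without a DESC suffix are treated as ascending.
function parseCompositeIndex(candidate: string): CompositePath[] {
  return candidate.split(",").map((part): CompositePath => {
    const tokens = part.trim().split(/\s+/);
    const path = tokens[0] ?? "";
    const order: Order =
      (tokens[1] ?? "").toUpperCase() === "DESC" ? "descending" : "ascending";
    return { path, order };
  });
}

// Builds a policy object that keeps default automatic indexing and adds
// the given composite indexes on top.
function buildIndexingPolicy(candidates: string[]) {
  return {
    indexingMode: "consistent",
    automatic: true,
    includedPaths: [{ path: "/*" }],
    excludedPaths: [{ path: '/"_etag"/?' }],
    compositeIndexes: candidates.map(parseCompositeIndex),
  };
}

const notesPolicy = buildIndexingPolicy([
  "/productId, /userId, /workspaceId, /updatedAt DESC",
]);
```

Whether a given query actually uses a composite index depends on its filter and ORDER BY shape, so confirm with query diagnostics before and after the change.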
Retention Expectations
No container TTL is currently configured in code. Until TTL is explicitly added, retention is application-managed.
| Data class | Current release expectation | Future retention candidate |
|---|---|---|
| Notes, workspaces, relationships, tasks, artifacts | Retain until user delete/archive behavior says otherwise | User export/delete policy |
| Agent actions | Retain for audit trail through release 1 | 180-365 day TTL after compliance review |
| Note versions | Retain while version history is user-visible | Cap by count or age per note |
| Public shares | App-level expiry via expiresAt; revoked/expired shares should not grant access | TTL after expiry + grace period |
| Intake jobs | Retain while user can inspect processing state | 30-90 day TTL |
| Prompt schedules/webhooks/templates | Retain until user deletion | None by default |
| Palace data | Retain as user-owned knowledge data | User delete/export policy |
Before enabling TTL, confirm export, audit, and rollback requirements for that data class.
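
As an illustration of the "TTL after expiry + grace period" candidate for public shares, the sketch below computes a per-item `ttl` value. Cosmos DB counts an item's `ttl` in seconds from the item's last-modified time, and a per-item value only takes effect once the container has a `defaultTtl` set (for example `-1`, which enables TTL without a default expiry). The function name and the grace period are assumptions, not current behavior.

```typescript
const DAY_SECONDS = 24 * 60 * 60;

// Returns the ttl (in seconds) to store on a share document written at
// `writeTime`, so that Cosmos deletes it `graceDays` after `expiresAt`.
// The result is clamped to 1 because ttl must be a positive integer.
function shareTtlSeconds(
  expiresAt: Date,
  writeTime: Date,
  graceDays: number,
): number {
  const deleteAtMs = expiresAt.getTime() + graceDays * DAY_SECONDS * 1000;
  const seconds = Math.ceil((deleteAtMs - writeTime.getTime()) / 1000);
  return Math.max(seconds, 1);
}

// Example: a share written on 2026-05-05 that expires one day later,
// with a hypothetical 7-day grace period, lives for 8 days in total.
const ttl = shareTtlSeconds(
  new Date("2026-05-06T00:00:00Z"),
  new Date("2026-05-05T00:00:00Z"),
  7,
);
```

Because the clock restarts on every write, any later update to the share document would need to recompute `ttl`. Access checks should still rely on expiresAt, since TTL deletion is a cleanup mechanism and not an authorization control.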
Backup Approach
Production Cosmos accounts should use Azure-native backup configured outside this repo:
- Prefer continuous backup with point-in-time restore for production if cost and region support allow it.
- At minimum, enable periodic backup with a retention window that satisfies product and compliance needs.
- Record account name, database name, backup mode, retention window, and restore permissions in the environment release record.
- Keep Cosmos keys in Key Vault or the deployment secret manager; never commit keys.
Application-level export is not a database backup. GET /api/notes/export is user-scoped JSON/Markdown portability and does not include all operational records, versions, shares, collaborators, intake jobs, or Palace data.
Restore Approach
Use this restore sequence for incidents:
- Identify affected account, database, containers, product id, user ids, workspace ids, and time window.
- Prefer point-in-time restore into a new Cosmos account/database, not over the active database.
- Validate restored data with read-only queries scoped by productId, userId, and workspaceId.
- Decide whether to cut traffic over to the restored database or run a scoped backfill into production.
- For scoped backfills, preserve original ids, partition keys, productId, ownership fields, timestamps, and encryption state.
- Run backend readiness and a representative authenticated note/workspace flow after restore.
- Record restored containers, restore timestamp, operator, validation queries, and follow-up migrations.
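
The validation step in this sequence can be sketched as a parameterized, read-only count query built from the incident scope. The `QuerySpec` type below mirrors the query/parameters shape accepted by the Cosmos SQL API, and `scopedCountQuery` is a hypothetical helper, not an existing function in the backend.

```typescript
interface QuerySpec {
  query: string;
  parameters: { name: string; value: string }[];
}

interface RestoreScope {
  productId: string;
  userId?: string;
  workspaceId?: string;
}

// Builds a read-only count query restricted to the given scope.
// Values are passed as parameters, never interpolated into the query text.
function scopedCountQuery(scope: RestoreScope): QuerySpec {
  const clauses: string[] = ["c.productId = @productId"];
  const parameters: QuerySpec["parameters"] = [
    { name: "@productId", value: scope.productId },
  ];

  if (scope.userId !== undefined) {
    clauses.push("c.userId = @userId");
    parameters.push({ name: "@userId", value: scope.userId });
  }
  if (scope.workspaceId !== undefined) {
    clauses.push("c.workspaceId = @workspaceId");
    parameters.push({ name: "@workspaceId", value: scope.workspaceId });
  }

  return {
    query: `SELECT VALUE COUNT(1) FROM c WHERE ${clauses.join(" AND ")}`,
    parameters,
  };
}
```

Running the same spec against both the restored and the active database gives a quick per-container comparison that can be recorded alongside the other validation queries.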
Operational Checks
Before release:
- backend/src/lib/cosmos-init.ts container list matches this document.
- Production config rejects DB_PROVIDER=memory.
- Field encryption is enabled with a production key provider.
- Cosmos backup mode and retention window are recorded in release notes.
- Query-risk follow-ups from docs/COSMOS_QUERY_REVIEW.md have owners.
- Any migration/backfill has a dry-run, stop criteria, and rollback note.
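
The DB_PROVIDER check can be pictured as a startup guard along the following lines. This is a minimal sketch: it assumes production is signalled by `NODE_ENV=production`, which this document does not confirm, and `assertProductionDbProvider` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;

// Throws when a production environment is configured with the in-memory
// provider. Assumption: production is detected via NODE_ENV.
function assertProductionDbProvider(env: Env): void {
  const isProduction = env["NODE_ENV"] === "production";
  const provider = (env["DB_PROVIDER"] ?? "").trim().toLowerCase();
  if (isProduction && provider === "memory") {
    throw new Error("DB_PROVIDER=memory is not allowed in production");
  }
}
```

Called once at boot with the process environment, a guard like this fails fast instead of letting a misconfigured production instance serve requests from volatile storage.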
Related Docs
- docs/COSMOS_QUERY_REVIEW.md
- docs/DATA_MIGRATION_AND_BACKFILL_PLAN.md
- docs/FIELD_ENCRYPTION_COVERAGE.md
- docs/RELEASE_CHECKLIST.md
- docs/IMPORT_EXPORT_READINESS.md