VOICE_AND_AI: alaivOS Canonical
Last updated: April 13, 2026 (Omega v2.7)
Supersedes:
- V1_1_AI_ROADMAP.md, V1_1_ROADMAP.md
- OMEGA_V2_7_SESSION_HANDOVER.md, OMEGA_V2_6_SESSION_HANDOVER.md, OMEGA_V2_5_SESSION_HANDOVER.md
- SPRINT_EPSILON_KOKORO_EVAL.md, EPSILON_CDN_RENAME.md
- SPRINT_S1_SKILL_ROUTER.md, SPRINT_S2_OBSERVER_AGENT.md, SPRINT_S3_PLANNER_EXECUTOR.md, SPRINT_S4_PERSONALITY_UI.md
- SPRINT_ALPHA_CHECKUP_PIPELINE.md, SPRINT_ALPHA_SKILL_AUDIT.md, SPRINT_ALPHA_PIPER_TTS.md
- SPRINT_BETA2_ASK_LAIV.md, SPRINT_BETA_TTS_DEAD_CODE_SEVERANCE.md, SPRINT_VOICE_BETA.md, SPRINT_FIX_VOICE.md
- SPRINT_GAMMA_LOCAL_MODEL_IMPL.md
- sprint_A5_sovereign_tts_waveform_modes.md, SPRINT_B13_sovereign_voice_asr.md
- sprint_A2_gguf_runtime_research.md, sprint_B10_gguf_runtime.md, gguf_runtime_research_report.md
- sprint_A6_voice_translation.md, qwen_mistral_evaluation.md
- sprint_B14_elevenlabs_tts.md (DEAD; historical traceability only)
- tiered_ai_capability_spec.md, alaivOS_qwen3_tts_strategy.md, alaivOS_ai_model_assessment.md
- alaivOS_phone_ai_server_hub_architecture.md
- laiv_system_prompt_spec.md, alaivOS_AI_Prompt_Library_v2.md
Status overrides: ElevenLabs = DEAD (0 refs). Cloud Gemini = DEAD (enum + code removed). AiProvider enum = {local, ghost} only. Google Speech Services STT / Google WaveNet TTS = DEAD. Qwen 3.5 (NOT 2.5) is the canonical on-device family. Gemma 4 is server-only. Ghost model = Gemma 4 E4B. Voice pipeline is Kokoro-first (inverted). Per-skill pricing = DEAD.
Cross-reference: GHOST_PROTOCOL.md (credit economy, Ghost routing), PRODUCT_SCOPE.md (module inventory), INFRASTRUCTURE.md (CX43, Bishop, CDN).
1. MODEL LANDSCAPE: 7-TIER REGISTRY
1.1 Tier table (locked in TAW10, Alpha-A model registry rewrite)
| Tier | CDN filename | Model | Download | Loaded RAM | Min free RAM | Role |
|---|---|---|---|---|---|---|
| on-device-xs | laiv-xs.gguf | Qwen 3.5 0.8B Q4_K_M | 989 MB | 2.1 GB | 2.5 GB | Practical everyday tier for most flagships |
| on-device-s | laiv-s.gguf | Qwen 3.5 2B Q4_K_M | 2.55 GB | 4.1 GB | 4.5 GB | Gap-filler between xs and Gemma; critical for 4-5 GB-free devices (LatAm/India) |
| on-device-m | laiv-m.gguf | Qwen 3.5 4B Q4_K_M | 3.16 GB | 5.8 GB | 6.2 GB | Full on-device Chief of Staff (native early-fusion multimodal) |
| on-device-l | laiv-l.gguf | Gemma 4 E2B Q4_K_M | 6.67 GB | 7.7 GB | 8.0 GB | Reserved for 16+ GB tablets / future phones; practically server-only today |
| on-device-xl | laiv-xl.gguf | Gemma 4 E4B Q4_K_M | 8.95 GB | 10 GB | 10.5 GB | Reserved for tablets / future; practically server-only today |
| ghost-std | laiv-ghost.gguf | Gemma 4 E4B Q4_K_M | 8.95 GB | 10 GB | n/a (server) | Ghost Brain on CX43, 12 tok/s |
Filenames are tier labels, not model names. When a better model drops, swap the GGUF behind the same filename. App reads manifest.json v3 for SHA + sizes. Backward compat aliases exist (laiv-core-s.bin / laiv-core-sm.bin / laiv-core-m.bin → Qwen files).
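Because the file behind a tier label can change, the download has to be checked against the manifest rather than trusted by name. A minimal sketch of that check follows. The manifest layout used here (a `files` map keyed by CDN filename, with `sha256` and `size_bytes` fields) is an assumption for illustration only; the source states just that manifest.json v3 carries SHA and sizes.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a multi-GB GGUF never sits fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(manifest_text: str, filename: str, path: Path) -> bool:
    """Compare size first (cheap), then the SHA-256 digest (expensive)."""
    # Hypothetical schema: {"files": {"<name>": {"sha256": ..., "size_bytes": ...}}}
    entry = json.loads(manifest_text)["files"][filename]
    if path.stat().st_size != entry["size_bytes"]:
        return False
    return sha256_of(path) == entry["sha256"]
```

Keying the lookup on the tier filename rather than a model name means a GGUF swap only changes the manifest entry, not the caller.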
1.2 Qwen 3.5 capabilities (locked; NOT Qwen 2.5)
All Qwen 3.5 sizes share:
- Unified vision (all sizes accept images; 4B is native early-fusion multimodal)
- Thinking mode (off by default on 0.8B and 2B; do not flip on without explicit need, since it explodes latency and hides output)
- 262K context window
- 201 languages (preserves multilingual parity; critical for 21-locale support)
- Hermes-style tool calling (ChatML turn format; dual prompt templates live in app)
1.3 Ghost server model (locked; Epsilon TAW9 speed tests)
- Gemma 4 E4B @ 12.02 tok/s on CX43 CPU-only, 10 GB loaded RAM.
- Gemma 4 E2B @ 21.81 tok/s measured but unused on server (E4B quality preferred for Ghost).
- Qwen 3.5 9B preserved in Ollama as fallback only. Original 0.2 tok/s measurement was misconfigured thinking mode; with thinking off it runs at 5.58 tok/s — usable but 2× slower than Gemma.
- Function calling verified perfect in EN, ES, PT via Ollama native tool_calls. Example: log_expense(amount=50, description="tacos") returns a proper tool_calls array.
- Native audio input, native vision, native function calling in a single model.
1.4 Why Gemma 4 is server-only
E2B at 7.7 GB loaded and E4B at 10 GB loaded will not fit any shipping phone. Both are marketed as "edge" models but the loaded-RAM envelope rules phones out. On-device stays Qwen 3.5 exclusively. on-device-l / on-device-xl exist on the CDN and in the tier ladder for future 16+ GB devices and tablets.
1.5 Real-world AMI cascade (J's Pixel 7 Pro reality check)
12 GB total RAM → 3.6 GB free typical. Most flagship users will run Qwen 0.8B. Ghost is the real AI upgrade path.
| Device total | Typical free | Best model | Experience |
|---|---|---|---|
| 3-4 GB | 1-2 GB | None (Flutter smart only) | Data-driven, no model |
| 4 GB | 2-3 GB | Qwen 0.8B (2.1 GB) | Basic skills + vision |
| 6-8 GB | 3-4 GB | Qwen 0.8B | Same — 2B at 4.1 GB too tight |
| 8-12 GB | 3.5-5 GB | Qwen 0.8B or 2B | 2B only if apps closed |
| 12-16 GB | 5-7 GB | Qwen 2B or 4B | 4B (5.8 GB) tight on 12 GB |
| 16+ GB | 7-9 GB | Qwen 4B | Best on-device experience |
2. AMI: ADAPTIVE MODEL INTELLIGENCE
2.1 Cardinal rules
- ONE model loaded at a time, NEVER TWO. No model co-residency, no preloading of a bigger model while a smaller one serves.
- No always-resident model. J explicitly rejected "0.8B always loaded." Loading and unloading are dynamic on app lifecycle.
- App backgrounded → model unloaded. Zero RAM footprint, zero battery, zero heat while backgrounded.
- App foregrounded → AMI checks freeRamMb → picks best tier → loads during natural navigation time (2-8 s hidden behind splash/home rendering). User reaches Laiv with the model already warm.
- Users never pick models. AI Engine screen shows dots (●●●○○) indicating the running tier; no file picker. "Manage AI brains" is a small link for power users.
2.2 Triggers (no polling tax)
- Android onTrimMemory: RUNNING_MODERATE (prepare) → RUNNING_LOW (downgrade) → RUNNING_CRITICAL (unload all).
- iOS didReceiveMemoryWarning: only one level, so iOS unload threshold is more aggressive (modelSize × 1.5 vs Android's 1.2).
- Before-inference lazy check: only check RAM when the user actually asks Laiv something.
- 60-second health tick (only active when a model is loaded): reads /proc/meminfo on Android (microseconds, no battery impact), os_proc_available_memory() on iOS. Unloads proactively below threshold.
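The health tick's threshold check can be sketched as below. Two things here are assumptions rather than facts from the source: that the value read from /proc/meminfo is the `MemAvailable` field, and that "below threshold" means available memory falling under the loaded model's size times the platform multiplier. The function names are hypothetical.

```python
import re

# Multipliers quoted in the text: iOS gets a single warning level, so it unloads earlier.
UNLOAD_MULTIPLIER = {"android": 1.2, "ios": 1.5}


def mem_available_mb(meminfo_text: str) -> int:
    """Extract MemAvailable from /proc/meminfo content (the file reports kB)."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found")
    return int(match.group(1)) // 1024


def should_unload(available_mb: float, model_mb: float, platform: str) -> bool:
    """True when available memory is below model size x platform multiplier."""
    return available_mb < model_mb * UNLOAD_MULTIPLIER[platform]
```

With a 2.1 GB model and 3,000 MB available, this sketch would unload on iOS (threshold 3,150 MB) but keep the model on Android (threshold 2,520 MB).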
2.3 Decision flow
User asks Laiv something
  → Is a model loaded?
      YES → Is free RAM > headroom (500 MB)?
          YES → Use loaded model
          NO → Unload, load 0.8B fallback, set "downgraded" flag
      NO → Check free RAM
          > 5.8 GB → Load 4B (if on disk)
          > 4.1 GB → Load 2B (if on disk)
          > 2.1 GB → Load 0.8B
          < 2.1 GB → Smart Flutter Response (no model) or Ghost (if subscribed)
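The flow above translates fairly directly into a pure function. This is a sketch, not the app's Dart implementation: the names are hypothetical, the thresholds are the loaded-RAM figures used in the flow (not the "Min free RAM" column of the tier table), and two gaps in the flow are filled by assumption: a bigger tier missing from disk falls through to the next one, and exactly 2.1 GB free counts as "no model".

```python
from dataclasses import dataclass
from typing import Optional, Set

HEADROOM_GB = 0.5
# (tier, loaded RAM in GB), largest first, as listed in the flow.
LADDER = [("4B", 5.8), ("2B", 4.1), ("0.8B", 2.1)]


@dataclass
class Decision:
    model: Optional[str]            # None means no local model is used
    downgraded: bool = False
    fallback: Optional[str] = None  # "ghost" or "smart_flutter" when model is None


def decide(free_gb: float, loaded: Optional[str], on_disk: Set[str],
           ghost_subscribed: bool = False) -> Decision:
    if loaded is not None:
        if free_gb > HEADROOM_GB:
            return Decision(model=loaded)
        # Not enough headroom: swap to the smallest tier and flag it for the UX card.
        return Decision(model="0.8B", downgraded=True)
    for name, needed_gb in LADDER:
        # 0.8B is treated as always present (it is the first download for every user).
        if free_gb > needed_gb and (name == "0.8B" or name in on_disk):
            return Decision(model=name)
    return Decision(model=None, fallback="ghost" if ghost_subscribed else "smart_flutter")
```

Keeping the decision free of I/O makes it easy to replay against the cascade table in 1.5, for example 3.6 GB free on a Pixel 7 Pro resolving to 0.8B.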
2.4 Downgrade UX
Not a popup. A dismissible glass card inside Laiv, shown once:
"Laiv adapted — your phone is busy with other apps right now. Full power returns automatically." [Switch back now] [Got it]
2.5 Load times (from flash)
| Tier | Load time | Perception |
|---|---|---|
| 0.8B | 1-2 s | Instant |
| 2B | 3-5 s | Brief pause |
| 4B | 6-10 s | Shows loading indicator |
2.6 Download chain (onboarding)
Every user gets laiv-xs (Qwen 0.8B, 989 MB) first, so "Laiv is ready!" appears after ~30 s on WiFi. The optimal tier for the device then downloads in the background, and the stepping-stone tier downloads last as insurance. The dots upgrade silently when the better model finishes.
2.7 iOS-specific hardening
- Use os_proc_available_memory() proactively.
- Unload when available drops below modelSize × 1.5 (more aggressive than Android).
- Always unload on app backgrounding to be a good citizen; reload on foreground.
3. SMART FLUTTER RESPONSE SYSTEM (bridges model-load gap)
Built by Beta in TAW10. Files: smart_flutter_response.dart, laiv_message_queue.dart.
- 10 notification-action handlers: budget exceeded, project due, free time, bill due, reconnect contact, exercise nudge, plan your day, cooking nudge, birthday, generic.
- Tier-gated: Starter = raw numbers · Spark+ = category breakdown · Core+ = follow-up suggestion.
- Personalized with the user's first name from SharedPreferences.
- Message queue captures user input while the model loads — messages never lost. When the model is ready it processes the queued messages; if its response adds value beyond the Flutter response it's appended as a follow-up, if redundant it is skipped silently.
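The queue behaviour in the last bullet can be sketched as follows. The class below is a hypothetical Python stand-in for laiv_message_queue.dart, and the `adds_value` predicate is left to the caller because the source does not say how redundancy is judged.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class LaivMessageQueue:
    """Buffers user input while the model loads, then drains it in order."""

    def __init__(self) -> None:
        self._pending: Deque[Tuple[str, str]] = deque()

    def capture(self, user_text: str, instant_reply: str) -> None:
        # Keep the data-driven reply so the model's answer can be compared to it later.
        self._pending.append((user_text, instant_reply))

    def drain(self, model: Callable[[str], str],
              adds_value: Callable[[str, str], bool]) -> List[str]:
        follow_ups: List[str] = []
        while self._pending:
            user_text, instant_reply = self._pending.popleft()
            answer = model(user_text)
            if adds_value(instant_reply, answer):
                follow_ups.append(answer)  # surfaced as a follow-up message
            # Redundant answers are dropped without telling the user.
        return follow_ups
```

Draining with `popleft` preserves arrival order, and the queue is empty afterwards, so a second drain is a no-op.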
3.1 Notification tap = instant Laiv, no OmniOrb needed
Notification taps route directly to the relevant screen + Laiv immediately greets with a data-driven response. Example: "Budget exceeded" → Money screen → Laiv says "José, you're $450 over budget. Dining hit $680." — all from SQLite, instant. The model loads in the 3-5 s the user spends reading; if they reply, real AI handles the follow-up.
4. RUNTIME: llama_cpp_dart + sherpa_onnx
4.1 Service name mappings (abstract → real)
| Abstract | Real implementation |
|---|---|
| LocalModelService | LocalInferenceService + LlamaRuntime (llama_cpp_dart FFI) |
| AdaptiveModelManager / AMI | Dynamic tier selection by freeRamMb, ONE model at a time, lifecycle-driven load/unload |
| TTS | SovereignTtsService (sherpa_onnx Piper ONNX), fallback CortexVoiceService (platform TTS) |
| Voice nav | NavVoiceService (audio focus + timing) + InstructionEnricher (OSRM steps → natural language × 21 locales) |
| Navigation | NavigationService (state machine: idle → navigating → rerouting → arrived) |
4.2 Prompt templates
Dual templates live in the app — selected per loaded model:
- Qwen 3.5 → ChatML turns (<|im_start|>system ... <|im_end|>).
- Gemma 4 → Turn-based (<start_of_turn>user ... <end_of_turn>).
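A sketch of selecting the template per loaded model is shown below. The ChatML markers and the `<start_of_turn>` / `<end_of_turn>` markers come from the text; the `model` role name for the reply turn and the choice to fold the system text into the first user turn follow earlier Gemma releases and are assumptions here, as is the function name.

```python
def render_prompt(family: str, system: str, user: str) -> str:
    """Wrap a system + user message in the turn format of the loaded model family."""
    if family == "qwen":
        # ChatML: every turn is wrapped in <|im_start|>role ... <|im_end|>.
        return (
            f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            "<|im_start|>assistant\n"
        )
    if family == "gemma":
        # Assumption: no separate system turn, so the system text leads the user turn.
        return (
            f"<start_of_turn>user\n{system}\n\n{user}<end_of_turn>\n"
            "<start_of_turn>model\n"
        )
    raise ValueError(f"unknown model family: {family}")
```

Both branches end with an open reply turn so the runtime generates the assistant's text next.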
4.3 Ghost native function calling
17 tool definitions registered for Gemma 4 E4B on CX43. Ollama native tool_calls array is consumed directly — no regex parsing. Verified EN/ES/PT.
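Consuming the array directly might look like the sketch below. The response shape (`message.tool_calls[].function.name` and `.arguments` as an object) is the one Ollama's chat endpoint returns; the handler registry and the stand-in `log_expense` body are hypothetical.

```python
from typing import Any, Callable, Dict, List


def log_expense(amount: float, description: str) -> str:
    # Stand-in handler; the real skill would write through the app's data layer.
    return f"logged {amount} for {description}"


HANDLERS: Dict[str, Callable[..., Any]] = {"log_expense": log_expense}


def dispatch_tool_calls(response: Dict[str, Any]) -> List[Any]:
    """Run every call in message.tool_calls; reject tools that were never registered."""
    results: List[Any] = []
    for call in response.get("message", {}).get("tool_calls") or []:
        fn = call["function"]
        handler = HANDLERS.get(fn["name"])
        if handler is None:
            raise KeyError(f"model requested unregistered tool: {fn['name']}")
        # Arguments arrive as a parsed object, so no regex or string parsing is needed.
        results.append(handler(**fn["arguments"]))
    return results
```

A plain-text reply simply has no `tool_calls` key, in which case the function returns an empty list.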
5. MULTI-AGENT ARCHITECTURE: v1.0 (BUILT)
Shipped in sprints S1-S4 (Omega v2.5). Brain Distillation stays v1.1.
5.1 SkillRouter (S1, Alpha + Beta-1)
- lib/core/ai/skill_router.dart: pluggable registry; two-phase match (keyword pre-filter → scored confidence).
- Shrunk LaivAgent from 882 → 294 lines.
- 17 built-in skills (per Omega v2.7; raised from the original 12 during TAW device-write audit): log_expense, create_event, find_place, log_meal, check_budget, plan_trip, host_event, suggest_activity, draft_message, navigate, web_lookup, general_chat, plus 5 added during TAW/v2.7 for real writes on meds/sessions/sports/AQ/reconnect flows.
- Multilingual trigger keywords EN/ES/PT/FR/DE per skill (pre-filter only; the model does the real understanding).
- LaivContext assembled by LaivContextAssembler from Riverpod providers: single source of truth for all skills (user name, cluster, management style, tier, location, module state snapshot).
- Unverified claim (v2.7): "Laiv skills execute real writes (17/17)"; device test pending.
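The two-phase match can be illustrated with a small sketch. The scoring rule (share of a skill's keywords found in the input), the 0.3 cut-off, and falling back to general_chat are all assumptions made for the example; the source only names the two phases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Skill:
    name: str
    keywords: List[str]  # multilingual trigger words, used for pre-filtering only


def route(text: str, skills: List[Skill], min_confidence: float = 0.3) -> str:
    words = set(text.lower().split())
    # Phase 1: keep only skills with at least one trigger keyword in the input.
    candidates = [s for s in skills if words & set(s.keywords)]
    # Phase 2: score each survivor (hypothetical score: matched share of its keywords).
    best_name, best_score = "general_chat", 0.0
    for skill in candidates:
        score = len(words & set(skill.keywords)) / len(skill.keywords)
        if score > best_score:
            best_name, best_score = skill.name, score
    return best_name if best_score >= min_confidence else "general_chat"
```

The pre-filter keeps phase 2 cheap: with 17 skills, most inputs reach the scorer with only one or two candidates.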
5.2 ObserverAgent (S2, Beta-2): READ-ONLY
- lib/core/ai/observer/observer_agent.dart: never modifies data; only writes to the observations SQLite table.
- 11 pattern rules: spending_spike, sleep_exercise, calendar_crunch, weather_spending, relationship_drift, habit_streak, commute_anomaly, meal_pattern, project_deadline, recurring_expense, AQ (rule #11, added in Omega v2.7).
- Runs on app_open / morning / evening. Dedupes within 48 h. Confidence threshold 0.6. Failing rules are caught and skipped — never crash the observer.
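The run loop described in these bullets could be sketched like this. Deduplicating per rule name and treating a confidence of exactly 0.6 as passing are assumptions; the rule signature and function name are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.6
DEDUPE_WINDOW = timedelta(hours=48)

# A rule returns (message, confidence), or None when it has nothing to report.
Rule = Callable[[dict], Optional[Tuple[str, float]]]


def run_observer(rules: Dict[str, Rule], snapshot: dict,
                 last_seen: Dict[str, datetime], now: datetime) -> List[dict]:
    observations: List[dict] = []
    for name, rule in rules.items():
        try:
            result = rule(snapshot)
        except Exception:
            continue  # a failing rule is skipped; it must never crash the observer
        if result is None:
            continue
        message, confidence = result
        if confidence < CONFIDENCE_THRESHOLD:
            continue
        previous = last_seen.get(name)
        if previous is not None and now - previous < DEDUPE_WINDOW:
            continue  # this rule already fired within the last 48 h
        last_seen[name] = now
        observations.append({"rule": name, "message": message, "confidence": confidence})
    return observations
```

The caller would persist the returned rows to the observations table; the function itself touches no other state.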
5.3 PlannerAgent + ExecutorAgent (S3, Alpha + Beta-1)
- Planner: reads pending observations → produces ActionPlans. Max 5 plans per session. 6 action types: review_budget, draft_message, reschedule_event, create_reminder, adjust_departure, log_meal_suggestion.
- Executor: NEVER writes to SQLite silently. Every autonomous action is rendered as an ActionPlanCard; SQLite write happens only on explicit user confirmation. 8 module actions supported; partial-failure path triggers rollback. Exactly 1 call site app-wide.
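The executor's two guarantees, no write without confirmation and rollback on partial failure, can be shown with a small sketch. Pairing each step with an explicit undo is an illustrative choice; the app may rely on a SQLite transaction instead, and the names below are hypothetical.

```python
from typing import Callable, List, Tuple

# Each step pairs an apply function with the undo that reverses it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def execute_plan(steps: List[Step], user_confirmed: bool) -> bool:
    """Apply steps only after confirmation; undo the applied ones if a later step fails."""
    if not user_confirmed:
        return False  # nothing is written without an explicit yes
    applied: List[Callable[[], None]] = []
    try:
        for apply, undo in steps:
            apply()
            applied.append(undo)
    except Exception:
        for undo in reversed(applied):
            undo()  # roll back in reverse order of application
        return False
    return True
```

Funnelling every write through one function like this is also what makes a single app-wide call site practical to audit.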
5.4 PersonalitySettings (S4, Gamma)
- SharedPreferences keys: warmth / verbosity / directness / humor (all 0-100) + preset.
- 5 presets: Coach (default, matches Chief of Staff), Friend, Assistant, Mentor, Custom.
- 4 sliders with preview text; selecting a preset snaps sliders; adjusting a slider flips preset to Custom.
- toStyleDirective() produces a natural-language block injected into PromptAssembler Layer 1 alongside the active persona from ai_personas table.
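The preset and slider interplay is sketched below. The numeric preset values, the low/medium/high bands, and the wording of the directive are invented for illustration; the source gives only the four keys, their 0-100 range, and the snap and flip-to-Custom rules.

```python
from dataclasses import dataclass

# Hypothetical preset values; the source does not list the real numbers.
PRESETS = {
    "Coach": {"warmth": 60, "verbosity": 40, "directness": 70, "humor": 30},
    "Friend": {"warmth": 85, "verbosity": 55, "directness": 45, "humor": 70},
}


@dataclass
class PersonalitySettings:
    warmth: int = 60
    verbosity: int = 40
    directness: int = 70
    humor: int = 30
    preset: str = "Coach"

    def select_preset(self, name: str) -> None:
        for key, value in PRESETS[name].items():
            setattr(self, key, value)  # sliders snap to the preset's values
        self.preset = name

    def set_slider(self, key: str, value: int) -> None:
        setattr(self, key, max(0, min(100, value)))  # clamp to the 0-100 range
        self.preset = "Custom"  # any manual change leaves the named preset

    def to_style_directive(self) -> str:
        def level(v: int) -> str:
            return "low" if v < 34 else "medium" if v < 67 else "high"
        return (f"Style: warmth {level(self.warmth)}, verbosity {level(self.verbosity)}, "
                f"directness {level(self.directness)}, humor {level(self.humor)}.")
```

Bucketing the sliders into a few named levels keeps the directive short, which matters given the ~300-token Layer 1 budget.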
6. PROMPTASSEMBLER: 5 LAYERS
Total prompt budget ~2,000 tokens (0.8B) to ~4,000 tokens (Ghost). Every word counts.
| Layer | Content | Size |
|---|---|---|
| L1 — PERSONA (static) | Who Laiv IS + active persona (from ai_personas) + PersonalitySettings style directive | ~300 tok |
| L2 — USER (dynamic) | Name, cluster (if confidence ≥ 0.65), cluster behavioral note, management style directive, focus areas, roadblock | ~200 tok |
| L3 — STATE (real-time) | Today's events, budget status, weather, allergens, AQ context (v2.7), health snapshot | ~300 tok |
| L4 — MODULE (contextual) | Where the user is in the app, module-specific state | ~200 tok |
| L5 — ACTIONS (static) | Tool definitions the model can call | ~500 tok |
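The layer sizes in the table add up to about 1,500 tokens, which leaves roughly 500 tokens of the 2,000-token 0.8B budget for the conversation itself. A sketch of enforcing the per-layer sizes follows; counting one token per whitespace-separated word is a crude stand-in for a real tokenizer, and the function names are hypothetical.

```python
# Per-layer budgets from the table, in tokens.
LAYER_BUDGETS = {"L1": 300, "L2": 200, "L3": 300, "L4": 200, "L5": 500}
RESERVED = sum(LAYER_BUDGETS.values())


def estimate_tokens(text: str) -> int:
    # Crude stand-in: one token per whitespace-separated word.
    return len(text.split())


def assemble(layers: dict, total_budget: int = 2000) -> str:
    parts = []
    for name, budget in LAYER_BUDGETS.items():
        words = layers.get(name, "").split()
        parts.append(" ".join(words[:budget]))  # trim any layer that overruns its share
    prompt = "\n\n".join(part for part in parts if part)
    if estimate_tokens(prompt) > total_budget:
        raise ValueError("prompt exceeds the model's budget")
    return prompt
```

Trimming per layer rather than truncating the assembled prompt keeps the tool definitions in L5 intact even when an earlier layer runs long.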
6.1 Layer 1 persona baseline ("Chief of Staff")
- Use contractions always. Default to 1-2 sentences. Expand only when the topic demands it.
- Never sycophantic ("Great question!"), never corporate (leverage, synergize, optimize).
- "I run entirely on the user's device. Their data never leaves their phone." Privacy stated once if asked, then move on.
- Vocabulary: "fine-tune" not "calibrate"; "set up" not "configure"; "your life" not "your data"; "I'll handle it" not "I'll process".
6.2 Cluster-specific L2 notes
13 cluster behavioral notes (juggler, warrior, professional, sovereign, student, creative, hustler, elder, chef, healer, tracker, scrapper, optimizer) — one-line directive each.
6.3 Management style directives
- gentle — warm, supportive, nudge. "Would you like me to..."
- strict — direct, accountable. "You missed yesterday's run. Let's not make it two."
- dashboard — minimal, reactive only. No proactive suggestions.
7. VOICE PIPELINE: KOKORO-FIRST (INVERTED)
7.1 Decision rationale
Users never heard the original ElevenLabs reference voice. Kokoro's approximation of it IS the first voice users hear. Piper is trained to match Kokoro (not ElevenLabs), so quality degrades gracefully. Voxtral clones Kokoro output for Ghost HD — same person, frontier quality across the chain.
7.2 The pipeline
ElevenLabs custom voice (existing reference audio, NEVER shipped to users)
↓ Bishop extracts StyleTTS 2 style vector
Kokoro 82M .pt (~500 KB) — "Laiv voice" canonical reference
↓ Kokoro generates reference corpus (500-1000 sentences EN/ES/PT)
├── Fine-tune Piper VITS → ONNX on-device (one per language, same speaker)
└── Feed as reference clip to Voxtral 3B → zero-shot embedding for Ghost HD (v1.1+)
7.3 Kokoro 82M (Ghost TTS, v1.0)
- StyleTTS 2 architecture, 82M params, Apache 2.0.
- 54 voices (11 female), 8 languages, CPU-only, sherpa_onnx-supported.
- Style vectors stored as .pt files (~500 KB), NOT full model weights.
- Zero-shot voice extraction from 15-20 s reference audio is approximate; that's fine because Kokoro IS the reference, not a clone target.
- Status: Epsilon eval sprint ready (SPRINT_EPSILON_KOKORO_EVAL.md). Pending final voice selection: J listens to samples of all 11 female voices.
7.4 Piper (on-device TTS)
- v1.0 on-device: en_US-hfc_female-medium (63 MB, Piper ONNX, bundled in APK) via sherpa_onnx. EN-accented in other languages but functional.
- v1.0 / v1.0.1 on-device (Bishop ready): Per-language Piper models fine-tuned from Kokoro corpus. All sound like the same speaker.
- Fine-tuning data: 80-150 samples (5-15 min audio) per language. ~20 min on GPU or ~1-2 h on CPU (Bishop Ryzen AI 9 HX 370 + 64 GB DDR5).
- NO pre-built Piper ES/PT downloads — different speakers would break voice continuity.
- Target: v1.0 if Bishop provisioned in time, otherwise v1.0.1.
7.5 Voxtral 3B (Ghost HD, v1.1+)
Zero-shot embedding from Kokoro reference → frontier-quality multilingual Ghost HD voice. Trained on Bishop. 9 languages.
7.6 Voice stack (final)
| Tier | Engine | Voice source | Quality | Where |
|---|---|---|---|---|
| v1.0 on-device | Piper | Generic hfc_female (bundled) | Good, EN accent in other langs | Phone |
| v1.0 Ghost | Kokoro 82M | Best female voice (pending J selection) | Near-natural, multilingual | CX43 |
| v1.0 / v1.0.1 on-device | Piper | Fine-tuned from Kokoro corpus | Good, Laiv voice, per-language | Phone |
| v1.1+ Ghost HD | Voxtral 3B | Cloned from Kokoro output | Frontier, 9 languages | Bishop training → CX43 inference |
7.7 Voice navigation
Uses the same Piper engine as Laiv TTS. NavVoiceService handles audio focus + timing; InstructionEnricher converts OSRM steps → natural language across 21 locales.
8. LAIV CHECKUP (v1.0 feature, Omega v2.7)
Overnight batch analysis of anonymized user data via Claude Sonnet 4.6 (Anthropic Batch API). Baked into subscription tiers — NOT Ghost credit-gated.
8.1 Three domains
Wellbeing · Planning · Financial Health.
8.2 Trial schedule
- Day 0 → Day 1 morning: Baseline checkup (Planning only — built from onboarding data). Delivered in the user's very first Morning Briefing. Sets the hook immediately.
- Day 14: Mid-trial (Planning updated + Wellbeing baseline). Delivered alongside Elite trial unlock.
- Day 28: Full checkup (all 3 domains) — FREE regardless of subscription status. Even if the user drops to Starter, they get it. The report IS the conversion pitch, not a paywall.
8.3 Tier cadence (post-trial)
Elite 1 mo · Pro 2 mo · Core 3 mo · Spark 6 mo · Starter none.
8.4 Dual anonymization pipeline
Device: PII-stripping collectors (zero PII in payload — aggregates, day-of-week only)
↓ HTTPS
CX43: Gemma 4 E4B anonymizer (second pass) → Anthropic Batch API as Citerius Holdings LLC
↓
Result: SQLite row retained locally forever (~$0.012 per checkup)
Collectors live in lib/core/services/checkup_collectors.dart (one per domain). Relay on CX43:8100. Service orchestrator: lib/core/services/checkup_service.dart.
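As an illustration of the device-side pass, the sketch below reduces raw spending rows to weekday-level category totals so that merchant names and exact dates never enter the payload. The row shape, the function name, and the choice of a spending example are hypothetical; the source says only that collectors emit aggregates with day-of-week granularity and zero PII.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def collect_spending(
    rows: Iterable[Tuple[date, str, str, float]]
) -> Dict[str, Dict[str, float]]:
    """Reduce (date, merchant, category, amount) rows to per-weekday category totals.

    The merchant name and the calendar date are dropped; only the weekday survives.
    """
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for when, _merchant, category, amount in rows:
        totals[DAYS[when.weekday()]][category] += amount
    return {day: dict(categories) for day, categories in totals.items()}
```

Because identifying fields are discarded before serialization, the server-side anonymizer acts as a second line of defence rather than the only one.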
9. LAIV BRAIN DISTILLATION: v1.1+ (NOT v1.0)
Needs real user data. Launch with vanilla Qwen 3.5 (stable, proven), fine-tune after 4-6 weeks of real usage.
- Target: custom fine-tunes of Qwen 3.5 0.8B / 2B / 4B on alaivOS-specific training data.
- Data: ~3,500 curated examples across 10 categories (brain dump, command routing, receipt OCR, meal → nutrition, daily conversations, financial patterns, travel, proactive suggestions, multilingual, error recovery). All include <think> traces.
- Training: Unsloth + HuggingFace TRL · SFT response-only loss · 1 epoch (avoids catastrophic forgetting) · Bishop (CPU) or 1× A100 80 GB rental.
- Cost: ~$475-710 total (Claude Opus data generation dominates).
- Timeline: 4-6 weeks post-launch.
- Custom eval suite: brain dump accuracy, receipt F1, module routing accuracy, meal estimation ±20% of USDA, response length, multilingual parity, voice/tone (J approval on 50 samples), hallucination rate.
- Deployment: CDN model swap; manifest.json version bump; users wake up to smarter Laiv with zero action.
- Mistral Small 4 evaluation: deferred to v1.1+ (needs GEX44 revenue justification).
10. DEAD / REMOVED
| Item | Status | Notes |
|---|---|---|
| Cloud Gemini (AiProvider.gemini/openai/claude) | DEAD | Enum reduced to {local, ghost}; code removed |
| ElevenLabs shipped voice | DEAD | 72 refs → 0. Custom EL reference stays as Bishop training input only, never shipped |
| Google Speech Services STT / Google WaveNet TTS | DEAD | Replaced by on-device pipeline |
| Per-skill Ghost pricing | DEAD | Credits are the ONLY gate (see GHOST_PROTOCOL.md) |
| Qwen 3.5 9B on-device | DEAD | Ghost-only; CDN file retained for 16+ GB future phones |
| "0.8B always loaded" | DEAD | AMI dynamic load/unload only |
| Pre-built Piper ES/PT downloads | DEAD | Different speaker — breaks voice continuity |
| Qwen3-TTS self-host (Mar 2026 strategy doc) | Superseded | Replaced by Kokoro-first pipeline |
11. INFRASTRUCTURE TIE-IN
See INFRASTRUCTURE.md for full detail.
- CX43 (€17/mo): Ollama (Gemma 4 E4B) on :11434, ghost-router, sports-cache :8300, checkup-relay :8100, nginx :443, coturn :3478/5349. Gemma 4 E4B at 12 tok/s keeps Ghost viable on current hardware — no downgrade to CX23.
- Bishop (mini PC, not a GPU server): AMD Ryzen AI 9 HX 370, 64 GB DDR5, Radeon 890M iGPU + XDNA 2 NPU (50 TOPS), no discrete GPU, no CUDA. Role: one-time training (voice pipeline + Brain Distillation) + personal SmartLab. Training on CPU — slower but sufficient for one-time jobs.
- CDN (cdn.alaivos.com/models/, Cloudflare R2): 7 tier files + manifest.json v3.
- Scaling triggers: >500 Ghost subs → second node or CX43 upgrade; >1000 Ghost subs → dedicated GPU (GEX44 €184/mo).
12. KEY METRICS (April 13, 2026)
| Metric | Value |
|---|---|
| AiProvider enum values | 2 (local, ghost) |
| On-device tiers | 5 (xs/s/m/l/xl) — practical ceiling is m today |
| Ghost tier | 1 (ghost-std) |
| Ghost tok/s | 12.02 (Gemma 4 E4B, CPU-only) |
| Function-calling languages verified | EN / ES / PT |
| SkillRouter skills | 17 |
| Observer rules | 11 (incl. AQ #11) |
| Planner action types | 6 |
| Executor module actions | 8 |
| PersonalitySettings presets | 5 (Coach default) |
| PromptAssembler layers | 5 |
| PromptAssembler cluster notes | 13 |
| Checkup domains | 3 |
| Kokoro female voices shortlisted | 11 |
| Piper fine-tune samples needed | 80-150 per language |
| CDN model files | 7 + manifest.json |
This document is the single source of truth for alaivOS AI + Voice architecture. When it contradicts a sprint brief, this wins. Cross-reference: GHOST_PROTOCOL.md, PRODUCT_SCOPE.md, INFRASTRUCTURE.md.