LESSONS_LEARNED.md — alaivOS Battle-Tested P0 Lessons
Last updated: April 13, 2026 (Omega v2.7)
Status: Canonical. Single source of truth for P0 lessons, forbidden patterns, and Ground Truth (GT) rules.
Supersedes:
- files/OMEGA_V2_5_SESSION_HANDOVER.md (P0 lesson fragments)
- files/OMEGA_V2_6_SESSION_HANDOVER.md (TAW9-10 fix lessons)
- files/OMEGA_V2_7_SESSION_HANDOVER.md (thinking-mode + empty-catch lessons)
- files/STALE_CONTENT_AUDIT.md (stale-content incidents)
- files/P0_blank_screen_diagnosis.md, files/P0_clean_slate_rebuild.md
- files/P0v2_deep_diagnostic_from_masterplan.md, files/P0v2_multi_device_failure.md
- files/loose_bolts_audit.md, files/seeding_audit.md
- files/download_failure_report.md
- files/alpha_* diagnostic docs (APK size, Gemma theory, navigation, overlay, revised diagnostics)
- files/beta_* diagnostic docs (trace reports, masterplan fixes, circadian background)
- files/hotfix_*.md (download wakelock, import screen, onboarding keyboard, shared events, table name)
- files/architect_response_to_alpha_report.md
- files/masterplan_review_shortcomings.md, files/masterplan_v3_review.md
Digest: CLAUDE.md P0 Lessons section cross-references this file. Any lesson here overrides an older contradictory doc.
Format: Each lesson has Symptom / Root cause / Fix / Where / Discovered.
1. Widget / State / Navigation
1.1 Never switch a Stack child between Positioned and non-Positioned at runtime
- Symptom: Widget tree blows up with an `Incorrect use of ParentDataWidget` assertion; screen goes blank or throws mid-frame on state change.
- Root cause: Flutter's `RenderStack` expects each child's parent-data shape to be stable. Mutating a child from `Positioned(...)` to a bare `Widget` (or vice versa) between rebuilds invalidates parent data and crashes the render pass.
- Fix: Decide Positioned-ness at mount time. Either always wrap in `Positioned.fill` (and toggle visibility via opacity / `IgnorePointer`) or keep it always non-Positioned and use a sibling.
- Where: Home screen Stack overlays, dock/scrim layers, OmniOrb compose stack.
- Discovered: P0v2 multi-device failure (pre-Omega), re-confirmed in dock scrim refactor.
1.2 KeyedSubtree inside a StatefulWidget's build() preserves State
- Symptom: Swapping subtrees caused child StatefulWidgets to lose scroll / form / controller state; this was unintended.
- Root cause / Fix: Wrapping a subtree in `KeyedSubtree(key: const ValueKey('stable'))` stabilizes element identity across rebuilds so State is preserved. Use this (not GlobalKey) when you need cheap identity stability inside a rebuild.
- Where: Dock module hosts, tabbed module shells.
- Discovered: alpha_navigation_fix; re-used in Gamma dock refactor.
1.3 A ValueKey on MaterialApp is DESTRUCTIVE
- Symptom: Rebuild blew away the entire Navigator stack; deep-link, back-button, modal-dismiss state all lost; route animations reset.
- Root cause: Changing the key on `MaterialApp` forces Flutter to dispose the whole app element tree, including Navigator.
- Fix: Never put a `ValueKey` or any rebuild-changing key on `MaterialApp`. Scope keys to inner subtrees. Use theme/locale via provider, not key remount.
- Where: `lib/app.dart` / root app widget.
- Discovered: P0 blank screen diagnosis; architect_response_to_alpha_report.
1.4 Riverpod state throttle — max 2/sec for high-frequency updates
- Symptom: UI jank / dropped frames during model downloads; CPU pinned; battery heat.
- Root cause: Download progress callbacks fired 60-200/sec, each triggering `state = ...` and a full dependent-widget rebuild.
- Fix: Throttle at the notifier to max 2 updates/sec (500 ms window). Also applies to GPS stream consumers and traffic-factor tickers.
- Where: `model_download_provider.dart`, `gps_stream_provider.dart`, traffic factor chip notifiers.
- Discovered: hotfix_download_wakelock_chunked.
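The 500 ms window from lesson 1.4 can be sketched in plain Dart without any Riverpod dependency. `ProgressThrottle`, its `clock` parameter, and `flush()` are hypothetical names for illustration, not classes from the codebase. The sketch emits the first value immediately, drops intermediate values inside the window, and relies on an explicit `flush()` so the final value (for example 100 %) is not lost.

```dart
/// Minimal sketch of a leading-edge throttle with a final flush.
/// Assumption: the caller invokes `flush()` when the source completes.
class ProgressThrottle<T> {
  ProgressThrottle(
    this.onEmit, {
    this.window = const Duration(milliseconds: 500),
    DateTime Function()? clock,
  }) : _clock = clock ?? DateTime.now;

  /// Receives at most one value per [window], e.g. `(v) => state = v`.
  final void Function(T value) onEmit;
  final Duration window;
  final DateTime Function() _clock;

  DateTime? _lastEmit;
  T? _pending;
  bool _hasPending = false;

  /// Call this from the high-frequency callback.
  void add(T value) {
    final now = _clock();
    final last = _lastEmit;
    if (last == null || now.difference(last) >= window) {
      _lastEmit = now;
      _hasPending = false;
      onEmit(value);
    } else {
      // Inside the window: remember only the newest value.
      _pending = value;
      _hasPending = true;
    }
  }

  /// Emits the last suppressed value, if any.
  void flush() {
    if (_hasPending) {
      _hasPending = false;
      _lastEmit = _clock();
      onEmit(_pending as T);
    }
  }
}
```

Inside a notifier, `onEmit` would be the single place that assigns `state`, so dependent widgets rebuild at most twice per second regardless of how often the download callback fires.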
1.5 Stack child identity — stable keys only
- Symptom: Overlay flicker / remount on orientation change.
- Root cause: Anonymous children with no key; Flutter matched by type+index, which shifted when one child became conditional.
- Fix: Give every conditional Stack child a `const ValueKey('...')`. Never rely on positional identity alone.
- Where: Home scrim, map overlay, nav-mode banner.
- Discovered: alpha_updated_overlay_test.
2. Performance / Heat / Battery
2.1 SHA-256 on large files (1 GB+) blocks the UI thread
- Symptom: UI frozen 8-30 s after a model download finished; ANR warnings on mid-tier Androids.
- Root cause: `crypto.sha256.convert(bytes)` is synchronous; hashing a 3 GB GGUF on the UI isolate blocks render.
- Fix: Size-match validation (byte length against `manifest.json`) is the default. SHA-256 is deferred to a background isolate and only runs for first-boot or tampering-check; never blocks download UI.
- Where: `laiv_download_service.dart`, `model_registry.dart`.
- Discovered: download_failure_report; hotfix_download_wakelock_chunked.
2.2 Debug APK runs 3-5× hotter than release
- Symptom: Pixel 7 Pro thermal-throttled within 3 min on debug builds; users reported "phone burning."
- Root cause: Debug builds run on the JIT Dart VM with assertions and the VM service (Observatory) enabled, and skip AOT compilation and tree-shaking.
- Fix: Always test heat / battery on `flutter build apk --release` (or `--profile`). Debug timings are meaningless for thermal QA.
- Where: Release checklist, device-QA docs.
- Discovered: P0 clean slate rebuild.
2.3 Lazy module loading — modules init on first dock tap, not at startup
- Symptom: 7-12 s cold-start, 800 MB peak RAM at boot on mid-tier devices.
- Root cause: Every module (Map, Money, Health, People, Events, Notes, Chat, Streams, Laiv, ...) was initialized at `main()`.
- Fix: `ModuleLoader.ensureLoaded(moduleId)` on first dock tap. AMI loads AI only on first Laiv interaction, behind splash/home animation.
- Where: `module_loader.dart`, dock tap handlers, `ami_service.dart`.
- Discovered: Omega v1 cold-start sprint.
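A load-once-on-first-tap loader can be sketched as a map of memoized futures. Only the `ensureLoaded(moduleId)` name comes from lesson 2.3; the constructor, the initializer map, and the retry-on-failure behaviour are assumptions made for illustration and may differ from the real `module_loader.dart`.

```dart
import 'dart:async';

/// Hypothetical stand-in for the real loader.
class ModuleLoader {
  ModuleLoader(this._initializers);

  /// Maps a module id to its (possibly expensive) async init function.
  final Map<String, Future<void> Function()> _initializers;

  /// In-flight or completed loads, so each module initializes at most once.
  final Map<String, Future<void>> _loads = {};

  Future<void> ensureLoaded(String moduleId) {
    final existing = _loads[moduleId];
    if (existing != null) return existing;

    final init = _initializers[moduleId];
    if (init == null) {
      throw ArgumentError.value(moduleId, 'moduleId', 'unknown module');
    }
    // Cache the future itself so two quick taps share a single init.
    final load = init();
    _loads[moduleId] = load;
    // Assumption: a failed load is forgotten so a later tap can retry.
    load.then<void>((_) {}, onError: (Object _) {
      _loads.remove(moduleId);
    });
    return load;
  }
}
```

A dock tap handler would `await loader.ensureLoaded('map')` before pushing the module's route; nothing is initialized in `main()`.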
2.4 GPS discipline — 5 modes (idle / significantChange / mapExplore / navigating / paused)
- Symptom: 18-40% battery/hr drain with map open in background.
- Root cause: Always-on high-accuracy GPS stream.
- Fix: Centralized `GpsManager` with 5 explicit modes. `idle` = off, `significantChange` = ~500 m coarse, `mapExplore` = 10 m medium, `navigating` = 1 Hz high, `paused` = frozen.
- Where: `lib/core/location/gps_manager.dart`.
- Discovered: P0v2 deep diagnostic.
3. Data / SQLite
3.1 database is locked (code 5) — all writes MUST go through serializedWrite()
- Symptom: Random night-window crashes around 02:00-04:00 local; SQLite error code 5 in logs; embedding and ghost-credit writes aborted mid-batch.
- Root cause: `embedding_repository.dart` and `ghost_credit_service.dart` bypassed the serializer and opened parallel write transactions during 50+ concurrent night batch writes (embedding backfill + credit reconciliation overlap).
- Fix: Routed both through the central `serializedWrite()` mutex queue. Any new writer must use it; direct `db.insert` / `db.update` from multiple isolates is banned.
- Where: `lib/core/db/write_serializer.dart`, callers in `embedding_repository.dart`, `ghost_credit_service.dart`.
- Discovered: TAW9 (Omega v2.6).
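The mutex-queue idea behind `serializedWrite()` can be illustrated with a chained future. This is a sketch under stated assumptions, not the contents of `write_serializer.dart`: the `WriteSerializer` class name and the generic signature are hypothetical, and a future chain only orders writes within one isolate, so writers on other isolates would still have to hand their work to the isolate that owns the queue.

```dart
import 'dart:async';

/// Hypothetical single-isolate write queue.
class WriteSerializer {
  /// Completes when the most recently queued write has finished.
  Future<void> _tail = Future<void>.value();

  /// Runs [action] only after every previously queued write has finished.
  Future<T> serializedWrite<T>(Future<T> Function() action) {
    final result = _tail.then((_) => action());
    // Swallow the error on the internal chain only, so one failed write
    // does not block the writes queued behind it. The caller still sees
    // the failure through the returned future.
    _tail = result.then<void>((_) {}, onError: (Object _) {});
    return result;
  }
}
```

A caller would wrap its transaction, for example `serializer.serializedWrite(() => db.insert(...))`, where `db` stands for whatever SQLite handle the repository already holds.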
3.2 Fresh-install / data-wipe onboarding loop
- Symptom: After clearing app data, signed-in user with a server profile was forced through the 45-question interview again.
- Root cause: Local DB had no profile row; onboarding gate checked local only.
- Fix: After login, query Supabase `user_profiles` via RLS-fallback query (treat RLS-denied as "exists elsewhere, retry"). If `display_name` exists, skip onboarding and restore locally.
- Where: `onboarding_gate.dart`, `user_profile_repository.dart`.
- Discovered: TAW10 device-testing bugs (J's Pixel 7 Pro, V2_6 session).
3.3 Fake seed data removal (V2_5)
- Symptom: 47 placeholder rows ("Sample Event", "Demo Contact") shipped to users.
- Root cause: Dev-time seeds not gated by `kDebugMode`.
- Fix: All 47 removed; seeding utility now refuses to run in release.
- Where: `seed_data.dart`, migration 0042.
- Discovered: seeding_audit (Omega v2.5).
4. Auth / Onboarding
4.1 "Username always taken" bug
- Symptom: Every username (even random unique ones) reported as taken during signup.
- Root cause: The username-availability probe's `catch { return false; }` treated any error (including Supabase RLS denials and network blips) as "taken."
- Fix: Catch block now returns `true` (treat unknown errors as "not taken, try it"). Real collisions are caught definitively at INSERT time by the unique constraint, with a clean UX error there.
- Where: `username_service.dart` `isAvailable()`.
- Discovered: TAW10 BUG 1 (Omega v2.6).
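The corrected fallback can be sketched as follows. Only the `isAvailable` name comes from the lesson; the `lookup` callback is a hypothetical stand-in for the real Supabase query and is assumed to return `true` when a row with that username already exists.

```dart
/// Sketch of an availability probe that fails open.
Future<bool> isAvailable(
  String username,
  Future<bool> Function(String username) lookup,
) async {
  try {
    final exists = await lookup(username);
    return !exists;
  } catch (_) {
    // ignore: swallowed — an unknown failure (RLS denial, network blip)
    // must not be reported as "taken". The unique constraint at INSERT
    // time remains the final authority on real collisions.
    return true;
  }
}
```

The trade-off is deliberate: a false "available" costs one extra round trip and a clean error at INSERT, while a false "taken" blocks signup entirely.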
4.2 Onboarding keyboard covering submit button
- Symptom: On short devices, user couldn't tap Continue because keyboard obscured it.
- Root cause: Screen used fixed layout; no scroll.
- Fix: Wrapped body in `SingleChildScrollView`, set `resizeToAvoidBottomInset: true` on the enclosing `Scaffold`, and pinned the submit button to the bottom of the screen.
- Where: `onboarding_question_screen.dart`.
- Discovered: hotfix_onboarding_keyboard.
5. Maps / POI / Tiles
5.1 POI labels cluttering the map at mid-zoom
- Symptom: At z14-15 the map was unreadable: cards stacked, labels overlapped, frame rate dropped.
- Root cause: Full glass POI cards rendered at every zoom level.
- Fix: Glass pin ladder: dots at z13-15, mini pins at z16, detail cards at z17+. Reduces draw calls and restores legibility.
- Where: `poi_marker_layer.dart`, `poi_render_policy.dart`.
- Discovered: TAW10 (Omega v2.6).
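The zoom ladder reduces to a small pure function. The enum and function names below are hypothetical, and two behaviours are assumptions the lesson does not state: nothing is drawn below z13, and fractional zoom levels fall into the band of their integer floor.

```dart
/// Hypothetical render styles for a POI marker.
enum PoiMarkerStyle { hidden, dot, miniPin, detailCard }

/// Maps a map zoom level to the marker style from the glass pin ladder:
/// dots at z13-15, mini pins at z16, detail cards at z17+.
PoiMarkerStyle styleForZoom(double zoom) {
  if (zoom >= 17) return PoiMarkerStyle.detailCard;
  if (zoom >= 16) return PoiMarkerStyle.miniPin;
  if (zoom >= 13) return PoiMarkerStyle.dot;
  return PoiMarkerStyle.hidden; // assumption: no POI markers below z13
}
```

Keeping the policy in one function makes it straightforward to pin the thresholds with a unit test, independent of the marker layer that consumes it.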
5.2 Tile cache looked broken — wasn't
- Symptom: Users reported tiles "re-downloading every session."
- Root cause: FMTC init was correct; perceived slowness came from the POI overlay's render complexity consuming the frame budget before tiles could paint.
- Fix: Don't touch FMTC. Fixing POI render (5.1) restored perceived tile speed.
- Where: `fmtc_init.dart` (unchanged).
- Discovered: TAW10.
5.3 OSRM coordinate order — longitude,latitude (NOT lat,lng)
- Symptom: Routes drawn across the ocean; ETA nonsense.
- Root cause: OSRM URLs expect `lon,lat` pairs; Flutter / Dart convention is `lat,lng`.
- Fix: All OSRM URL builders format as `${lng},${lat}`. Helper `OsrmCoord.of(LatLng)` enforces this. Unit test in `osrm_builder_test.dart` pins the order.
- Where: `osrm_client.dart`.
- Discovered: Early nav sprint; re-asserted in alpha_navigation_fix.
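A dependency-free sketch of the ordering rule is shown below. `Coord`, `osrmCoord`, and `osrmRoutePath` are hypothetical names that stand in for `LatLng` and the real `OsrmCoord.of` helper, whose actual signatures are not shown in this document. The path shape follows the OSRM route service layout of `/route/v1/{profile}/{coordinates}`.

```dart
/// Stand-in for a `LatLng` type: latitude first, as Dart map packages
/// usually order it. Hypothetical, for illustration only.
class Coord {
  const Coord(this.lat, this.lng);
  final double lat;
  final double lng;
}

/// Formats one coordinate the way OSRM expects it: longitude, then latitude.
String osrmCoord(Coord c) => '${c.lng},${c.lat}';

/// Builds the path of an OSRM route request for the `driving` profile.
/// Coordinates are joined with ';' in travel order.
String osrmRoutePath(List<Coord> stops) {
  if (stops.length < 2) {
    throw ArgumentError('a route needs at least two coordinates');
  }
  final joined = stops.map(osrmCoord).join(";");
  return '/route/v1/driving/$joined';
}
```

For example, `osrmCoord(const Coord(52.52, 13.405))` yields `13.405,52.52`, which is the swap the unit test in the lesson is meant to pin.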
5.4 Nominatim requires User-Agent: alaivOS/1.0
- Symptom: 403 Forbidden from nominatim.openstreetmap.org.
- Root cause: Nominatim ToS requires an identifying UA; default Dart UA is blocked.
- Fix: All Nominatim HTTP calls set `User-Agent: alaivOS/1.0` and include a contact fallback.
- Where: `nominatim_client.dart`.
- Discovered: Early geocoding sprint.
5.5 Traffic factor chips — minutes, not percentages
- Symptom: Users didn't understand "+18% traffic" chip.
- Root cause: Engineers displayed the raw multiplier.
- Fix: Chips render as minutes: "Rain expected — adds ~8 min", "Holiday traffic — adds ~12 min". Five-layer composite ETA (`baseline_spline × live_calibration × weather × calendar × event`) computes delta internally; UI shows human minutes.
- Where: `traffic_factor_chip.dart`, `traffic_intelligence/`.
- Discovered: Omega v2.4 UX pass.
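Converting a multiplier into chip minutes can be sketched as below. The function names are hypothetical, and the attribution rule (each factor's delta is measured against the baseline alone) is an assumption; the real `traffic_intelligence/` code may split the composite delta differently.

```dart
/// Composite ETA in minutes: the baseline scaled by every layer's factor.
double compositeEtaMinutes(double baselineMinutes, List<double> factors) =>
    factors.fold(baselineMinutes, (eta, factor) => eta * factor);

/// Minutes a single factor adds on top of the baseline, rounded for display.
/// Assumption: a factor of 1.18 means "18% slower than baseline".
int addedMinutes({required double baselineMinutes, required double factor}) =>
    (baselineMinutes * (factor - 1)).round();
```

With a 45-minute baseline and a weather factor of 1.18, `addedMinutes` returns 8, which the chip would render through an `l10n` string rather than as "+18%".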
6. Networking / Streams / External APIs
6.1 Radio stream playback failures
- Symptom: Tapping a radio station produced no audio; silent fail.
- Root cause: Player hit playlist/redirect URLs (`.pls`, `.m3u`) directly without resolving `content-type`, and some servers required a UA header.
- Fix: Radio pipeline now does HEAD request → inspect `content-type` → resolve actual stream URL, sets `User-Agent` header, logs every failure with station id + status.
- Where: `radio_stream_resolver.dart`.
- Discovered: TAW10 (Omega v2.6).
6.2 Podcast "popular" keyword search noise
- Symptom: Searching "popular podcasts" returned shows with the word "popular" in the title.
- Root cause: iTunes Search API keyword endpoint was used as a discovery proxy.
- Fix: Swapped to the iTunes Top Podcasts feed for the default / trending surface. Keyword search retained only for explicit user queries.
- Where: `podcast_discovery_service.dart`.
- Discovered: TAW10.
6.3 Weather / AQ from CDN, NOT live Open-Meteo
- Symptom: Open-Meteo rate limits tripped; stale-looking weather tiles.
- Root cause: Every open of the map queried Open-Meteo directly.
- Fix: Weather and AQ come from the pipeline CDN (`cdn.alaivos.com`): ≤30 min stale, prewarmed for 290 cities. Only RainViewer radar tiles need live net.
- Where: `weather_repository.dart`, `aq_repository.dart`.
- Discovered: P0v2; re-confirmed in STALE_CONTENT_AUDIT.
7. Localization / UI Primitives
7.1 Hardcoded English strings
- Symptom: Shipped English text in ES/PT builds; GT violation.
- Root cause: Devs typed user-visible text inline.
- Fix: Every user-visible string goes through `l10n` (ARB × 21 locales, ~7,300 EN keys). Lint rule scans for string literals inside `Text(...)`, `Tooltip(...)`, `SnackBar(...)`. Forbidden to hardcode.
- Where: `lib/l10n/*.arb`, lint rule `no_hardcoded_strings`.
- Discovered: Omega v2.3 localization audit.
7.2 withOpacity is FORBIDDEN — use withValues(alpha: X)
- Symptom: Precision loss warnings (Flutter 3.27+); GT violation in audit.
- Root cause: Legacy API, deprecated.
- Fix: Codebase-wide swap to `.withValues(alpha: X)`. Pre-commit blocks any new `withOpacity`.
- Where: All UI files.
- Discovered: Omega v2.4.
7.3 debugPrint must be guarded by kDebugMode
- Symptom: Production logs polluted with debug spam; trivial PII leak risk; test harness tripped on unexpected output.
- Root cause: Bare `debugPrint(...)` calls.
- Fix: Every `debugPrint` is wrapped in a braced `if (kDebugMode) { debugPrint(...); }` block. The multi-line form is required: a one-liner `if (kDebugMode) debugPrint(...)` breaks test-time string matching. Lint rule enforced. Unguarded count = 0.
- Where: All files; lint `debug_print_guarded`.
- Discovered: Omega v2.5.
8. Privacy / Branding / Legal
8.1 "Offline AI" is a FORBIDDEN phrase
- Symptom: Marketing copy said "offline AI", which is inaccurate (Ghost is online) and weak positioning.
- Root cause: Engineer shorthand leaked into user-visible strings.
- Fix: Canonical framing is "Zero-Data-Harvesting Architecture". "Local Supremacy" is secondary. "Offline AI" is banned in ARB, website, store listings, marketing.
- Where: ARB files, `WEBSITE_SPEC.md`, store copy.
- Discovered: Omega v2.2 brand pass.
8.2 Curated legal learning sources only
- Symptom: Early learning-module wireframes linked to anna-archive, oceanofpdf, pdfdrive.
- Root cause: Dev pulled first search results.
- Fix: Approved list only. Books: gutenberg.org, openlibrary.org, standardebooks.org, manybooks.net. Courses: classcentral.com, khanacademy.org, theodinproject.com, openculture.com, alison.com. Piracy domains are banned; Apple/Google would reject the app.
- Where: `learning_sources.dart`, `LEGAL_AND_PRIVACY.md`.
- Discovered: legal review pre-submission.
8.3 Health data never leaves device
- Symptom: N/A (preventive rule).
- Root cause: Would violate Local Supremacy + HIPAA-adjacent sensitivities.
- Fix: Health data is never synced to cloud, never in AI prompts sent to Ghost. Enforced in `prompt_assembler.dart` by exclusion filter + unit test.
- Where: `prompt_assembler.dart`, `health_repository.dart`.
- Discovered: LOCKED decision pre-Omega.
9. Models / AI / AMI
9.1 AiProvider enum pruned to {local, ghost} only
- Symptom: Dead code paths for `gemini`, `openai`, `claude` still compiled; confusing contributors.
- Root cause: Cloud Gemini was dropped but enum values remained.
- Fix: Enum reduced to `local` and `ghost`. All switches exhaustive. Gemini/OpenAI/Claude SDKs removed from pubspec.
- Where: `ai_provider.dart`, `ai_router.dart`.
- Discovered: Omega v2.5.
9.2 Thinking mode OFF by default on Gemma 4 Instant
- Symptom: 25-35 s latency spikes on simple Ghost requests.
- Root cause: Gemma 4's `thinking` mode was defaulted on; for instant short replies (Smart Flutter Responses, single-turn Laiv) it added a long chain-of-thought preamble.
- Fix: `thinking: false` is the default for Ghost Instant. Turned on only for explicit deep tasks (plans, multi-step executor, research). Configurable per-skill via `SkillRouter`.
- Where: `ghost_client.dart`, `skill_router.dart`.
- Discovered: Omega v2.7.
9.3 Never two models loaded at once
- Symptom: OOM / swap thrash on mid-tier Androids.
- Root cause: AMI briefly held both an old and new tier during a swap.
- Fix: AMI unloads current model before loading the next. One model at a time. Zero always-resident. J explicitly rejected "0.8B always loaded."
- Where: `adaptive_model_manager.dart`.
- Discovered: alpha_apk_size_and_gemma_theory; re-asserted in LOCKED decisions.
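The unload-before-load ordering can be sketched with a single slot whose swaps run one at a time. `LoadedModel` and `SingleModelSlot` are hypothetical types for illustration and do not mirror the real `adaptive_model_manager.dart`; queuing concurrent swaps is an added assumption so that two overlapping requests cannot both be mid-load.

```dart
import 'dart:async';

/// Hypothetical handle to a model held in memory.
class LoadedModel {
  LoadedModel(this.name, {required Future<void> Function() unload})
      : _unload = unload;

  final String name;
  final Future<void> Function() _unload;

  Future<void> unload() => _unload();
}

/// Holds at most one model; swaps are queued and run strictly in order.
class SingleModelSlot {
  LoadedModel? _current;
  Future<void> _tail = Future<void>.value();

  LoadedModel? get current => _current;

  Future<void> swapTo(Future<LoadedModel> Function() load) {
    final Future<void> swap = _tail.then((_) async {
      final old = _current;
      _current = null;
      // Free the old model first so two models never coexist in memory.
      if (old != null) await old.unload();
      _current = await load();
    });
    // Keep the queue usable even if this swap fails.
    _tail = swap.then<void>((_) {}, onError: (Object _) {});
    return swap;
  }
}
```

The cost of this ordering is a short window with no model resident, which is the gap the Smart Flutter Responses in 9.4 are meant to cover.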
9.4 Smart Flutter Responses bridge the model-load gap
- Symptom: Users tapped notifications and waited 3-6 s for Laiv while the model loaded → perceived unresponsiveness.
- Root cause: Notification tap was gated on AI.
- Fix: 10 data-driven notification handlers (`smart_flutter_response.dart` + `laiv_message_queue.dart`) respond instantly from SQLite. If the user replies, real AI picks up (model now loaded in background).
- Where: `smart_flutter_response.dart`, `laiv_message_queue.dart`.
- Discovered: TAW10 (Omega v2.6).
10. Lint / Ground Truth (GT) Violations
10.1 Empty catch blocks (silent failures)
- Symptom: Audit flagged ~200 items in TAW8: `try { ... } catch (_) {}` that swallowed errors.
- Root cause: Copy-paste defensiveness.
- Fix: Every catch must either (a) rethrow, (b) log via guarded `debugPrint`, or (c) have an inline justification comment (`// ignore: swallowed — <reason>`). Sweep completed in TAW8 per Alpha; still pending Delta verification post-APK.
- Where: codebase-wide.
- Discovered: TAW8.
10.2 GT violation count = 0 (current)
- Symptom: None (invariant).
- Rule: `dart analyze` must report 0 errors, 0 warnings, 0 GT violations. Infos allowed (~310). New tests are added alongside existing ones, never as replacements; the test count only goes UP.
- Where: CI / pre-commit.
- Discovered: Project constitution (CLAUDE.md).
Forbidden Patterns (Quick Reference)
| Forbidden | Use Instead |
|---|---|
| `withOpacity(x)` | `withValues(alpha: x)` |
| Hardcoded English in UI | `l10n.<key>` |
| Bare `debugPrint(...)` | `if (kDebugMode) { debugPrint(...); }` |
| `ValueKey` on `MaterialApp` | Scope keys to subtrees only |
| Stack child switching Positioned ↔ non-Positioned | Fix Positioned-ness at mount |
| Direct `db.insert` from multi-writer paths | `serializedWrite(...)` |
| Live Open-Meteo on map open | CDN repository (`weather_repository`) |
| SHA-256 on 1 GB+ file on UI thread | Size-match + deferred hash in isolate |
| Debug APK for heat QA | Release / profile build |
| "Offline AI" phrase | "Zero-Data-Harvesting Architecture" |
| Piracy learning sources | Curated legal list only |
| `AiProvider.gemini` / `openai` / `claude` | `AiProvider.local` or `AiProvider.ghost` |
| Always-resident on-device model | AMI dynamic load/unload, one at a time |
| OSRM `lat,lng` URLs | `lng,lat` (longitude first) |
| Nominatim without UA | `User-Agent: alaivOS/1.0` |
| Traffic chips in % | Minutes ("adds ~8 min") |
| POI full cards at z14-15 | Glass pins: dots z13-15, mini z16, detail z17+ |
| `setState` for complex logic | Riverpod `ref.watch` |
| PowerSync | Direct SQLite |
| Module init at startup | Lazy via `ModuleLoader.ensureLoaded()` |
| Always-on high-accuracy GPS | `GpsManager` 5-mode discipline |
| Gemma thinking-on default | `thinking: false` for Instant paths |
Pre-Commit Checklist
- `dart analyze`: 0 errors, 0 warnings, 0 GT violations.
- `flutter test`: all passing; count has gone UP, not down.
- Grep for `withOpacity`: 0 hits.
- Grep for unguarded `debugPrint`: 0 hits.
- Grep for hardcoded strings in `Text(` / `Tooltip(` / `SnackBar(`: 0 hits.
- Grep for `"offline AI"`: 0 hits (case-insensitive).
- New writer? Confirm it goes through `serializedWrite()`.
- New notification? Confirm Smart Flutter Response handler exists.
- New external API? Confirm it's not a paid dep (zero-paid-API rule).
- Heat test on release APK if performance-sensitive path touched.
GT Audit Rules
- GT1: Credits are the only Ghost gate — no capability gating.
- GT2: AiProvider has exactly 2 values: `local`, `ghost`.
- GT3: On-device = Qwen 3.5 only. Ghost = Gemma 4 E4B only.
- GT4: One model loaded at a time. Never two. Never always-resident.
- GT5: TTS v1.0 ship = Piper ONNX via sherpa_onnx. No ElevenLabs/WaveNet shipped.
- GT6: E2EE universal — every tier including Starter.
- GT7: Interactive map + voice nav + motorcycle time free for all tiers.
- GT8: Dock = 14 modules (scrollable). OmniOrb = modes. Separate concerns.
- GT9: Zero paid API deps in POI/search pipeline (Google = kill-switch only).
- GT10: Annual = pay 10 get 12.
- GT11: Trial = 14 Pro + 7 Elite, mandatory interview, mandatory Day 14 phone verify.
- GT12: Health data never synced to cloud, never in Ghost prompts.
- GT13: "Zero-Data-Harvesting Architecture" — never "offline AI".
- GT14: Annual pricing values read from `lib/config/pricing.dart`, never hardcoded.
- GT15: Tier enum = 7 values; trial tiers remap in `FeatureGate`.
Alpha maintains this file. Any new P0 discovery adds a lesson in chronological order. Never delete lessons — lessons are cumulative history.