LESSONS_LEARNED.md — alaivOS Battle-Tested P0 Lessons

Last updated: April 13, 2026 (Omega v2.7)

Status: Canonical. Single source of truth for P0 lessons, forbidden patterns, and Ground Truth (GT) rules.

Supersedes:
  • files/OMEGA_V2_5_SESSION_HANDOVER.md (P0 lesson fragments)
  • files/OMEGA_V2_6_SESSION_HANDOVER.md (TAW9-10 fix lessons)
  • files/OMEGA_V2_7_SESSION_HANDOVER.md (thinking-mode + empty-catch lessons)
  • files/STALE_CONTENT_AUDIT.md (stale-content incidents)
  • files/P0_blank_screen_diagnosis.md, files/P0_clean_slate_rebuild.md
  • files/P0v2_deep_diagnostic_from_masterplan.md, files/P0v2_multi_device_failure.md
  • files/loose_bolts_audit.md, files/seeding_audit.md
  • files/download_failure_report.md
  • files/alpha_* diagnostic docs (APK size, Gemma theory, navigation, overlay, revised diagnostics)
  • files/beta_* diagnostic docs (trace reports, masterplan fixes, circadian background)
  • files/hotfix_*.md (download wakelock, import screen, onboarding keyboard, shared events, table name)
  • files/architect_response_to_alpha_report.md
  • files/masterplan_review_shortcomings.md, files/masterplan_v3_review.md

Digest: CLAUDE.md P0 Lessons section cross-references this file. Any lesson here overrides an older contradictory doc.

Format: Each lesson has Symptom / Root cause / Fix / Where / Discovered.


1. Widget / State / Navigation

1.1 Never switch a Stack child between Positioned and non-Positioned at runtime

  • Symptom: Widget tree blows up with Incorrect use of ParentDataWidget assertion; screen goes blank or throws mid-frame on state change.
  • Root cause: Flutter's RenderStack expects each child's parent-data shape to be stable. Mutating a child from Positioned(...) to a bare Widget (or vice versa) between rebuilds invalidates parent data, crashes render.
  • Fix: Decide Positioned-ness at mount time. Either always wrap in Positioned.fill (and toggle visibility via opacity / IgnorePointer) or keep it always non-Positioned and use a sibling.
  • Where: Home screen Stack overlays, dock/scrim layers, OmniOrb compose stack.
  • Discovered: P0v2 multi-device failure (pre-Omega), re-confirmed in dock scrim refactor.

1.2 KeyedSubtree inside a StatefulWidget's build() preserves State

  • Symptom: Swapping subtrees caused child StatefulWidgets to unintentionally lose scroll / form / controller state.
  • Root cause / Fix: Wrapping a subtree in KeyedSubtree(key: const ValueKey('stable')) stabilizes element identity across rebuilds so State is preserved. Use this (not GlobalKey) when you need cheap identity stability inside a rebuild.
  • Where: Dock module hosts, tabbed module shells.
  • Discovered: alpha_navigation_fix; re-used in Gamma dock refactor.

1.3 A ValueKey on MaterialApp is DESTRUCTIVE

  • Symptom: Rebuild blew away the entire Navigator stack; deep-link, back-button, modal-dismiss state all lost; route animations reset.
  • Root cause: Changing the key on MaterialApp forces Flutter to dispose the whole app element tree, including Navigator.
  • Fix: Never put a ValueKey or any rebuild-changing key on MaterialApp. Scope keys to inner subtrees. Use theme/locale via provider, not key remount.
  • Where: lib/app.dart / root app widget.
  • Discovered: P0 blank screen diagnosis; architect_response_to_alpha_report.

1.4 Riverpod state throttle — max 2/sec for high-frequency updates

  • Symptom: UI jank / dropped frames during model downloads; CPU pinned; battery heat.
  • Root cause: Download progress callbacks fired 60-200/sec, each triggering state = ... and a full dependent-widget rebuild.
  • Fix: Throttle at the notifier to max 2 updates/sec (500 ms window). Also applies to GPS stream consumers and traffic-factor tickers.
  • Where: model_download_provider.dart, gps_stream_provider.dart, traffic factor chip notifiers.
  • Discovered: hotfix_download_wakelock_chunked.
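
A minimal sketch of the throttle described in 1.4, in plain Dart with no Riverpod dependency. ProgressThrottle, its flush() method, and the injectable clock are hypothetical names for illustration only; the real notifier code may be structured differently.

```dart
/// Hypothetical helper: forwards at most one value per [window] to [onEmit].
class ProgressThrottle<T> {
  ProgressThrottle(
    this.onEmit, {
    this.window = const Duration(milliseconds: 500), // 2 updates/sec
    DateTime Function()? now, // injectable clock, an assumption for testability
  }) : _now = now ?? DateTime.now;

  final void Function(T value) onEmit;
  final Duration window;
  final DateTime Function() _now;

  DateTime? _lastEmit;
  T? _pending;
  bool _hasPending = false;

  /// Called for every raw callback (possibly 60-200 times per second).
  void add(T value) {
    final t = _now();
    if (_lastEmit == null || t.difference(_lastEmit!) >= window) {
      _lastEmit = t;
      _hasPending = false;
      onEmit(value);
    } else {
      // Inside the window: remember only the latest value.
      _pending = value;
      _hasPending = true;
    }
  }

  /// Emits the last suppressed value, if any, so a final update is not lost.
  void flush() {
    if (_hasPending) {
      _hasPending = false;
      onEmit(_pending as T);
    }
  }
}
```

In a notifier, onEmit would be the single place that assigns state = ..., and flush() would be called when the download or stream completes so the final value still reaches the UI.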

1.5 Stack child identity — stable keys only

  • Symptom: Overlay flicker / remount on orientation change.
  • Root cause: Anonymous children with no key — Flutter matched by type+index, which shifted when one child became conditional.
  • Fix: Give every conditional Stack child a const ValueKey('...'). Never rely on positional identity alone.
  • Where: Home scrim, map overlay, nav-mode banner.
  • Discovered: alpha_updated_overlay_test.

2. Performance / Heat / Battery

2.1 SHA-256 on large files (1 GB+) blocks the UI thread

  • Symptom: UI frozen 8-30 s after a model download finished; ANR warnings on mid-tier Androids.
  • Root cause: crypto.sha256.convert(bytes) is synchronous; hashing a 3 GB GGUF on the UI isolate blocks render.
  • Fix: Size-match validation (byte length against manifest.json) is the default. SHA-256 is deferred to a background isolate and only runs for first-boot or tampering-check; never blocks download UI.
  • Where: laiv_download_service.dart, model_registry.dart.
  • Discovered: download_failure_report; hotfix_download_wakelock_chunked.
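
A minimal sketch of the default size-match check from 2.1, using only dart:io. The function name and the way the expected byte count is passed in are assumptions; how manifest.json is parsed is not shown, and the deferred SHA-256 pass is left out here.

```dart
import 'dart:io';

/// Hypothetical helper: true when [file] exists and its length equals the
/// byte count recorded in the manifest. Reads file metadata only, never the
/// file contents, so it stays cheap even for multi-GB models.
Future<bool> sizeMatchesManifest(File file, int expectedBytes) async {
  if (!await file.exists()) return false;
  return await file.length() == expectedBytes;
}
```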

2.2 Debug APK runs 3-5× hotter than release

  • Symptom: Pixel 7 Pro thermal-throttled within 3 min on debug builds; users reported "phone burning."
  • Root cause: Debug builds run on the JIT Dart VM with assertions and the VM service (Observatory) enabled, and skip AOT compilation and tree-shaking.
  • Fix: Always test heat / battery on flutter build apk --release (or --profile). Debug timings are meaningless for thermal QA.
  • Where: Release checklist, device-QA docs.
  • Discovered: P0 clean slate rebuild.

2.3 Lazy module loading — modules init on first dock tap, not at startup

  • Symptom: 7-12 s cold-start, 800 MB peak RAM at boot on mid-tier devices.
  • Root cause: Every module (Map, Money, Health, People, Events, Notes, Chat, Streams, Laiv, ...) was initialized at main().
  • Fix: ModuleLoader.ensureLoaded(moduleId) on first dock tap. AMI loads AI only on first Laiv interaction, behind splash/home animation.
  • Where: module_loader.dart, dock tap handlers, ami_service.dart.
  • Discovered: Omega v1 cold-start sprint.

2.4 GPS discipline — 5 modes (idle / significantChange / mapExplore / navigating / paused)

  • Symptom: 18-40% battery/hr drain with map open in background.
  • Root cause: Always-on high-accuracy GPS stream.
  • Fix: Centralized GpsManager with 5 explicit modes. idle = off, significantChange = ~500 m coarse, mapExplore = 10 m medium, navigating = 1 Hz high, paused = frozen.
  • Where: lib/core/location/gps_manager.dart.
  • Discovered: P0v2 deep diagnostic.
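
One way to encode the five modes from 2.4 as data, sketched in plain Dart. GpsSettings and GpsAccuracy are hypothetical types, and treating paused as "no new fixes requested" is an assumption; the actual GpsManager may model this differently.

```dart
/// Hypothetical accuracy levels matching the wording of the lesson.
enum GpsAccuracy { coarse, medium, high }

/// Hypothetical per-mode settings a location plugin could be configured from.
class GpsSettings {
  const GpsSettings({
    required this.streaming,
    this.accuracy,
    this.distanceFilterMeters,
    this.interval,
  });

  final bool streaming; // false means no position stream is active
  final GpsAccuracy? accuracy;
  final int? distanceFilterMeters;
  final Duration? interval;
}

enum GpsMode {
  idle(GpsSettings(streaming: false)),
  significantChange(GpsSettings(
      streaming: true, accuracy: GpsAccuracy.coarse, distanceFilterMeters: 500)),
  mapExplore(GpsSettings(
      streaming: true, accuracy: GpsAccuracy.medium, distanceFilterMeters: 10)),
  navigating(GpsSettings(
      streaming: true, accuracy: GpsAccuracy.high, interval: Duration(seconds: 1))),
  // Assumption: "frozen" means the last fix is kept and no new fixes are requested.
  paused(GpsSettings(streaming: false));

  const GpsMode(this.settings);
  final GpsSettings settings;
}
```

Keeping the settings on the enum means a new mode cannot be added without deciding its accuracy and filter explicitly.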

3. Data / SQLite

3.1 database is locked (code 5) — all writes MUST go through serializedWrite()

  • Symptom: Random night-window crashes around 02:00-04:00 local; SQLite error code 5 in logs; embedding and ghost-credit writes aborted mid-batch.
  • Root cause: embedding_repository.dart and ghost_credit_service.dart bypassed the serializer and opened parallel write transactions during 50+ concurrent night batch writes (embedding backfill + credit reconciliation overlap).
  • Fix: Routed both through the central serializedWrite() mutex queue. Any new writer must use it; direct db.insert/db.update from multiple isolates is banned.
  • Where: lib/core/db/write_serializer.dart, callers in embedding_repository.dart, ghost_credit_service.dart.
  • Discovered: TAW9 (Omega v2.6).
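
A minimal sketch of a mutex-style write queue like the one 3.1 describes, built on Future chaining. This is an illustration under assumptions, not the contents of write_serializer.dart; the class name and signature are hypothetical, and it serializes writers within one isolate only.

```dart
import 'dart:async';

/// Hypothetical serializer: runs write actions strictly one after another.
class WriteSerializer {
  // Tail of the queue; each new write waits for this to complete.
  Future<void> _tail = Future<void>.value();

  Future<T> serializedWrite<T>(Future<T> Function() action) {
    final result = _tail.then<T>((_) => action());
    // The queue must keep moving even if this write throws, so the tail
    // swallows the error; the caller still sees it through [result].
    _tail = result.then<void>((_) {}, onError: (Object _) {});
    return result;
  }
}
```

A caller would wrap its transaction, for example serializer.serializedWrite(() => db.insert(...)), instead of calling db.insert directly.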

3.2 Fresh-install / data-wipe onboarding loop

  • Symptom: After clearing app data, signed-in user with a server profile was forced through the 45-question interview again.
  • Root cause: Local DB had no profile row; onboarding gate checked local only.
  • Fix: After login, query Supabase user_profiles via RLS-fallback query (treat RLS-denied as "exists elsewhere, retry"). If display_name exists, skip onboarding and restore locally.
  • Where: onboarding_gate.dart, user_profile_repository.dart.
  • Discovered: TAW10 device-testing bugs (J's Pixel 7 Pro, V2_6 session).

3.3 Fake seed data removal (V2_5)

  • Symptom: 47 placeholder rows ("Sample Event", "Demo Contact") shipped to users.
  • Root cause: Dev-time seeds not gated by kDebugMode.
  • Fix: All 47 removed; seeding utility now refuses to run in release.
  • Where: seed_data.dart, migration 0042.
  • Discovered: seeding_audit (Omega v2.5).

4. Auth / Onboarding

4.1 "Username always taken" bug

  • Symptom: Every username (even random unique ones) reported as taken during signup.
  • Root cause: The username-availability probe's catch { return false; } treated any error — including Supabase RLS denials and network blips — as "taken."
  • Fix: Catch block now returns true (treat unknown errors as "not taken, try it"). Real collisions are caught definitively at INSERT time by the unique constraint, with a clean UX error there.
  • Where: username_service.dart isAvailable().
  • Discovered: TAW10 BUG 1 (Omega v2.6).
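
The corrected control flow from 4.1 can be sketched as follows. The RowExistsProbe callback stands in for the Supabase query and is hypothetical, as is the onError hook; only the "unknown error means not taken" behavior comes from the lesson.

```dart
/// Hypothetical probe: resolves to true when a row with this username exists.
typedef RowExistsProbe = Future<bool> Function(String username);

Future<bool> isAvailable(
  String username,
  RowExistsProbe rowExists, {
  void Function(Object error)? onError, // e.g. guarded logging
}) async {
  try {
    return !(await rowExists(username));
  } catch (e) {
    onError?.call(e);
    // RLS denials and network blips are not evidence of a collision.
    // The unique constraint at INSERT time is the definitive check.
    return true;
  }
}
```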

4.2 Onboarding keyboard covering submit button

  • Symptom: On short devices, user couldn't tap Continue because keyboard obscured it.
  • Root cause: Screen used fixed layout; no scroll.
  • Fix: Wrapped body in SingleChildScrollView, with resizeToAvoidBottomInset: true on the enclosing Scaffold, and pinned the submit button to the bottom.
  • Where: onboarding_question_screen.dart.
  • Discovered: hotfix_onboarding_keyboard.

5. Maps / POI / Tiles

5.1 POI labels cluttering the map at mid-zoom

  • Symptom: At z14-15 the map was unreadable — cards stacked, labels overlapped, frame rate dropped.
  • Root cause: Full glass POI cards rendered at every zoom level.
  • Fix: Glass pin ladder — dots at z13-15, mini pins at z16, detail cards at z17+. Reduces draw calls and restores legibility.
  • Where: poi_marker_layer.dart, poi_render_policy.dart.
  • Discovered: TAW10 (Omega v2.6).
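
The pin ladder in 5.1 reduces to a small zoom-to-style function. A sketch, with hypothetical names; hiding markers below z13 and the handling of fractional zoom levels are assumptions, since the lesson only specifies z13 and up.

```dart
enum PoiMarkerStyle { hidden, dot, miniPin, detailCard }

/// Hypothetical render policy: picks the cheapest marker for the zoom level.
PoiMarkerStyle markerStyleForZoom(double zoom) {
  if (zoom >= 17) return PoiMarkerStyle.detailCard;
  if (zoom >= 16) return PoiMarkerStyle.miniPin;
  if (zoom >= 13) return PoiMarkerStyle.dot;
  return PoiMarkerStyle.hidden; // assumption: nothing is drawn below z13
}
```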

5.2 Tile cache looked broken — wasn't

  • Symptom: Users reported tiles "re-downloading every session."
  • Root cause: FMTC init was correct; perceived slowness came from the POI overlay's render complexity consuming the frame budget before tiles could paint.
  • Fix: Don't touch FMTC. Fixing POI render (5.1) restored perceived tile speed.
  • Where: fmtc_init.dart (unchanged).
  • Discovered: TAW10.

5.3 OSRM coordinate order — longitude,latitude (NOT lat,lng)

  • Symptom: Routes drawn across the ocean; ETA nonsense.
  • Root cause: OSRM URLs expect lon,lat pairs; Flutter / Dart convention is lat,lng.
  • Fix: All OSRM URL builders format as ${lng},${lat}. Helper OsrmCoord.of(LatLng) enforces this. Unit test in osrm_builder_test.dart pins the order.
  • Where: osrm_client.dart.
  • Discovered: Early nav sprint; re-asserted in alpha_navigation_fix.
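
A self-contained sketch of the ordering rule in 5.3. The LatLng class below is a stand-in for whatever map package type the app uses, and osrmCoord, buildRouteUri, and the host name are hypothetical; the /route/v1/{profile}/{coordinates} path shape follows the public OSRM HTTP API.

```dart
/// Stand-in for a map package's LatLng type (latitude first, as usual in Dart).
class LatLng {
  const LatLng(this.latitude, this.longitude);
  final double latitude;
  final double longitude;
}

/// OSRM wants "longitude,latitude", the reverse of the constructor order.
String osrmCoord(LatLng p) => '${p.longitude},${p.latitude}';

/// Builds a route request; waypoints are joined with ';' as OSRM expects.
Uri buildRouteUri(String host, List<LatLng> waypoints,
    {String profile = 'driving'}) {
  final coords = waypoints.map(osrmCoord).join(';');
  return Uri.parse('https://$host/route/v1/$profile/$coords?overview=full');
}
```

Funnelling every URL through one formatter is what lets a single unit test pin the order.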

5.4 Nominatim requires User-Agent: alaivOS/1.0

  • Symptom: 403 Forbidden from nominatim.openstreetmap.org.
  • Root cause: Nominatim ToS requires an identifying UA; default Dart UA is blocked.
  • Fix: All Nominatim HTTP calls set User-Agent: alaivOS/1.0 and include a contact fallback.
  • Where: nominatim_client.dart.
  • Discovered: Early geocoding sprint.
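
A small sketch of setting the identifying User-Agent from 5.4 with dart:io. The function names and the query parameters chosen are assumptions; nominatim_client.dart may use a different HTTP package, and the contact fallback is not shown.

```dart
import 'dart:io';

/// Hypothetical URI builder for a Nominatim free-text search.
Uri nominatimSearchUri(String query) => Uri.https(
      'nominatim.openstreetmap.org',
      '/search',
      {'q': query, 'format': 'jsonv2', 'limit': '1'},
    );

/// Every request made through this client carries the identifying UA
/// instead of the default Dart one.
HttpClient newNominatimClient() => HttpClient()..userAgent = 'alaivOS/1.0';
```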

5.5 Traffic factor chips — minutes, not percentages

  • Symptom: Users didn't understand "+18% traffic" chip.
  • Root cause: Engineers displayed the raw multiplier.
  • Fix: Chips render as minutes: "Rain expected — adds ~8 min", "Holiday traffic — adds ~12 min". Five-layer composite ETA (baseline_spline × live_calibration × weather × calendar × event) computes delta internally; UI shows human minutes.
  • Where: traffic_factor_chip.dart, traffic_intelligence/.
  • Discovered: Omega v2.4 UX pass.
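
The arithmetic behind 5.5 can be sketched as below. It assumes the baseline is expressed in minutes and that each layer is a plain multiplier (1.0 means no effect); attributing a factor's delta against the baseline alone is a simplification, and all function names are hypothetical.

```dart
/// Five-layer composite ETA in minutes; each factor defaults to "no effect".
double compositeEtaMinutes({
  required double baselineSpline,
  double liveCalibration = 1,
  double weather = 1,
  double calendar = 1,
  double event = 1,
}) =>
    baselineSpline * liveCalibration * weather * calendar * event;

/// Minutes a single factor adds on top of the baseline, rounded for display.
int addedMinutes(double baselineMinutes, double factor) =>
    (baselineMinutes * (factor - 1)).round();

/// The human-readable part of the chip; the reason text comes from l10n.
String chipSuffix(int minutes) => 'adds ~$minutes min';
```

For a 45-minute baseline and a weather factor of 1.18, addedMinutes gives 8, so the chip reads "adds ~8 min" rather than "+18%".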

6. Networking / Streams / External APIs

6.1 Radio stream playback failures

  • Symptom: Tapping a radio station produced no audio; silent fail.
  • Root cause: Player hit playlist/redirect URLs (.pls, .m3u) directly without resolving content-type, and some servers required a UA header.
  • Fix: Radio pipeline now does HEAD request → inspect content-type → resolve actual stream URL, sets User-Agent header, logs every failure with station id + status.
  • Where: radio_stream_resolver.dart.
  • Discovered: TAW10 (Omega v2.6).
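
A sketch of the "resolve the actual stream URL" step from 6.1, in plain Dart. It covers only the parsing half: given a content-type and a downloaded playlist body, it finds the first playable URL. The function names are hypothetical and the list of playlist content-types is an assumption, not an exhaustive set.

```dart
import 'dart:convert';

/// Content-types that indicate a playlist wrapper rather than audio data.
bool isPlaylistContentType(String? contentType) {
  final type = (contentType ?? '').split(';').first.trim().toLowerCase();
  return const {'audio/x-scpls', 'audio/x-mpegurl', 'audio/mpegurl'}
      .contains(type);
}

final RegExp _plsEntry = RegExp(r'^File\d+=(.+)$', caseSensitive: false);

/// Returns the first http(s) URL in a .pls or .m3u body, or null if none.
String? firstStreamUrl(String playlistBody) {
  for (final raw in const LineSplitter().convert(playlistBody)) {
    final line = raw.trim();
    if (line.isEmpty || line.startsWith('#')) continue; // blanks, M3U comments
    final pls = _plsEntry.firstMatch(line);
    final candidate = pls != null ? pls.group(1)!.trim() : line;
    final uri = Uri.tryParse(candidate);
    if (uri != null && (uri.scheme == 'http' || uri.scheme == 'https')) {
      return candidate;
    }
  }
  return null;
}
```

A null result is a resolution failure and should be logged with the station id rather than passed to the player.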

6.2 Podcast discovery: Top Podcasts feed, not keyword search

  • Symptom: Searching "popular podcasts" returned shows with the word "popular" in the title.
  • Root cause: iTunes Search API keyword endpoint was used as a discovery proxy.
  • Fix: Swapped to the iTunes Top Podcasts feed for the default / trending surface. Keyword search retained only for explicit user queries.
  • Where: podcast_discovery_service.dart.
  • Discovered: TAW10.

6.3 Weather / AQ from CDN, NOT live Open-Meteo

  • Symptom: Open-Meteo rate limits tripped; stale-looking weather tiles.
  • Root cause: Every open of the map queried Open-Meteo directly.
  • Fix: Weather and AQ come from the pipeline CDN (cdn.alaivos.com) — ≤30 min stale, prewarmed for 290 cities. Only RainViewer radar tiles need live net.
  • Where: weather_repository.dart, aq_repository.dart.
  • Discovered: P0v2; re-confirmed in STALE_CONTENT_AUDIT.

7. Localization / UI Primitives

7.1 Hardcoded English strings

  • Symptom: Shipped English text in ES/PT builds; GT violation.
  • Root cause: Devs typed user-visible text inline.
  • Fix: Every user-visible string goes through l10n (ARB × 21 locales, ~7,300 EN keys). Lint rule scans for string literals inside Text(...), Tooltip(...), SnackBar(...). Forbidden to hardcode.
  • Where: lib/l10n/*.arb, lint rule no_hardcoded_strings.
  • Discovered: Omega v2.3 localization audit.

7.2 withOpacity is FORBIDDEN — use withValues(alpha: X)

  • Symptom: Precision loss warnings (Flutter 3.27+); GT violation in audit.
  • Root cause: Legacy API, deprecated.
  • Fix: Codebase-wide swap to .withValues(alpha: X). Pre-commit blocks any new withOpacity.
  • Where: All UI files.
  • Discovered: Omega v2.4.

7.3 debugPrint must be guarded by kDebugMode

  • Symptom: Production logs polluted with debug spam; trivial PII leak risk; test harness tripped on unexpected output.
  • Root cause: Bare debugPrint(...) calls.
  • Fix: Every debugPrint is wrapped:
    if (kDebugMode) {
      debugPrint('...');
    }
    
    Multi-line form is required — a one-liner if (kDebugMode) debugPrint(...) breaks test-time string matching. Lint rule enforced. Unguarded count = 0.
  • Where: All files; lint debug_print_guarded.
  • Discovered: Omega v2.5.

8. Brand / Legal / Privacy

8.1 "Offline AI" is a FORBIDDEN phrase

  • Symptom: Marketing copy said "offline AI" — inaccurate (Ghost is online) and weak positioning.
  • Root cause: Engineer shorthand leaked into user-visible strings.
  • Fix: Canonical framing is "Zero-Data-Harvesting Architecture". "Local Supremacy" is secondary. "Offline AI" is banned in ARB, website, store listings, marketing.
  • Where: ARB files, WEBSITE_SPEC.md, store copy.
  • Discovered: Omega v2.2 brand pass.

8.2 Learning sources: approved legal list only

  • Symptom: Early learning-module wireframes linked to anna-archive, oceanofpdf, pdfdrive.
  • Root cause: Dev pulled first search results.
  • Fix: Approved list only — Books: gutenberg.org, openlibrary.org, standardebooks.org, manybooks.net. Courses: classcentral.com, khanacademy.org, theodinproject.com, openculture.com, alison.com. Piracy domains banned — Apple/Google would reject the app.
  • Where: learning_sources.dart, LEGAL_AND_PRIVACY.md.
  • Discovered: legal review pre-submission.

8.3 Health data never leaves device

  • Symptom: N/A — preventive rule.
  • Root cause: Would violate Local Supremacy + HIPAA-adjacent sensitivities.
  • Fix: Health data is never synced to cloud, never in AI prompts sent to Ghost. Enforced in prompt_assembler.dart by exclusion filter + unit test.
  • Where: prompt_assembler.dart, health_repository.dart.
  • Discovered: LOCKED decision pre-Omega.

9. Models / AI / AMI

9.1 AiProvider enum pruned to {local, ghost} only

  • Symptom: Dead code paths for gemini, openai, claude still compiled; confusing contributors.
  • Root cause: Cloud Gemini was dropped but enum values remained.
  • Fix: Enum reduced to local and ghost. All switches exhaustive. Gemini/OpenAI/Claude SDKs removed from pubspec.
  • Where: ai_provider.dart, ai_router.dart.
  • Discovered: Omega v2.5.

9.2 Thinking mode OFF by default on Gemma 4 Instant

  • Symptom: 25-35 s latency spikes on simple Ghost requests.
  • Root cause: Gemma 4's thinking mode was defaulted on; for instant short replies (Smart Flutter Responses, single-turn Laiv) it added a long chain-of-thought preamble.
  • Fix: thinking: false is the default for Ghost Instant. Turned on only for explicit deep tasks (plans, multi-step executor, research). Configurable per-skill via SkillRouter.
  • Where: ghost_client.dart, skill_router.dart.
  • Discovered: Omega v2.7.

9.3 Never two models loaded at once

  • Symptom: OOM / swap thrash on mid-tier Androids.
  • Root cause: AMI briefly held both an old and new tier during a swap.
  • Fix: AMI unloads current model before loading the next. One model at a time. Zero always-resident. J explicitly rejected "0.8B always loaded."
  • Where: adaptive_model_manager.dart.
  • Discovered: alpha_apk_size_and_gemma_theory; re-asserted in LOCKED decisions.
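
The unload-before-load rule in 9.3 can be sketched as a single slot with a serialized swap. ModelHandle, SingleModelSlot, and RecordingModel are hypothetical names for illustration and do not describe the actual adaptive_model_manager.dart API.

```dart
import 'dart:async';

/// Hypothetical handle for a loadable model tier.
abstract class ModelHandle {
  String get id;
  Future<void> load();
  Future<void> unload();
}

/// Holds at most one loaded model; swaps are queued so two never overlap.
class SingleModelSlot {
  ModelHandle? _current;
  Future<void> _queue = Future<void>.value();

  String? get currentId => _current?.id;

  Future<void> swapTo(ModelHandle next) {
    final done = _queue.then<void>((_) async {
      final old = _current;
      _current = null;
      if (old != null) await old.unload(); // free memory first
      await next.load(); // only then load the new tier
      _current = next;
    });
    // Keep the queue usable even if a load or unload throws.
    _queue = done.then<void>((_) {}, onError: (Object _) {});
    return done;
  }
}

/// Small double that records lifecycle calls, for demonstration.
class RecordingModel implements ModelHandle {
  RecordingModel(this.id, this.log);
  @override
  final String id;
  final List<String> log;

  @override
  Future<void> load() async {
    log.add('load $id');
  }

  @override
  Future<void> unload() async {
    log.add('unload $id');
  }
}
```

Two back-to-back swapTo calls then produce load, unload, load in that order, so peak memory never includes both tiers.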

9.4 Smart Flutter Responses bridge the model-load gap

  • Symptom: Users tapped notifications and waited 3-6 s for Laiv while the model loaded → perceived unresponsiveness.
  • Root cause: Notification tap was gated on AI.
  • Fix: 10 data-driven notification handlers (smart_flutter_response.dart + laiv_message_queue.dart) respond instantly from SQLite. If the user replies, real AI picks up (model now loaded in background).
  • Where: smart_flutter_response.dart, laiv_message_queue.dart.
  • Discovered: TAW10 (Omega v2.6).

10. Lint / Ground Truth (GT) Violations

10.1 Empty catch blocks (silent failures)

  • Symptom: Audit flagged ~200 items in TAW8: try { ... } catch (_) {} that swallowed errors.
  • Root cause: Copy-paste defensiveness.
  • Fix: Every catch must either (a) rethrow, (b) log via guarded debugPrint, or (c) have an inline justification comment (// ignore: swallowed — <reason>). Sweep completed in TAW8 per Alpha; still pending Delta verification post-APK.
  • Where: codebase-wide.
  • Discovered: TAW8.

10.2 GT violation count = 0 (current)

  • Symptom: None — invariant.
  • Rule: dart analyze must report 0 errors, 0 warnings, 0 GT violations. Infos allowed (~310). Tests must add, never replace. Count only goes UP.
  • Where: CI / pre-commit.
  • Discovered: Project constitution (CLAUDE.md).

Forbidden Patterns (Quick Reference)

| Forbidden | Use Instead |
| --- | --- |
| withOpacity(x) | withValues(alpha: x) |
| Hardcoded English in UI | l10n.<key> |
| Bare debugPrint(...) | if (kDebugMode) { debugPrint(...); } |
| ValueKey on MaterialApp | Scope keys to subtrees only |
| Stack child switching Positioned ↔ non-Positioned | Fix Positioned-ness at mount |
| Direct db.insert from multi-writer paths | serializedWrite(...) |
| Live Open-Meteo on map open | CDN repository (weather_repository) |
| SHA-256 on 1 GB+ file on UI thread | Size-match + deferred hash in isolate |
| Debug APK for heat QA | Release / profile build |
| "Offline AI" phrase | "Zero-Data-Harvesting Architecture" |
| Piracy learning sources | Curated legal list only |
| AiProvider.gemini / openai / claude | AiProvider.local or AiProvider.ghost |
| Always-resident on-device model | AMI dynamic load/unload, one at a time |
| OSRM lat,lng URLs | lng,lat (longitude first) |
| Nominatim without UA | User-Agent: alaivOS/1.0 |
| Traffic chips in % | Minutes ("adds ~8 min") |
| POI full cards at z14-15 | Glass pins: dots z13-15, mini z16, detail z17+ |
| setState for complex logic | Riverpod ref.watch |
| PowerSync | Direct SQLite |
| Module init at startup | Lazy via ModuleLoader.ensureLoaded() |
| Always-on high-accuracy GPS | GpsManager 5-mode discipline |
| Gemma thinking-on default | thinking: false for Instant paths |

Pre-Commit Checklist

  1. dart analyze — 0 errors, 0 warnings, 0 GT violations.
  2. flutter test — all passing; count has gone UP, not down.
  3. Grep for withOpacity — 0 hits.
  4. Grep for unguarded debugPrint — 0 hits.
  5. Grep for hardcoded strings in Text( / Tooltip( / SnackBar( — 0 hits.
  6. Grep for "offline AI" — 0 hits (case-insensitive).
  7. New writer? Confirm it goes through serializedWrite().
  8. New notification? Confirm Smart Flutter Response handler exists.
  9. New external API? Confirm it's not a paid dep (zero-paid-API rule).
  10. Heat test on release APK if performance-sensitive path touched.

GT Audit Rules

  • GT1: Credits are the only Ghost gate — no capability gating.
  • GT2: AiProvider has exactly 2 values: local, ghost.
  • GT3: On-device = Qwen 3.5 only. Ghost = Gemma 4 E4B only.
  • GT4: One model loaded at a time. Never two. Never always-resident.
  • GT5: TTS v1.0 ship = Piper ONNX via sherpa_onnx. No ElevenLabs/WaveNet shipped.
  • GT6: E2EE universal — every tier including Starter.
  • GT7: Interactive map + voice nav + motorcycle time free for all tiers.
  • GT8: Dock = 14 modules (scrollable). OmniOrb = modes. Separate concerns.
  • GT9: Zero paid API deps in POI/search pipeline (Google = kill-switch only).
  • GT10: Annual = pay 10 get 12.
  • GT11: Trial = 14 Pro + 7 Elite, mandatory interview, mandatory Day 14 phone verify.
  • GT12: Health data never synced to cloud, never in Ghost prompts.
  • GT13: "Zero-Data-Harvesting Architecture" — never "offline AI".
  • GT14: Annual pricing values read from lib/config/pricing.dart, never hardcoded.
  • GT15: Tier enum = 7 values; trial tiers remap in FeatureGate.

Alpha maintains this file. Any new P0 discovery adds a lesson in chronological order. Never delete lessons — lessons are cumulative history.