feat(mobile): native background runtime + three-bundle split + cold-start overhaul#10969
huhuanming wants to merge 348 commits into x
Two fixes for "Requiring unknown module" in three-bundle mode:
1. Per-runtime segment filter: each runtime's eager moduleFilter now only excludes its OWN segment paths (not the cross-runtime union).
2. expandSegmentsWithSyncDeps: after grouping serialized entries by segment, walk sync dep edges and add missing deps that are not in the eager bundle or any other segment. Each dep is added to exactly one segment (first-come) to avoid duplication.
Expanded: main +8924, background +6603 sync deps added to segments. Module 12873 (bitcoinjs-lib/address.cjs) now defined in both background.bundle (eager) and Send segment (async).
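The expansion step in fix 2 can be sketched as a worklist walk over sync edges. This is a minimal illustration only: the `Map`/`Set` shapes, the parameter names, and the sample module ids are assumptions, since the real serializer's data structures are not shown in this PR.

```typescript
// Hypothetical shapes; the real unionBuild structures are not shown here.
type ModuleId = number;

function expandSegmentsWithSyncDeps(
  segments: Map<string, Set<ModuleId>>,
  syncDeps: Map<ModuleId, ModuleId[]>,
  eager: Set<ModuleId>,
): Map<string, Set<ModuleId>> {
  // Every module already owned by a segment; used to enforce "exactly one segment".
  const owned = new Set<ModuleId>();
  for (const ids of segments.values()) {
    for (const id of ids) owned.add(id);
  }

  for (const ids of segments.values()) {
    const queue: ModuleId[] = [...ids];
    while (queue.length > 0) {
      const id = queue.pop() as ModuleId;
      for (const dep of syncDeps.get(id) ?? []) {
        // Skip deps served by the eager bundle or claimed by an earlier segment.
        if (eager.has(dep) || owned.has(dep)) continue;
        owned.add(dep); // first-come: this segment becomes the owner
        ids.add(dep);
        queue.push(dep); // keep walking transitive sync deps
      }
    }
  }
  return segments;
}

// Usage with made-up ids: module 1 is eager, 30 is reachable from both segments.
const segments = new Map<string, Set<ModuleId>>([
  ['seg:Send', new Set([10])],
  ['seg:Swap', new Set([20])],
]);
const syncDeps = new Map<ModuleId, ModuleId[]>([
  [10, [1, 30]],
  [20, [30, 31]],
  [30, [32]],
]);
const expanded = expandSegmentsWithSyncDeps(segments, syncDeps, new Set([1]));
// seg:Send -> {10, 30, 32}; seg:Swap -> {20, 31}
```

Because ownership is first-come, the result depends on segment iteration order, which is the property a later commit in this PR replaces with dedicated shared segments.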
- Add ensureTransportReady() to setupMainThreadBackgroundRunner that polls until transport reaches 'ready' state with timeout
- BackgroundApiProxyBase now awaits transport.ensureReady() before dispatching RPC calls, replacing synchronous isEnabled() check
- Improved installProdBundleLoader with runtime access control and tests
- Add startup chain diagnostics across GlobalJotaiReady, SplashProvider, ThemeProvider, NavigationContainer, AppIntlProvider, SupabaseAuthProvider
- iOS: enable ENABLE_NATIVE_BACKGROUND_THREAD in Info.plist, update AppDelegate and Xcode project config
…l-thread mode
In dual-thread mode, webEmbedBridge is never set in the background thread because the JsBridge object cannot be serialized over SharedRPC. This caused callWebEmbedApiProxy to hang waiting for a bridge that never arrives, breaking wallet creation with a transport timeout.
Add a reverse RPC channel (bg→main) so the background thread can request the main thread to perform webEmbed bridge calls and return results via SharedRPC.
Add NativeLogger diagnostics across the webembed bridge chain: WebView source resolution, bridge connect, transport sync, API readiness, and bridge call entry/result. Helps trace webembed failures in release builds.
When the same parent file has both `import { x } from 'mod'` (sync) and
`await import('mod')` (async), the serializer correctly classifies the
module as eager. But Metro still emits __loadBundleAsync for the async
edge, which fails because the module has no segment manifest entry.
- Runtime: resolve silently when segment key is not in manifest, since
the module is already available in the eager bundle
- Build: emit warnings listing all mixed sync+async import pairs so they
can be cleaned up over time
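Both halves of this fix can be sketched with pure functions. The `ImportEdge` shape, the function names, and the manifest layout below are assumptions for illustration and are not taken from the actual serializer or loader.

```typescript
// Hypothetical edge shape: one record per import statement in a parent file.
interface ImportEdge {
  parent: string;
  target: string;
  kind: 'sync' | 'async';
}

// Build side: list every (parent, target) pair imported both statically and dynamically.
function findMixedImportPairs(edges: ImportEdge[]): string[] {
  const kinds = new Map<string, Set<'sync' | 'async'>>();
  for (const edge of edges) {
    const key = `${edge.parent} -> ${edge.target}`;
    const seen = kinds.get(key) ?? new Set<'sync' | 'async'>();
    seen.add(edge.kind);
    kinds.set(key, seen);
  }
  return [...kinds.entries()]
    .filter(([, seen]) => seen.size === 2)
    .map(([key]) => key)
    .sort();
}

// Runtime side: a segment key absent from the manifest is treated as
// "already in the eager bundle", so the loader resolves instead of failing.
function planSegmentLoad(
  segmentKey: string,
  manifest: Record<string, { path: string }>,
): { action: 'noop' } | { action: 'load'; path: string } {
  const entry = manifest[segmentKey];
  return entry ? { action: 'load', path: entry.path } : { action: 'noop' };
}
```

Returning a plan object instead of performing the load keeps the decision testable without a native bundle loader.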
73e8398 to 0292eea
The assetResolutionPatch fixes the ../→_ path mismatch between Metro's httpServerLocation in JS and the actual file layout on disk. Previously only activated for OTA bundle updates. Now also applies in regular release builds (including split-bundle mode) where the same mismatch causes images and Lottie animations to not display.
…meout
Two issues with DApp connection in dual-thread mode:
1. Navigation relay: ServiceDApp.openModal runs in the background thread where $navigationRef is unavailable. When no navigation ref exists, emit NavigateModalFromBackgroundThread event so the main thread's BackgroundNavigationRelay listener performs the actual navigation.
2. Bridge-call timeout: bridge-calls (DApp requests) may wait for user interaction (connection modal, tx signing). Use 5-minute timeout instead of 30s, and do NOT break the transport on bridge-call timeout since it only means the user didn't respond, not that the bg thread is dead.
…tion event
Move @reown/appkit-* and @walletconnect/* out of main startup by converting GlobalWalletConnectModalContainer to event-driven lazy mount. The heavy SDK (~350 modules) is only loaded when the first WalletConnectOpenModal event fires.
- GlobalWalletConnectModalContainer: render nothing until WC event, then lazy-load WalletConnectModalContainer and replay the buffered event
- AppStateContainer (Android): replace direct WalletConnectModal hook import with event bus emission to close the modal
- check-startup-graph-budget: budget 18500→7700 modules, 50→34 MB; add FORBIDDEN_NPM_IN_MAIN (@KeystoneHQ, @reown, @bufbuild/protobuf) and FORBIDDEN_NPM_IN_COMMON (viem) checks against allocation reports
- check-bundle-architecture: budget common 6500→4700, main 4000→7700, bg 4000→10000; add same forbidden npm package checks
- startup-graph-budget.yml: run unionBuild first to produce allocation reports, then check both main and background entries plus architecture
Reproduces the qr-wallet-sdk scenario where a module becomes an ASYNC_DESC (sync child of a Gallery segment) after its sync import is replaced with await import(). The test verifies that expandSegmentsWithSyncDeps correctly pulls transitive dependencies into the segment. Note: the test passes because expandSegmentsWithSyncDeps works correctly in isolation. The actual production issue is that Metro's baseJSBundle does not include the transitive deps in serializedEntries, so expandSegmentsWithSyncDeps never sees them. This requires a deeper fix in unionBuild's serialization pipeline.
…allocation
After Step 2's conservative rescan, modules with at least one unresolved parent stay unclassified, even when all their resolved parents are eager. This creates cascading dead-locks where shared modules (e.g. @babel/runtime, react-native core) block thousands of transitive dependents from being classified.
Before: 9135/17755 (51%) modules unclassified in main graph, causing 6572 orphaned modules at runtime (exist in Metro graph but not in any bundle or segment).
Add a Step 3 fallback pass that breaks the dead-lock:
- If any resolved parent is eager → classify as eager (safe default)
- If all resolved parents are segments → assign to first parent's segment
- Final sweep: force any remaining unclassified modules to eager
After: 0 unclassified modules, 0 orphaned module warnings.
…and break dep cycles
Two fixes in buildSegmentAllocation:
1. ASYNC_ROOT promotion: when a module has both direct async import() parents AND sync parents from existing async descendants, promote it to its own ASYNC_ROOT instead of attaching as descendant. Without this, qr-wallet-sdk was lumped into the QRWalletGallery dev segment, causing runtime load failures when production code tried to import it.
2. Circular dependency removal in buildSegmentDeps: when segment A depends on B and B depends on A, break the cycle by removing the reverse edge. The runtime loader's cycle detection throws SegmentLoadError which gets silently swallowed by Suspense, causing blank pages (e.g. Perp ↔ SetTpslModal cycle blocked Home loading).
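The cycle-breaking rule in fix 2 can be sketched as follows. Which of the two edges counts as "reverse" is not stated in the commit message; this sketch assumes segments are visited in sorted key order and the first edge encountered is kept. It only handles direct two-segment cycles, as described above.

```typescript
// Minimal sketch: remove one edge of every direct A <-> B cycle.
// Returns the removed edges as [from, to] pairs for logging.
function breakMutualSegmentDeps(
  deps: Map<string, Set<string>>,
): Array<[string, string]> {
  const removed: Array<[string, string]> = [];
  for (const from of [...deps.keys()].sort()) {
    for (const to of [...(deps.get(from) ?? [])].sort()) {
      const reverse = deps.get(to);
      if (to !== from && reverse?.has(from)) {
        reverse.delete(from); // keep from -> to, drop to -> from
        removed.push([to, from]);
      }
    }
  }
  return removed;
}

// Usage with the cycle named in the commit message (segment keys are illustrative).
const segmentDeps = new Map<string, Set<string>>([
  ['seg:Home', new Set(['seg:Perp'])],
  ['seg:Perp', new Set(['seg:SetTpslModal'])],
  ['seg:SetTpslModal', new Set(['seg:Perp'])],
]);
const removedEdges = breakMutualSegmentDeps(segmentDeps);
// removedEdges -> [['seg:SetTpslModal', 'seg:Perp']]
```

Sorting the keys makes the choice of surviving edge deterministic across builds, independent of Map insertion order.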
…ction error logging
Move @onekeyhq/qr-wallet-sdk from sync to dynamic import() in three files, removing 690 modules (@KeystoneHQ, @bufbuild/protobuf, crypto libs) from common and main eager bundles:
- components/QRCode: dynamic import in animated QR useEffect
- SecureQRToast: dynamic import in debug onPress callbacks
- useCreateQrWallet: dynamic import in createQrWallet callback
Also add production-grade native error logging:
- installProdBundleLoader: log segment load failures via NativeLogger
- ErrorBoundary: log caught component errors via NativeLogger
- LazyLoad: log lazy component load failures via NativeLogger
- index.ts: install global ErrorUtils handler for uncaught JS errors
Add FORBIDDEN_IN_MAIN_SEGMENTS check to check-bundle-architecture.js: warns when core/chains or kit-bg/vaults segments have runtime=main. Currently 24 core.chains.*.CoreChainHd segments are in main because QRWalletGallery (dev page) sync-imports core.chains.evm, which pulls in all chain implementations. This is an architectural violation — core/chains belongs to background, not main. Also bump commonMaxSizeMB budget from 17 to 20 MB.
c5a9bbc to 64d692a
Add Metro resolver redirect: when UNION_BUILD=true, all imports of Developer/router and Developer/pages/Gallery/* resolve to an empty stub (router.empty.ts). This completely removes Gallery pages and their background-only transitive dependencies (core/chains, kit-bg/vaults, bitcoinjs-lib, qr-wallet-sdk) from the production Metro graph: they appear in zero bundles, segments, or manifests.
Before: 24 core.chains.*.CoreChainHd segments with runtime=main
After: 0 Gallery/Developer segments in any runtime
Also add FORBIDDEN_IN_MAIN_SEGMENTS error check to check-bundle-architecture.js to prevent future regressions.
…ge imports
Replace `import { utils } from 'ethers'` in shared/utils with direct
imports from @ethersproject sub-packages to avoid pulling the full
ethers barrel (126 modules) into the common startup bundle.
- hexUtils.ts: import from @ethersproject/bytes and @ethersproject/strings
- ipTableUtils.ts: dynamic import of @ethersproject/wallet for verifyMessage
- ServiceIpTable.ts: await the now-async verifyIpTableConfigSignature
Result: @ethersproject in common 126 → 79 (-37%), ethers barrel 4 → 0.
… CJS
Add Metro resolver redirect: lodash-es → lodash (CJS). Both versions co-existed in common (640 + 241 = 881 modules). Since project code and @onekeyfe/hd-core already use lodash CJS, aliasing lodash-es eliminates 639 redundant modules from common.
Common: 4480 → 3778 modules (-702), 16.9 → 16.1 MB
- DotMap/utils.ts: import only english wordlist instead of full bip39 barrel (removes 11 language JSONs from common)
- Remove Markdown from components barrel export; consumers now import directly from '@onekeyhq/components/src/content/Markdown'. This prevents markdown-it (86 modules) from being pulled into main eager via the components barrel.
Main: 2448 → 2348 modules (-100), markdown-it 86 → 0
forbiddenInStartup (vaults, services) only applies to the main thread. Background thread naturally needs these at startup — checking them produced 41 false-positive violations. Vault impls are already loaded via dynamic import() in factory.ts (lazy per-chain loading). The 15 vault modules in background eager are settings/utils pulled by the global vault settings registry, not the Vault classes themselves. Background violations: 41 → 0
Short-circuiting the eager-module path to a fully synchronous require.importAll turned dynamic import() calls into synchronous lookups. Modules whose factories contain circular await import() edges rely on the microtask yield to break the recursion — once it was gone, Babel's regenerator-driven async generators re-entered themselves without ever suspending, Hermes piled PinnedHermesValues onto the GCScope handle stack, and the app crashed on launch with an EXC_BAD_ACCESS inside llvh::SmallVectorBase::grow_pod under deeply nested generatorPrototypeNext / functionPrototypeApply frames. Restore the original microtask yield and delegate to Metro's asyncRequire; the installProdBundleLoader eager-fallback path added in earlier commits already short-circuits the resulting __loadBundleAsync call gracefully, so the btc/sdkBtc flow keeps working without reintroducing the launch crash. Add a regression test asserting the eager path must not call asyncRequire synchronously before the returned promise suspends.
…ontainer's first-frame path
Previously `hasBalanceCacheInSnapshot` returned true whenever
lastConfirmedOverviewBalanceAtom had either a `latest` string or any
`byOwner` entry — a much looser signal than HomeOverviewContainer
actually needs to render a balance on frame 1. The container reads
`byOwner[currentOverviewOwnerKey]` where ownerKey is derived from the
active account/network, and the `latest` branch additionally requires
runtime-only atoms (overviewTokenCacheState / overviewDeFiDataState)
that have not hydrated yet. On startups where the atoms did not line
up, SplashProvider committed to waiting for HomePageReady, the event
never fired, and the splash hung for the full 5s safety window.
Tighten the probe to only commit to the fast path when the snapshot
carries:
1. lastConfirmedOverviewBalanceAtom.byOwner non-empty;
2. accountSelector@home activeAccountsAtom[0] hydrated with a concrete
account.id / network.id pair;
3. byOwner[`${account.id}__${network.id}`] resolves to a non-empty
string — the exact ownerKey HomeOverviewContainer will compute via
buildOverviewOwnerKey.
Anything short of that now falls back to path 3 (dismiss immediately),
which matches the reality that the container cannot emit HomePageReady
without a byOwner hit on the active account. Existing tests that relied
on the looser signal are updated and new cases cover every combination
(byOwner missing, activeAccounts missing, ownerKey mismatch, exact hit,
mixed with stale accountWorthAtom placeholder).
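The three conditions above can be sketched as a pure probe over the snapshot. The `ColdStartSnapshot` shape and the `buildOverviewOwnerKey` signature are assumptions made for this illustration; only the `${account.id}__${network.id}` key format comes from the commit message.

```typescript
// Hypothetical snapshot shape, reduced to the fields named in the commit message.
interface ColdStartSnapshot {
  lastConfirmedOverviewBalanceAtom?: {
    latest?: string;
    byOwner?: Record<string, string>;
  };
  activeAccountsAtom?: Array<
    { account?: { id?: string }; network?: { id?: string } } | undefined
  >;
}

// Assumed signature; the key format itself is taken from the commit message.
function buildOverviewOwnerKey(accountId: string, networkId: string): string {
  return `${accountId}__${networkId}`;
}

function hasBalanceCacheInSnapshot(snapshot: ColdStartSnapshot): boolean {
  // 1. byOwner must be non-empty; a bare `latest` string is no longer enough.
  const byOwner = snapshot.lastConfirmedOverviewBalanceAtom?.byOwner;
  if (!byOwner || Object.keys(byOwner).length === 0) return false;

  // 2. The first active account must carry a concrete account/network id pair.
  const active = snapshot.activeAccountsAtom?.[0];
  const accountId = active?.account?.id;
  const networkId = active?.network?.id;
  if (!accountId || !networkId) return false;

  // 3. The exact owner key must resolve to a non-empty string.
  const cached = byOwner[buildOverviewOwnerKey(accountId, networkId)];
  return typeof cached === 'string' && cached.length > 0;
}
```

Any `false` result corresponds to the "dismiss immediately" path, so the probe errs toward hiding the splash rather than waiting on an event that may never fire.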
…yHQ/app-monorepo into codex/feat-split-background-thread
…dle scaffolding
- Replace for-continue flow in hasBalanceCacheInSnapshot with Array.find lookups (no-continue)
- Drop the stray blank line before fetchHttpModule in asyncRequireTpl.js (import-order empty-line)
- Spell "behavior" / "mis-routing" per the in-repo cspell dictionary
- Type the lazy require of healthCheck so scheduleSplitBundleHealthCheck is not treated as any (no-unsafe-call)
- Switch the two "throw in mock" test setups to OneKeyLocalError to satisfy the custom onekey/no-raw-error rule (oxlint does not honor eslint-disable comments)
…mortem
Emit a compact module-id-map.json from the union (and legacy) split-bundle serializer, then sync it into the Android APK assets, the iOS .app Resources, and the hot-deploy scripts. Lets us resolve runtime errors like `Requiring unknown module "8192"` straight from an installed build instead of having to ship multi-MB JS bundles or rely on CI artifact archives.
…and Android
Adds a cross-platform timing schema so cold-start logs on both platforms share the same `[StartupTiming]` tag and label namespace, enabling direct data-level comparison between iOS and Android:
- Shared labels: `main_host.did_start`, `bg_runner.start`
- Android native labels (`android.app.*`, `android.activity.*`, `android.zygote_to_app_on_create`) cover the previously invisible window from zygote fork through `MainActivity.onCreate`.
- iOS native labels (`ios.app.*`, `ios.main_entry.*`) replace the free-text hostDidStart / main entry messages with the schema and add previously unmeasured `didFinishLaunching` / `super.application` breakdowns.
The `1k-cold-start-ssr` skill doc is updated with the full label table, iOS/Android baselines, log-pull commands for both platforms, and a one-liner parser for dashboarding.
… "+from launch" is real
A Swift module-level `let` is lazily initialized: the timestamp captured by `private let appLaunchCFTime = CFAbsoluteTimeGetCurrent()` only fires on first read. After commit 18c6799 added `[StartupTiming] ios.app.did_finish_launching.start` as the very first reader, every "+from launch" delta collapsed to ~0ms, hiding the dyld + UIApplication setup window we wanted to measure.
Move the anchor onto `AppDelegate.appLaunchCFTime` (a `static let`) and force its evaluation inside `AppDelegate.init()`, which `UIApplicationMain` calls just after dyld + UIApplication.init complete and before `didFinishLaunching` fires. Existing callsites still reference the same `appLaunchCFTime` symbol via a module-level computed-var passthrough, so no other call sites change.
A future ObjC `+load` bootstrap could push the anchor earlier (into dyld itself), but that requires touching `project.pbxproj`; this pure-Swift fix captures everything except dyld/main/UIApplication.init (~100-200ms typical on modern iOS), which is enough to make the cross-platform timeline usable.
…roid numbers
The "Expected Timelines" section previously had estimate placeholders
("+~150ms", "+~820ms"). Replace with a 5-run cold-start sample taken on
`codex/feat-split-background-thread` after the unified StartupTiming
instrumentation landed:
- Native phase budget table (Application/Activity/RN host) for cold + warm
- Per-phase durations with cold/warm column for variance visibility
- Note that JS parse dominates (75-85% of TTI on both platforms)
- Cross-platform JS-side comparison table (iOS ~2× faster Hermes)
- Caveat that iOS baseline still pending the lazy-init fix (ee1877d)
Bypass HomePageReady / PendingInstallTaskProcessFinished event-based wait to test perceived TTI. With this change, splash hides ~300-500ms earlier in cold-start cases — at the cost of potentially flashing skeleton/empty UI for a brief moment before balance hydrates. Gated behind EXPERIMENT_DISMISS_SPLASH_AFTER_50MS const. Existing event listeners and processPendingInstallTask still attach so behavior on the non-fast paths is preserved if the flag is flipped off.
refreshTokenListMap({
  tokens: mergeTokenListMap,
  split: true,
});
🟡 split: true passed to refreshTokenListMap which does not support it — dead property, likely intended for a different call
The call to refreshTokenListMap at TokenListBlock.tsx:1531-1534 passes { tokens: mergeTokenListMap, split: true }, but refreshTokenListMap's payload type (defined at packages/kit/src/states/jotai/contexts/tokenList/actions.ts:282-293) only destructures { tokens, merge, mergeDerive }. The split property is silently ignored. The split property IS supported by refreshTokenList (at actions.ts:176), which is called separately on line 1529 without it. This looks like split: true was accidentally added to the wrong function call.
refreshTokenListMap({
  tokens: mergeTokenListMap,
});
…and hideSplash logs
9 measurements showed the "SplashProvider mount → hideSplash invoked" window median was ~420ms even though the only scheduled delay was 50ms x 2. Under main-thread-busy (React mount / Provider tree / atoms hydrate) the JS scheduler starved both setTimeouts to 100-350ms each.
Change A: SplashProvider experiment hook now calls setCanDismissSplash(true) synchronously in the mount useEffect instead of via setTimeout(50). React's state scheduler dispatches without going through the setTimeout task queue, so it doesn't compete with React's commit work.
Change B: SplashView.native.tsx also drops its setTimeout(50) around hideSplash(). The effect that calls hideSplash() already runs post-commit, so the extra 50ms "let React commit" pad was not achieving its stated purpose and was instead adding ~320ms of starvation latency.
Expected: hideSplash invoked moves from +2260ms to ~+1900ms JS-entry.
… in the 1.7s
Previously `[StartupTiming] BG transport setup in XXms` aggregated 6 requires (setupMainThreadBackgroundRunner, react-native, expo, sentry, device-utils, ./App). Any single one could be dominating but we could not tell from the log. This commit emits `require.<name>: <dur>ms (+<cumul>ms from entry)` for each of the 6 requires so the next log makes the attribution unambiguous. NativeLogger is pre-loaded above the measurement window so its own import cost doesn't leak into the first `require.bgRunnerSetup` number.
A single build-time env var enables three layers of instrumentation:
1. [StartupProfile.js] — every module's factory self-time + total time,
via `__r` monkey-patch (inline-requires safe)
2. [StartupProfile.hbc] — main bundle I/O ms + size (pre-warm probe)
3. [StartupProfile.seg] — per-segment duration + id + path
When the env var is NOT set (default), every path is a single boolean
check — zero overhead in production bundles, so the instrumentation
can live on main indefinitely.
Revert: apps/mobile/index.ts drops the earlier `_timeRequire` helper
(commit 6f7ff92) — that approach didn't survive Metro's
inline-requires transform. Replaced with __r-level patching which works
regardless of how requires are hoisted.
Files:
- apps/mobile/src/startupProfile/index.ts (new, JS patcher)
- apps/mobile/index.ts (2-line hook)
- apps/mobile/plugins/index.js (Metro prologue inject)
- apps/mobile/src/splitBundle/installProdBundleLoader.ts (seg log)
- apps/mobile/ios/AppDelegate.swift (iOS HBC probe)
- apps/mobile/android/app/build.gradle (BuildConfig flag)
- apps/mobile/android/app/src/main/java/.../MainApplication.java (Android probe)
- .skillshare/skills/1k-startup-profile/skill.md (skill doc)
…uilds
Introduces a dedicated EAS profile qa-internal-startup-profile extending qa-internal with ONEKEY_STARTUP_PROFILE=1. Workflow gains a boolean workflow_dispatch input ENABLE_STARTUP_PROFILE that, when true, forces that profile (internal APK only; no store submission).
Behavior:
- ENABLE_STARTUP_PROFILE=true -> qa-internal-startup-profile (flag ON)
- ONEKEY_ALLOW_SKIP_GPG=true -> qa-internal (unchanged)
- default -> production (unchanged)
The flag propagates through EAS env -> (Metro prologue via plugins/index.js, Gradle via BuildConfig.ONEKEY_STARTUP_PROFILE, iOS via Info.plist/env) so all three layers (JS __r patcher, HBC I/O probe, segment log) turn on together in a single build. See .skillshare/skills/1k-startup-profile.
…ed segment
When a module M is sync-required by two or more async roots (and not main-reachable), the previous Step-3 rescan arbitrarily picked `[...parentSegments][0]` and assigned M to that one segment. Which segment won depended on `graph.dependencies` Map insertion order, so any change to the dependency graph that perturbed the insertion order could flip ownership and make the other roots sync-require M across segments at runtime, producing `Requiring unknown module "NNNN"` crashes (seen with MobileTokenSelector ←→ MarketDetailV2.index via TokenSelector/constants).
This promotes any multi-root sync-shared module to a dedicated `seg:shared.*` segment instead. Step 6's existing segment dependency graph already records each consumer segment as `dependsOn` of segments containing modules it sync-requires, so the runtime pre-loads the shared segment before any consumer's code runs, and the sync-require succeeds. The existing `loadedSegments.has(segmentKey)` check in `installProdBundleLoader` dedupes the shared segment so later consumers don't re-load it.
Extracted the Step 3 loop into `segmentAllocator.reassignDescendantsToSegments` so the new rule can be unit-tested without a full Metro graph:
- single-parent descendant → parent segment
- any main-reachable parent → main
- two+ segment parents → `seg:shared.*`
- insertion-order independence (deterministic assignment)
Added `deriveSharedSegmentKey` helper in `segmentUtils.js` and a matching test block in `segmentUtils.test.js`. All 18 pure-logic tests pass.
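The four rules can be sketched as a single assignment function. This is an illustration under stated assumptions: the `assignDescendant` name, the `'main'` sentinel, the empty-parent fallback, and the naming scheme inside `deriveSharedSegmentKey` are all hypothetical; only the `seg:shared.*` prefix and the rules themselves come from the commit message.

```typescript
const MAIN = 'main';

// Hypothetical naming scheme; the real deriveSharedSegmentKey is not shown in this PR.
function deriveSharedSegmentKey(modulePath: string): string {
  return `seg:shared.${modulePath.replace(/[^A-Za-z0-9]+/g, '.')}`;
}

// parentOwners: the owner ('main' or a segment key) of each sync parent of the module.
function assignDescendant(modulePath: string, parentOwners: string[]): string {
  // De-duplicate and sort so the result never depends on insertion order.
  const owners = Array.from(new Set(parentOwners)).sort();
  // Assumption: a module with no classified parent falls back to main (eager).
  if (owners.length === 0 || owners.includes(MAIN)) return MAIN;
  if (owners.length === 1) return owners[0];
  // Two or more distinct segment parents: promote to a dedicated shared segment.
  return deriveSharedSegmentKey(modulePath);
}

// Usage: the same parents in either order yield the same shared key.
const first = assignDescendant('TokenSelector/constants', ['seg:A', 'seg:B']);
const second = assignDescendant('TokenSelector/constants', ['seg:B', 'seg:A']);
// first === second === 'seg:shared.TokenSelector.constants'
```

Deriving the shared key from the module path rather than from the parent list is one way to keep the key stable when a third consumer appears later.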
…ments
When a common-bundle (shared startup) module async-imports a segment,
that segment MUST be emitted with runtime=shared so both main and
background runtimes can resolve it. Previously, if the two runtimes'
copies of the segment had different module signatures (e.g., different
dependency edges due to graph traversal order), `canShare` returned
false and the segment was split into runtime-specific variants. Common
code — running in both runtimes — then couldn't find the other runtime's
variant, crashing with "segment missing from manifest".
This was the root cause of the `seg:nm.@formatjs` crash: intlShim (in
common bundle) does `await import('@formatjs/intl-locale/polyfill')`.
The polyfill modules' segment got runtime-variant-split, making it
invisible to whichever runtime intlShim happened to run in first.
The fix collects all segment keys that are async-import targets of any
shared-startup module, then forces `canShare = true` for those keys in
writeSegments(). This is correct because common code must see the same
segment in both runtimes.
Extracted `collectCommonReferencedSegmentKeys()` into unionBuildHelpers
for testability. Added 5 unit tests covering: async import from common
→ found in segment, no async imports, target not in segment, both
graphs contribute, non-common modules ignored.
All 52 tests pass (3 suites: segmentUtils, segmentAllocator,
unionBuildHelpers).
Local verification: fresh union build + gradle + deploy to emulator,
app starts without crash, seg:nm.@formatjs correctly emitted as shared.
APK size unchanged (~193 MB local vs ~204 MB CI, difference is
gradle/signing config, not this change).
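A minimal sketch of the collection step and the forced `canShare` override follows. The `GraphModule` shape, the parameter names, the `canShareSegment` helper, and the example paths are assumptions for illustration; the real helper in unionBuildHelpers operates on Metro graphs that are not shown here.

```typescript
// Hypothetical, simplified module record for one runtime's graph.
interface GraphModule {
  path: string;
  asyncImports: string[]; // paths targeted by `await import()`
}

function collectCommonReferencedSegmentKeys(
  graphs: GraphModule[][],
  commonPaths: Set<string>,
  segmentKeyByPath: Map<string, string>,
): Set<string> {
  const keys = new Set<string>();
  for (const graph of graphs) {
    for (const mod of graph) {
      if (!commonPaths.has(mod.path)) continue; // non-common modules are ignored
      for (const target of mod.asyncImports) {
        const key = segmentKeyByPath.get(target);
        if (key !== undefined) keys.add(key); // targets outside any segment are skipped
      }
    }
  }
  return keys;
}

// Segments referenced from common code are shared regardless of signature equality.
function canShareSegment(
  key: string,
  signaturesMatch: boolean,
  forcedShared: Set<string>,
): boolean {
  return forcedShared.has(key) || signaturesMatch;
}

// Usage mirroring the intlShim case (paths are illustrative).
const forcedKeys = collectCommonReferencedSegmentKeys(
  [
    [{ path: 'intlShim.ts', asyncImports: ['@formatjs/intl-locale/polyfill'] }],
    [{ path: 'Send.tsx', asyncImports: ['bitcoinjs-lib'] }],
  ],
  new Set(['intlShim.ts']),
  new Map([
    ['@formatjs/intl-locale/polyfill', 'seg:nm.@formatjs'],
    ['bitcoinjs-lib', 'seg:nm.bitcoinjs-lib'],
  ]),
);
```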
  public ReactHost getReactHost() {
-   return ReactNativeHostWrapper.createReactHost(this.getApplicationContext(), this.getReactNativeHost());
+   if (mReactHost == null) {
+     mReactHost =
+       ReactNativeHostWrapper.createReactHost(
+         this.getApplicationContext(),
+         this.getReactNativeHost()
+       );
+   }
+   return mReactHost;
  }
🔴 Race condition in lazy ReactHost initialization (getReactHost is not thread-safe)
getReactHost() uses a check-then-act pattern (if (mReactHost == null)) without synchronization. On Android, this method can be called concurrently from the UI thread and background threads (e.g., setupBackgroundThreadBootstrap at MainApplication.java:157 calls getReactHost() during onCreate, while the framework may also call it from ReactActivity). Two threads seeing mReactHost == null simultaneously would both create a ReactHost, leading to duplicate initialization and potential crashes or state corruption.
@Nullable
@Override
public synchronized ReactHost getReactHost() {
  if (mReactHost == null) {
    mReactHost =
      ReactNativeHostWrapper.createReactHost(
        this.getApplicationContext(),
        this.getReactNativeHost()
      );
  }
  return mReactHost;
}
…nt sync violations
A segment's emitted bundle may sync-require a module defined in another
segment (`__d(fn, id, [deps])` lists the dep id). For this to not crash
at runtime with "Requiring unknown module <id>", the depended-on segment
must be transitively listed in the segment's `dependsOn` chain so the
runtime loader pre-loads it.
Previously this invariant was only informally maintained by the
allocator. When it slipped (as it did for `seg:nm.@formatjs` and later
for `seg:kit.views.Market.MarketDetailV2.components.TokenSelector.MobileTokenSelector`
→ `seg:kit.views.Market.MarketDetailV2.index`'s `constants.ts`), the
break only surfaced on device at user-visible crash time.
`scripts/check-split-bundle-integrity.js` parses every emitted `.seg.js`
under `dist/segments{,-background}/` and, for each module definition:
1. skips dep ids resolved by the eager bundle (`common` + runtime);
2. skips dep ids defined in the same segment;
3. requires dep's segment to appear in the source segment's transitive
`dependsOn` closure — otherwise reports a `cross_segment_sync`
violation.
The script also reports `missing_manifest_entry` for any `.seg.js` that
has no corresponding manifest record (another latent structural bug).
Wired into `build-bundle.js` right after `runUnionBuild()` so a failure
aborts the build before Hermes compilation. Can be bypassed with
`ONEKEY_SKIP_SPLIT_INTEGRITY_CHECK=1` for local debugging only.
Unit tests in `apps/mobile/scripts/__tests__/check-split-bundle-integrity.test.js`:
- parser: Metro `__d()` shape with nested braces, escaped quotes, `__d(`
tokens inside strings, empty input
- transitive closure: deep chains, cycles, orphans
- integration fixtures mirroring the real `MobileTokenSelector → constants`
crash shape, plus passing counterparts with `dependsOn` coverage,
transitive coverage, and eager-resolution cases
All 93 mobile-related Jest tests pass on the clean local build; the
script reports 0 violations locally, meaning any future regression that
breaks the dependsOn invariant will be caught at build time.
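The invariant the script enforces can be sketched over an already-parsed manifest. The `SegmentInfo` shape, the function names, and the violation string format are assumptions; the real script parses `.seg.js` files and its handling of dep ids that no segment defines is not described, so this sketch simply skips them.

```typescript
// Hypothetical parsed form of one segment: ids it defines, dep ids it
// sync-requires, and its direct dependsOn list from the manifest.
interface SegmentInfo {
  defines: number[];
  requires: number[];
  dependsOn: string[];
}

// Transitive dependsOn closure; the `seen` set makes it safe on cycles.
function transitiveDependsOn(
  key: string,
  manifest: Record<string, SegmentInfo>,
): Set<string> {
  const seen = new Set<string>();
  const stack = [...(manifest[key]?.dependsOn ?? [])];
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (seen.has(next)) continue;
    seen.add(next);
    stack.push(...(manifest[next]?.dependsOn ?? []));
  }
  return seen;
}

function findCrossSegmentSyncViolations(
  manifest: Record<string, SegmentInfo>,
  eagerIds: Set<number>,
): string[] {
  const ownerOf = new Map<number, string>();
  for (const [key, seg] of Object.entries(manifest)) {
    for (const id of seg.defines) ownerOf.set(id, key);
  }
  const violations: string[] = [];
  for (const [key, seg] of Object.entries(manifest)) {
    const closure = transitiveDependsOn(key, manifest);
    for (const dep of seg.requires) {
      if (eagerIds.has(dep)) continue; // 1. resolved by the eager bundle
      const owner = ownerOf.get(dep);
      // 2. same-segment dep; ids defined nowhere are skipped (assumption).
      if (owner === undefined || owner === key) continue;
      // 3. the owning segment must be in the transitive dependsOn closure.
      if (!closure.has(owner)) violations.push(`${key} -> ${dep} (${owner})`);
    }
  }
  return violations;
}
```

Checking the closure rather than the direct `dependsOn` list is what lets a chain such as A depends on B depends on C satisfy a sync-require from A into C.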
`${nodeExecutablePath} ${path.join(
  mobileDirPath,
  'scripts/check-split-bundle-integrity.js',
)}`,
Check warning: Code scanning / CodeQL: Shell command built from environment values (Medium)
Copilot Autofix
Use a non-shell invocation so dynamic paths are passed as arguments, not interpolated into a shell command string.
Best fix in apps/mobile/build-bundle.js (around lines 768–774): replace the execSync string command with spawnSync (or execFileSync) using:
- command: nodeExecutablePath
- args: [path.join(mobileDirPath, 'scripts/check-split-bundle-integrity.js')]
- options: { stdio: 'inherit' }
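This runs the same node executable against the same script and inherits stdio while preventing shell interpretation of path content. Note that spawnSync, unlike execSync, does not throw on a non-zero exit code, so the caller must check the returned status (or use execFileSync, which does throw) for a failed check to still abort the build. Since spawn is already imported, update import to include spawnSync and remove unused spawn if desired.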
@@ -2,7 +2,7 @@
 /* cspell:ignore debugid */
 require('../../development/env');

-const { execSync, spawn } = require('child_process');
+const { spawnSync } = require('child_process');
 const crypto = require('crypto');
 const path = require('path');

@@ -765,11 +765,14 @@
   // bypass during local debugging (never in CI).
   if (process.env.ONEKEY_SKIP_SPLIT_INTEGRITY_CHECK !== '1') {
     log('union build: split-bundle integrity check');
-    execSync(
-      `${nodeExecutablePath} ${path.join(
-        mobileDirPath,
-        'scripts/check-split-bundle-integrity.js',
-      )}`,
+    spawnSync(
+      nodeExecutablePath,
+      [
+        path.join(
+          mobileDirPath,
+          'scripts/check-split-bundle-integrity.js',
+        ),
+      ],
       { stdio: 'inherit' },
     );
     log('union build: split-bundle integrity check passed');
function flushColdStartCache() {
  if (coldStartDirtyKeys.size === 0) return;
  coldStartLog(
    `flush: ${coldStartDirtyKeys.size} dirty keys: ${[...coldStartDirtyKeys].join(', ')}`,
  );
  try {
    // eslint-disable-next-line @typescript-eslint/no-require-imports
    const { coldStartCacheStorage } =
      require('@onekeyhq/shared/src/storage/instance/syncStorageInstance') as typeof import('@onekeyhq/shared/src/storage/instance/syncStorageInstance');
    // eslint-disable-next-line @typescript-eslint/no-require-imports
    const { EAppSyncStorageKeys } =
      require('@onekeyhq/shared/src/storage/syncStorageKeys') as typeof import('@onekeyhq/shared/src/storage/syncStorageKeys');

    // Read-modify-write: patch only dirty keys into existing snapshot.
    // This preserves cached values for scopes not rendered this session.
    // Safe because all callers (debounce timer + AppState) are on main thread.
    const raw = coldStartCacheStorage.getString(
      EAppSyncStorageKeys.onekey_jotai_context_atoms_snapshot,
    );
    const snapshot = raw ? JSON.parse(raw) : {};

    for (const name of coldStartDirtyKeys) {
      snapshot[name] = coldStartValuesMap.get(name);
    }

    coldStartCacheStorage.set(
      EAppSyncStorageKeys.onekey_jotai_context_atoms_snapshot,
      JSON.stringify(snapshot),
    );
    coldStartDirtyKeys.clear();
  } catch {
    /* best-effort */
  }
}
🟡 Cold start cache flush can re-persist stale data after app data clear due to cross-thread race
In dual-thread mode, ServiceApp.clearDataStep calls coldStartCacheStorage.clearAll() on the background thread (packages/kit-bg/src/services/ServiceApp.ts:121). However, the main thread's flushColdStartCache function (packages/kit-bg/src/states/jotai/utils/index.ts:425) may have a pending 2-second debounce timer (coldStartSaveTimer) that fires AFTER the background thread clears the MMKV. When it fires, it reads the now-empty MMKV, patches in stale values from the in-memory coldStartValuesMap (which was never cleared on the main thread), and writes them back. The ensureColdStartAppStateListener flush-on-background also races.
The result: after "Clear All Data", the cold-start cache snapshot can be silently re-populated with stale context atom values. On the next cold start, the app renders stale balance/token data briefly until the network refresh overwrites it.
Prompt for agents
The flushColdStartCache function and ServiceApp.clearDataStep operate on the same MMKV instance from different JS threads (main vs background). When ServiceApp clears the cold-start-cache MMKV, the main thread's debounce timer (coldStartSaveTimer) may still fire and re-persist stale values from the in-memory coldStartValuesMap.
Fix approach: either (1) export and call a clearColdStartInMemoryState() function from ServiceApp that clears coldStartValuesMap, coldStartDirtyKeys, and cancels coldStartSaveTimer before clearing MMKV — this requires cross-thread coordination via appEventBus; or (2) have flushColdStartCache check a 'resetting' flag before writing; or (3) send a 'clear-cold-start-cache' app event from BG to main thread so the main thread clears its in-memory state before the MMKV clear.
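Option (2), the 'resetting' flag, could look roughly like the sketch below. It is a minimal self-contained model, not the code in this PR: `createColdStartCache`, `createMemoryStore`, `beginReset`, `endReset`, and `SNAPSHOT_KEY` are hypothetical names, and the in-memory store stands in for the MMKV instance. How the background thread signals the main thread (for example through an app event) is left out.

```typescript
// Hypothetical stand-in for the MMKV-backed cold-start cache storage.
interface KeyValueStore {
  getString(key: string): string | undefined;
  set(key: string, value: string): void;
  clearAll(): void;
}

// Placeholder for the real snapshot storage key.
const SNAPSHOT_KEY = 'cold_start_snapshot';

function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getString: (key) => data.get(key),
    set: (key, value) => {
      data.set(key, value);
    },
    clearAll: () => {
      data.clear();
    },
  };
}

function createColdStartCache(storage: KeyValueStore) {
  const values = new Map<string, unknown>();
  const dirtyKeys = new Set<string>();
  let resetting = false;

  return {
    markDirty(name: string, value: unknown): void {
      if (resetting) return; // ignore updates while a reset is in progress
      values.set(name, value);
      dirtyKeys.add(name);
    },
    // Main thread: call before the storage is cleared elsewhere.
    beginReset(): void {
      resetting = true;
      values.clear();
      dirtyKeys.clear();
    },
    endReset(): void {
      resetting = false;
    },
    // Read-modify-write, skipped entirely while a reset is in progress.
    flush(): void {
      if (resetting || dirtyKeys.size === 0) return;
      const raw = storage.getString(SNAPSHOT_KEY);
      const snapshot: Record<string, unknown> = raw ? JSON.parse(raw) : {};
      dirtyKeys.forEach((name) => {
        snapshot[name] = values.get(name);
      });
      storage.set(SNAPSHOT_KEY, JSON.stringify(snapshot));
      dirtyKeys.clear();
    },
  };
}
```

Because `beginReset()` drops both the in-memory values and the dirty set, a debounce timer that fires after the clear has nothing stale left to write back.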
| await this.executePendingInstallTask(runningTask); | ||
| shouldEmitProcessFinishedEvent = false; |
🟡 Successful executePendingInstallTask suppresses PendingInstallTaskProcessFinished event, leaving splash stuck if restart doesn't happen
In servicePendingInstallTask.ts, when executePendingInstallTask succeeds (line 1502-1503), shouldEmitProcessFinishedEvent is set to false because a restart is expected. However, the SplashProvider (packages/kit/src/provider/SplashProvider.tsx) listens for PendingInstallTaskProcessFinished to dismiss the splash screen when hasPendingInstallTask() returns true. If the OTA bundle switch succeeds but the native restart mechanism fails or is delayed, no PendingInstallTaskProcessFinished event is emitted and the splash screen remains stuck until the 5-second safety timer fires.
Code flow
- SplashProvider sees `hasPendingInstallTask()=true` → listens for `PendingInstallTaskProcessFinished`
- `executePendingInstallTask` succeeds → `shouldEmitProcessFinishedEvent = false` (line 1503)
- Native restart doesn't happen immediately
- Splash stays stuck for up to 5 seconds (safety timer)
With EXPERIMENT_DISMISS_SPLASH_ON_MOUNT = true this is currently masked, but if the experiment is reverted to false, users would see a 5-second splash delay on every successful OTA apply.
EAS's gradle unionBuildProdReleaseJsBundle task doesn't populate out-dir-bundle/<platform>/ — that path is only written by the local build-bundle.js --platform flow. unionBuild.js always writes the authoritative map to apps/mobile/dist/module-id-map.json, so prefer that and fall back to out-dir-bundle only when missing.
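The lookup order described above can be sketched as a small resolver. This is an illustration under assumptions, not the script's actual code: `resolveModuleIdMapPath` is a hypothetical name, the file name under `out-dir-bundle/<platform>/` is assumed to match, and the existence check is injected so the sketch needs no filesystem access.

```typescript
type FileExists = (filePath: string) => boolean;

function resolveModuleIdMapPath(
  mobileRoot: string,
  outDirBundle: string,
  platform: string,
  fileExists: FileExists,
): string | undefined {
  const candidates = [
    // Authoritative map written by unionBuild.js.
    `${mobileRoot}/dist/module-id-map.json`,
    // Fallback from the local build-bundle.js --platform flow
    // (file name under out-dir-bundle/<platform>/ is an assumption).
    `${outDirBundle}/${platform}/module-id-map.json`,
  ];
  for (const candidate of candidates) {
    if (fileExists(candidate)) return candidate;
  }
  return undefined;
}
```

Returning `undefined` when neither file exists lets the caller decide whether a missing map is fatal.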
| await Promise.all( | ||
| expectedKeys.map(async (key) => { | ||
| try { | ||
| const value = await this.readFromAsyncStorage(key); | ||
| if (value !== null && value !== undefined) { | ||
| this.store.set(key as any, JSON.stringify(value) ?? ''); | ||
| migrated += 1; | ||
| } else { | ||
| absent += 1; | ||
| } | ||
| } catch (e) { | ||
| errors += 1; | ||
| this.log(`migration read error for ${key}: ${(e as Error)?.message}`); | ||
| } | ||
| }), | ||
| ); |
🟡 Migration counter variables migrated, absent, errors mutated concurrently in Promise.all
In JotaiStorageNativeMMKV.migrateFromAsyncStorage(), the migrated, absent, and errors counter variables are incremented inside concurrent Promise.all callbacks without synchronization. While JavaScript is single-threaded so there's no true data race, the variables are let locals captured by multiple async closures that interleave at await points. In practice this works correctly because += 1 on a number is atomic in the JS event loop (each increment is a single-tick operation between awaits), but the final count log message may have slightly stale values if the engine reorders microtask completion. This is non-severe since it only affects diagnostic counters.
segment-manifest.json only carries dependsOn/id/runtime/sha256 — the per- segment modules map lives in module-id-map.json's `.segments[key].modules`. The previous buildModuleIndexFromManifest(manifest) call silently returned an empty Map against the real manifest, which made every cross-segment dep lookup fall through to "not in any segment in this runtime" and turned the whole integrity check into a no-op that always reported zero violations. Add a new buildModuleIndex(idMap, manifest) that reads from the right place (filtered by manifest membership for per-runtime scoping). Keep buildModuleIndexFromManifest as a compat shim for the embedded-modules fixture shape used by earlier tests. Rewrite the integration-test fixtures with a shared buildSegFixtures() helper so manifest and idMap mirror the real on-disk separation. Add a regression test that uses the exact shape segmentSerializer emits (manifest entries without `modules`) — it would have failed against the previous code.
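To illustrate the fix, here is a minimal sketch of an index built from the id map and scoped by manifest membership. The type shapes are assumptions made for this example (segments keyed by the same key in both files, `modules` as an id-to-path record), and the body is not the PR's implementation of `buildModuleIndex`.

```typescript
// Assumed shapes, for illustration only.
interface ManifestSegment {
  id: string;
  runtime: string;
  sha256: string;
  dependsOn: string[];
}
interface SegmentManifest {
  segments: Record<string, ManifestSegment>;
}
interface ModuleIdMap {
  segments: Record<string, { modules: Record<string, string> }>;
}

// Maps each module id to the segment that owns it, considering only
// segments listed in the given (per-runtime) manifest.
function buildModuleIndex(
  idMap: ModuleIdMap,
  manifest: SegmentManifest,
): Map<string, string> {
  const index = new Map<string, string>();
  for (const key of Object.keys(manifest.segments)) {
    const entry = idMap.segments[key];
    if (!entry) continue; // in the manifest but absent from the id map
    for (const moduleId of Object.keys(entry.modules)) {
      if (!index.has(moduleId)) index.set(moduleId, key);
    }
  }
  return index;
}
```

With this shape, a manifest entry that carries no `modules` field still yields a populated index, which is the case the regression test targets.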
| setResult(r); | ||
| if (swrKeyRef.current) { | ||
| swrCacheUtils.set(swrKeyRef.current, r); | ||
| } |
🟡 SWR cache stores undefined results, poisoning initResult on subsequent mounts
When swrKey is set and the async method resolves to undefined, swrCacheUtils.set(swrKeyRef.current, r) writes {d: undefined, t: ...} to the SWR store. On the next mount, swrCacheUtils.getWithTimestamp returns {data: undefined, updatedAt: ...} (a defined object), so the check swrCacheEntry !== undefined at line 117 is truthy, and effectiveInitResult becomes undefined — overriding any explicit options.initResult. This means a hook that temporarily resolves to undefined (e.g., during an edge-case empty response) will have its initResult override permanently suppressed for the lifetime of that cache entry.
How the poisoning chain works
- `method()` resolves to `undefined` → line 241 sets result
- Line 243: `swrCacheUtils.set(key, undefined)` → store entry `{d: undefined, t: now}`
- Next mount: `getWithTimestamp(key)` returns `{data: undefined, ...}` (not `undefined`)
- Line 117-118: `swrCacheEntry !== undefined` is `true` → `effectiveInitResult = undefined`
- `options.initResult` is completely ignored
| setResult(r); | |
| if (swrKeyRef.current) { | |
| swrCacheUtils.set(swrKeyRef.current, r); | |
| } | |
| if (shouldSetState(config) && nonceRef.current === nonce) { | |
| setResult(r); | |
| if (swrKeyRef.current && r !== undefined) { | |
| swrCacheUtils.set(swrKeyRef.current, r); | |
| } | |
| } |
unionBuild.js bypasses Metro's customSerializer (calls `baseJSBundle()` directly), so the `ONEKEY_STARTUP_PROFILE=1` prologue injection that lived in plugins/index.js never ran for EAS union builds. Result: the `qa-internal-startup-profile` APK shipped without `globalThis.__ONEKEY_STARTUP_PROFILE__`, so `installStartupProfileJs()` returned immediately and no `[StartupProfile.js]` lines appeared in logs — the whole JS-side profile was silently dead. Move the prologue builder into a shared helper and call it from both the default Metro path (plugins/index.js) and the union-build path (unionBuild.js writeBundle). Injection happens in the common bundle's preSection — common loads first in both runtimes, so the global flag is set before any __d or the main entry's install hook. Adds unit tests covering the helper's env gating, id→path trimming, and degenerate-input handling, plus a regression test that asserts unionBuild.js still imports and calls the helper inside an includePre-gated branch.
| }); | ||
| if (shouldSetState(config) && nonceRef.current === nonce) { | ||
| setResult(r); |
🔴 SWR cache writes stale data on swrKey change because swrKeyRef.current is updated after the write
In usePromiseResult.ts, the SWR cache write on line 241 uses swrKeyRef.current to decide which key to write the fresh result to. However, when swrKey changes between renders, swrKeyRef.current is updated at render time (line 109) before the async method re-runs. The async method closure captures swrKeyRef (a ref), so it reads the new key after the prop change — which is correct. But the issue is subtle: swrKeyRef.current = swrKey at usePromiseResult.ts:109 runs on every render, so when the async method from a previous swrKey scope resolves after the key has already changed, it writes its stale result to the new key because swrKeyRef.current already points to the new key. This causes cross-scope cache pollution — e.g., wallet A's data could be written under wallet B's cache key if the user switches wallets while a request is in-flight.
Prompt for agents
In packages/kit/src/hooks/usePromiseResult.ts, the SWR cache write at line 241 uses swrKeyRef.current which may have already been updated to a new key by the time an in-flight async method from a previous key scope resolves. This causes stale data from the old scope to be written to the new scope's cache key.
Fix approach: Capture the swrKey at the time the async method is invoked (inside the run/execute function, not via the ref), and use that captured key for the cache write. For example, capture `const capturedSwrKey = swrKeyRef.current` at the start of the run() function, and use `capturedSwrKey` instead of `swrKeyRef.current` in the success handler. Also check that `capturedSwrKey === swrKeyRef.current` before writing to avoid the race entirely.
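The capture-then-compare idea can be shown outside React with a small model. `beginSwrRequest` and `getCurrentKey` are hypothetical names used only for this sketch; `getCurrentKey` plays the role of reading `swrKeyRef.current`, and a plain `Map` stands in for the SWR store.

```typescript
// Call when a request starts; returns the handler to call when it resolves.
// The handler reports whether the result was written to the cache.
function beginSwrRequest<T>(
  cache: Map<string, T>,
  getCurrentKey: () => string | undefined,
): (result: T | undefined) => boolean {
  // Capture the key that was active when the request was issued.
  const capturedKey = getCurrentKey();
  return (result) => {
    if (capturedKey === undefined || result === undefined) return false;
    // The key changed while the request was in flight: drop the result.
    if (capturedKey !== getCurrentKey()) return false;
    cache.set(capturedKey, result);
    return true;
  };
}
```

A request started under one wallet's key and resolved after a wallet switch returns `false` and leaves the cache untouched.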
| function createRemoteCallId() { | ||
| for (let attempt = 0; attempt < MAX_REMOTE_CALL_SLOT_COUNT; attempt += 1) { | ||
| requestSequence = (requestSequence + 1) % MAX_REMOTE_CALL_SLOT_COUNT; | ||
| const callId = `${requestSequence}`; | ||
| if (!pendingRemoteCalls.has(callId)) { | ||
| return callId; | ||
| } | ||
| } | ||
|
|
||
| throw createTransportError('Too many pending background requests'); | ||
| } |
🟡 Remote call slot ID reuse: ring-buffer requestSequence wraps to 0 which collides with existing callId="0" slot
In setupMainThreadBackgroundRunner.ts, createRemoteCallId() uses requestSequence = (requestSequence + 1) % MAX_REMOTE_CALL_SLOT_COUNT to generate call IDs. When requestSequence wraps from 511 back to 0, it produces callId = "0". This ID "0" can collide with a pending call that was dispatched when the sequence was previously at 0 (if that call hasn't timed out yet — the timeout is 30s). The for loop check !pendingRemoteCalls.has(callId) should catch this, but the issue is that ID "0" is a perfectly valid slot. With 512 slots and a 30s timeout, under heavy load all 512 slots could be occupied, causing createRemoteCallId() to throw 'Too many pending background requests' even though some are about to complete. This is an edge-case capacity issue rather than a logic error.
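The wrap-around behaviour is easier to see in a self-contained model of the same algorithm. `createCallIdAllocator`, `acquire`, and `release` are hypothetical names; the slot count is a parameter so the example can use 3 slots instead of 512.

```typescript
function createCallIdAllocator(slotCount: number) {
  const pending = new Set<string>();
  let sequence = 0;

  return {
    // Same scan as createRemoteCallId(): advance the ring and take the
    // first slot that is not currently pending.
    acquire(): string {
      for (let attempt = 0; attempt < slotCount; attempt += 1) {
        sequence = (sequence + 1) % slotCount;
        const callId = `${sequence}`;
        if (!pending.has(callId)) {
          pending.add(callId);
          return callId;
        }
      }
      throw new Error('Too many pending background requests');
    },
    // Called when a request settles or times out.
    release(callId: string): void {
      pending.delete(callId);
    },
  };
}
```

With three slots the allocator hands out "1", "2", "0", throws on the fourth call, and after `release("2")` returns "2" again. A wrapped ID is therefore never reused while still pending, and the only failure mode is every slot being occupied at once.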
| if (!valueUr) { | ||
| throw new OneKeyLocalError('valueUr is required for animated QRCode'); | ||
| } | ||
| const { nextPart, encodeWhole } = airGapUrUtils.createAnimatedUREncoder({ | ||
| ur: valueUr, | ||
| maxFragmentLength: 30, | ||
| firstSeqNum: 0, | ||
| }); | ||
| if (process.env.NODE_ENV !== 'production') { | ||
| console.log('QRCode >>>> encodeWhole', encodeWhole()); | ||
| console.log(`\n\n ${encodeWhole().join('\n\n').toUpperCase()} \n\n`); | ||
| } | ||
| // const urEncoder = new UREncoder(UR.fromBuffer(Buffer.from(value))); | ||
| timerId = setInterval(() => { | ||
| const part = nextPart(); | ||
| setPartValue(part); | ||
| }, interval); | ||
| void (async () => { | ||
| const { airGapUrUtils } = await import('@onekeyhq/qr-wallet-sdk'); | ||
| // Guard against unmount/deps-change during the async import so we | ||
| // don't create an interval that no cleanup will ever reach. | ||
| if (cancelled) return; | ||
| const { nextPart, encodeWhole } = airGapUrUtils.createAnimatedUREncoder( | ||
| { | ||
| ur: valueUr, | ||
| maxFragmentLength: 30, | ||
| firstSeqNum: 0, | ||
| }, | ||
| ); | ||
| if (process.env.NODE_ENV !== 'production') { | ||
| console.log('QRCode >>>> encodeWhole', encodeWhole()); | ||
| console.log(`\n\n ${encodeWhole().join('\n\n').toUpperCase()} \n\n`); | ||
| } | ||
| timerId = setInterval(() => { | ||
| const part = nextPart(); | ||
| setPartValue(part); | ||
| }, interval); | ||
| })(); |
🟡 QRCode animated mode leaks interval timer when the async import() rejects
In packages/components/src/content/QRCode/index.tsx, the animated QR code effect uses a dynamic import('@onekeyhq/qr-wallet-sdk'). If this dynamic import rejects (e.g., a lazy-loaded segment fails to load), the async IIFE exits before it reaches setInterval, so timerId stays undefined. The cleanup function's if (timerId) clearInterval(timerId) won't crash and no timer is actually created, but the rejection goes unhandled: void (async () => { ... })() only discards the returned promise, it does not catch it. The real issue is the missing .catch(): if the import fails, there's no error handling and no user feedback.
Code reference
At packages/components/src/content/QRCode/index.tsx:300-301, void (async () => { ... })() has no .catch() handler.
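One possible shape for the missing error handling, written as a plain function rather than a React effect. `startAnimatedParts`, `loadEncoder`, `onPart`, and `onError` are hypothetical names; `loadEncoder` stands in for the dynamic import plus encoder creation, and what the component does in `onError` is left open.

```typescript
interface PartEncoder {
  nextPart: () => string;
}

// Starts the animation once the encoder has loaded and returns a cleanup
// function suitable for use as an effect's return value.
function startAnimatedParts(
  loadEncoder: () => Promise<PartEncoder>,
  onPart: (part: string) => void,
  onError: (error: unknown) => void,
  intervalMs: number,
): () => void {
  let cancelled = false;
  let timerId: ReturnType<typeof setInterval> | undefined;

  loadEncoder()
    .then(({ nextPart }) => {
      // Cleanup already ran: do not create a timer nobody will clear.
      if (cancelled) return;
      timerId = setInterval(() => onPart(nextPart()), intervalMs);
    })
    .catch((error: unknown) => {
      // A failed load is reported instead of becoming an unhandled rejection.
      if (!cancelled) onError(error);
    });

  return () => {
    cancelled = true;
    if (timerId !== undefined) clearInterval(timerId);
  };
}
```

The `cancelled` guard matches the one already in the diff; the only new piece is the `.catch()` branch.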
The common bundle now carries the prologue (seg-side logs confirm
`globalThis.__ONEKEY_STARTUP_PROFILE__` is true), but `[StartupProfile.js]`
summary lines never appear. That narrows to two silent return paths
inside `installStartupProfileJs` / `flushStartupProfileJs`:
1. `typeof g.__r !== 'function'` — Metro's require isn't on global in
this Hermes/union-build setup.
2. `stats.size === 0` — install ran but didn't wrap, so flush finds
an empty map and bails.
Emit one NativeLogger line at each decision point so the next CI run
reveals which branch silently returned — without that signal further
guesses are shots in the dark.
Temporary: remove once the root cause is identified + fixed.
Summary
- […] `BackgroundApi`, `ServiceBootstrap`, `service*` instances, database access, and Jotai background state ownership into `apps/mobile/background.ts`.
- […] `main.jsbundle.hbc` + `background.bundle`.
- […] `common` + `main` + `background`) with union build, segment manifest, MetadataV2 OTA support, and startup-graph budgets in CI.
Scope Overview
This PR started as "split the background thread" but has grown into a broader startup and bundle architecture overhaul. The major subsystems landed here:
BackgroundApi+ services run in a dedicated JS runtime, main runtime is UI-onlycommon/main/background)LazyLoadPageto shrink the main runtime startup graphreact-native-aes-crypto/tcp-socket/zip-archivepatches with@onekeyfeTurboModule versions via npm aliasandroid-release-build-deploy.sh,ios-release-build-deploy.sh, build-bundle--platformflag, timing summarycore/chains/trontoshared/consts/chainConstsSubsystems 2–11 are not strictly required by "move BackgroundApi off the UI runtime", but they are all wired into the same bundle, same CI, and same release flow, so they ship together.
Intent & Context
Mobile had already introduced the native background-thread package, but the app still behaved like a single-JS-runtime system: the UI runtime instantiated the real `BackgroundApi`, owned service bootstrap, and executed provider/event flows locally. That architecture drifted away from the extension model and made startup, wallet bootstrap, Jotai sync, and dApp/provider traffic compete directly with rendering and interaction on the same JS runtime.
This PR completes the mobile background-thread split so mobile matches the extension mental model: the main runtime keeps UI-facing objects and proxy/relay responsibilities, while the background runtime owns the real service graph and long-lived background state. Because moving `BackgroundApi` off the UI runtime exposes (and requires fixes for) a long tail of cross-runtime issues (bundle layout, WebEmbed bridge, Jotai storage lifecycle, cold-start UX, and release tooling), those subsystems also land in this PR.
Root Cause
- `backgroundApiProxy` still instantiated the real `BackgroundApi` in the native UI runtime for non-extension environments.
- `ServiceBootstrap`, `service*`, `simpleDb`, `localDb`, Jotai background state, and provider/event dispatch all shared the same JS runtime as React UI and WebView.
Design Decisions
mainandbackgroundto make environment-dependent initialization deterministic.backgroundApiProxy, and transport/relay logic only.BackgroundApi, service bootstrap, service instances, DB access, and Jotai source of truth into the background runtime.runtime-readywithin the timeout, the transport rejects all queued and in-flight calls (fail loud) rather than silently falling back to a main-runtimeBackgroundApi, so startup regressions surface rather than hide behind a degraded-mode path.main.jsbundle.hbcandbackground.bundleas a versioned bundle pair in both built-in assets and OTA payloads, and extend this to a three-bundle (common/main/background) model to deduplicate shared modules.Architecture Before
flowchart LR
  subgraph R1["Single JS Runtime on Mobile"]
    UI["React UI / Providers"]
    Proxy["backgroundApiProxy"]
    API["BackgroundApi"]
    Services["ServiceBootstrap + service*"]
    DB["simpleDb / localDb"]
    Bridge["JsBridgeNativeHost / WebView bridge"]
    Jotai["Jotai background atoms"]
    Events["appEventBus"]
  end
  UI --> Proxy
  Proxy --> API
  API --> Services
  Services --> DB
  Bridge --> Proxy
  API --> Bridge
  API --> Jotai
  API --> Events
Previous runtime shape
- `backgroundApiProxy` was effectively a local direct-call layer in native UI mode.
- `BackgroundApi` and all `service*` instances were created inside the same runtime that had to render the first screen and keep interactions responsive.
Architecture After
flowchart LR
  subgraph Main["Main Runtime (UI)"]
    UI2["React UI / Providers"]
    Proxy2["backgroundApiProxy"]
    Queue["runtime-ready barrier\npending queue\nfallback-local"]
    Relay["provider relay\nappEvent relay"]
    Bridge2["JsBridgeNativeHost / WebView"]
    SSR["cold-start SSR cache"]
  end
  subgraph BG["Background Runtime"]
    API2["BackgroundApi"]
    Services2["ServiceBootstrap + service*"]
    DB2["simpleDb / localDb"]
    Jotai2["Jotai source of truth (MMKV)"]
    Events2["appEventBus handlers"]
  end
  UI2 --> Proxy2
  Proxy2 --> Queue
  Bridge2 --> Relay
  Relay --> Queue
  Queue <--> RPC["SharedRPC request / response"]
  RPC --> API2
  API2 --> Services2
  Services2 --> DB2
  API2 --> Jotai2
  API2 --> Events2
  Events2 --> RPC
  RPC --> Relay
  Relay --> Bridge2
  SSR --> UI2
New runtime shape
apps/mobile/background.tsbecomes the real mobile background entry and owns the full background service graph.BackgroundApiwhen native background-thread mode is enabled.SplashProvider,HardwareServiceProvider,Bootstrap, and other startup paths are held behind a ready barrier, then flushed in order afterruntime-ready.runtime-readywithinREADY_TIMEOUT_MS, the transport entersremote-brokenand rejects all queued and in-flight calls. There is no silent local fallback — the main runtime never holds a liveBackgroundApiin dual-thread mode.8081+8082), while release and OTA use a three-bundle (common+main+background) model with paired versioning.runtime-readyfor a visible UI.Startup Path To Home Container
This is the concrete startup chain from `apps/mobile/index.ts` to the first `HomePageContainer` / `HomePageView` render, and it is where the architectural win is most visible.
Before: startup and home render path
Why the old path was expensive
Appwas enough to pullSplashProvider, which pulledbackgroundApiProxy, which eagerly created the realBackgroundApiin the main runtime.new BackgroundApi()immediately kicked offserviceBootstrap.init()and wired DB/service dependencies before the first real home render path had stabilized.SplashProvider.processPendingInstallTask(),HardwareServiceProvider.serviceHardware.init(),Bootstrapside effects, navigation mount, andHomePageViewdata fetches all landed on the same runtime.HomePageContainermounted, the main runtime had already paid for background graph construction and was still processing more startup-side service calls.After: startup and home render path
flowchart TD A1["apps/mobile/index.ts\nset runtime=main"] --> B1["setupMainThreadBackgroundRunner()\ninstall transport state + ready barrier"] A1 --> C1["require App"] N1["Native host lifecycle\n(iOS hostDidStart / Android ReactContext ready)"] --> N2["install SharedRPC in main runtime\nstart background runner"] N2 --> BG1["apps/mobile/background.ts\nset runtime=background"] BG1 --> BG2["create real BackgroundApi\nserviceBootstrap.init()\nDB/service wiring"] BG2 --> BG3["install RPC handler + emit runtime-ready"] C1 --> D1["App -> KitProvider module graph"] D1 --> E1["SplashProvider imports backgroundApiProxy\nproxy only, no local BackgroundApi creation"] D1 --> F1["HardwareServiceProvider mounts"] E1 --> G1["early startup calls"] F1 --> G1 G1 --> H1{"runtime-ready?"} H1 -- no --> Q1["queue in main runtime"] H1 -- yes --> R1["SharedRPC request to background runtime"] BG3 --> Q1 Q1 --> R1 D1 --> I1["SplashProvider allows Container mount\n(gated by HomePageReady)"] I1 --> J1["NavigationContainer -> RootNavigator -> TabNavigator"] J1 --> K1["Initial tab = Home"] K1 --> L1["Eager HomePageContainer (no Suspense frame)"] L1 --> S1["Home renders from cold-start SSR cache\n(tokenList / tokenListState / accountSelector)"] S1 --> M1["HomePageView real data requests\nover SharedRPC to warmed background runtime"] R1 --> M1 M1 --> P1["Main runtime stays focused on\nReact mount / navigation / layout / gestures"]Optimization points on the new path
backgroundApiInit()no longer runs eagerly in the native main runtime during theAppimport chain.serviceBootstrap.init(), DB wiring, and real service ownership move intoapps/mobile/background.ts, so the heavy bootstrap cost leaves the UI runtime.SplashProvider,HardwareServiceProvider, andBootstrapcan still fire early, but their calls either queue briefly or execute through SharedRPC instead of forcing main-runtime background graph creation.HomePageContainermounts eagerly and renders from the cold-start SSR cache while the background runtime is still booting, so there is no Suspense frame and no blank token list.HomePageReadyfires, ensuring the first visible frame is a real Home, not a skeleton.Subsystem Details
1. Native background runtime
apps/mobile/background.ts— background entry, owns realBackgroundApi, sets runtime identity before any shared/background imports.apps/mobile/src/backgroundThread/—rpcProtocol,runtimeReady,runtimeState,setupBackgroundThreadRPCHandler,setupMainThreadBackgroundRunner.BackgroundApiProxyBaseroutes native main-runtime calls through SharedRPC. If the background runtime fails to emitruntime-readywithinREADY_TIMEOUT_MS, all queued and in-flight calls are rejected hard — no local fallback, sincebackgroundApiInit.native-ui.tsis anull-returning stub and the main runtime deliberately does not carry a secondBackgroundApiinstance.hostDidStartand AndroidReactContext readyinstall SharedRPC and start the background runner at the right lifecycle boundary.ServiceBootstrapsplit into critical (startup) and deferred phases so the background runtime can emitruntime-readyearlier.2. Three-bundle split (
common/main/background)apps/mobile/scripts/unionBuild.js+unionBuildHelpers.jsbuild a union module graph across both runtimes and split it into three segments.apps/mobile/plugins/segmentPaths.js,segmentSerializer.js,segmentUtils.js,asyncRequireTpl.js,entryReachability.js— Metro plugins for segment allocation, async-require rewrite, reachability analysis.apps/mobile/bundle-groups.config.js— declarative segment group definitions.apps/mobile/src/splitBundle/—installProdBundleLoader,segmentManifest,runtimeInfo,nativeBridge,nativeBridgeBackground.commonthen the runtime-specific entry bundle; nativeSplitBundleLoadermodule loads segments on demand.MetadataV2in OTA carriesbundleFormatandcommonEntryso OTA can ship the three-bundle triple together.apps/mobile/scripts/build-release-background-bundle.jsproduces the releasebackground.bundle.LazyLoadPageso screens land in async segments rather than the main-runtime startup graph.3. Cold-start SSR cache
packages/shared/src/storage/instance/coldStartCacheMMKVInstance.ts— dedicated MMKV instance for cold-start state.packages/shared/src/consts/jotaiConsts.ts— centralizedcoldStartCacheKey, scoped per context store id and per provider.packages/shared/src/utils/swrCacheUtils.ts+usePromiseResultSWR integration — persist/restore async results keyed by input identity.packages/kit/src/states/jotai/contexts/tokenList/atoms.ts— cachedtokenListAtom,tokenListMapAtom,tokenListStateAtomwith final-ready flag.HomePageContaineris eager-imported;HomePageReadysignal gates Splash dismissal.packages/shared/src/lazyLoad/index.tsx+lazySdkLoader.ts— lightweight-charts, ethers, etc. become lazy SDK loaders so they don't load on cold start.4. Jotai AsyncStorage → MMKV migration
packages/shared/src/storage/instance/jotaiMMKVStorageInstance.ts— dedicated Jotai MMKV instance.packages/shared/src/storage/instance/syncStorageInstance.ts— refactored into acreateMMKVSyncStoragefactory used by both cold-start cache and Jotai storage.packages/kit-bg/src/states/jotai/jotaiStorage.ts—JotaiStorageNativeMMKVwith per-key migration from AsyncStorage. Migration runs only on the background runtime; the main runtime is read-only until themigration-completeprobe is set. Failures keep the flag unset so the next boot retries.jotaiStorage.test.ts+syncStorageInstance.test.ts.5. Startup-graph budget CI
apps/mobile/scripts/analyze-startup-graph.js— extracts the main-runtime startup module graph.apps/mobile/scripts/check-startup-graph-budget.js— enforces per-segment size and module-count budgets; forbids known background-only modules on the main runtime.apps/mobile/scripts/check-bundle-architecture.js— validates three-bundle segment allocation and cross-runtime uniqueness..github/workflows/startup-graph-budget.yml— runs on PRs..github/workflows/bundle-architecture-check.yml— runs daily.forbiddenInStartupguards with a background-entry exemption.6. Lazy route loading
packages/kit/src/views/AssetList/router/index.ts,ChainSelector/router/index.ts,Home/router/index.ts,ManualBackup/router/index.tsx,Perp/router/index.ts,Send/router/index.ts,Swap/router/index.tsx,TestModal/router/index.ts,Setting/router/*,Developer/router.empty.ts, etc. — direct screen imports replaced withLazyLoadPage.ErrorBoundaryadded toLazyLoadPageso segment load failures are recoverable.7. WebEmbed dual-runtime bridge fix
packages/kit/src/components/WebViewWebEmbed/+packages/kit-bg/src/webembeds/instance/webembedApiProxy.ts— WebEmbed bridge calls flow through reverse RPC so background can call back into WebEmbed hosted in the main runtime.packages/shared/src/logger/scopes/app/scenes/webembed.ts— diagnostic logging for cross-runtime WebEmbed calls.docs/plans/2026-04-06-fix-webembed-dual-thread.md— design doc.8. Legacy RN TurboModule replacement via npm alias
react-native-aes-crypto/react-native-tcp-socket/react-native-zip-archivepatches removed.@onekeyfeTurboModule versions viapackage.jsonnpm alias — call sites stay the same.development/scripts/minimum-release-ageupdated to resolve npm alias packages correctly.development/scripts/upgrade-modules.jsupdated to match the new layout.react-native+0.81.5.patchupdated,expo+54.0.26.patchadded.9. Release build scripts and tooling
development/scripts/android-release-build-deploy.shandios-release-build-deploy.sh— reproducible release builds with deploy hooks.apps/mobile/build-bundle.jsgains a--platformflag and a timing/total-time summary.apps/mobile/e2e/bundleUpdate.harness.ts+jest.harness.config.mjs— harness coverage for iOS bundle-pair loading.10. Huawei flavor removal
11. Tron constants refactor
packages/core/src/chains/tron/constants.ts→packages/shared/src/consts/chainConsts.ts.chainConstsso they can be imported from shared without pulling the Tron core module into the main-runtime startup graph.Changes Detail
apps/mobile/background.tsas the dedicated background runtime entry and set runtime identity before any shared/background imports.platformEnv/ build-time env withENABLE_NATIVE_BACKGROUND_THREAD,isNativeMainThread, andisNativeBackgroundThread-driven behavior.BackgroundApiProxyBaseso native main runtime calls flow through SharedRPC transport instead of direct local service execution when background-thread mode is enabled.BackgroundApicreation in native main runtime; local creation is used only in non-background-thread environments (desktop/web, extension, or when the feature flag is off). In dual-thread native, the main runtime never holds a localBackgroundApi— the native-ui factory stub returnsnulland the proxy routes everything through SharedRPC.common,main, andbackgroundbundles.main.jsbundle.hbcandbackground.bundlemove together and are guarded byrequiresBackgroundBundle/backgroundProtocolVersion, withcommon.jsbundleshared between the two.usePromiseResult, gated byHomePageReady.LazyLoadPageto shrink the main runtime startup graph.--platformflag forbuild-bundle.js.shared/consts/chainConsts.Expected Performance Impact
These are the expected gains from the completed architecture, based on moving service bootstrap, database work, provider flows, and background state ownership off the UI runtime, combined with the three-bundle split and cold-start SSR cache:
BackgroundApi,ServiceBootstrap, and background-side state graph no longer initialize in the UI runtime, and background-only modules no longer ship in the main bundle.tokensStartMs: expected improvement of roughly 15% to 30% because Home refresh no longer competes as directly with background bootstrap and service initialization on the same JS runtime, and Home renders from the cold-start SSR cache.tokensSpanMs: expected improvement of roughly 8% to 18% from lower JS contention and fewer long blocking tasks on the main runtime while background services handle data and event work off-thread.functionCallCount: expected reduction of roughly 30% to 50% because service bootstrap, atom ownership, provider handling, and event fan-out move into the background runtime.Risk Assessment
backgroundProtocolVersionmismatch handling.Test plan
Background runtime
8081main Metro and8082background Metro both run; background runtime starts; pre-ready requests flush afterruntime-ready.SplashProvider,HardwareServiceProvider, andBootstrapdo not fail beforeruntime-readyand recover correctly after the queue flush.serviceBootstrap.init(),serviceSetting.getInstanceId(),serviceNetwork.getAllNetworks(),serviceAccount.getWallets(),serviceAppUpdate.refreshUpdateStatus().accountChanged,chainChanged, reconnect, and bridge focus changes.Three-bundle split + OTA
common.jsbundle,main.jsbundle.hbc, andbackground.bundleand can boot without Metro.background.bundle,backgroundProtocolVersionmismatch, andbundleFormat/commonEntryhandling.yarn check:bundle-architecturepasses locally and in daily CI.yarn check:startup-graph-budgetpasses with no forbidden main-runtime modules.Cold-start SSR cache
Jotai MMKV migration
migration-completeunset; next boot retries.resetAppclears MMKV per-key storage.Release tooling + platform parity
development/scripts/android-release-build-deploy.shandios-release-build-deploy.shproduce signed release artifacts with correct bundle layout.yarn build-bundle --platform iosand--platform androidproduce the expected three-bundle layout and timing summary.minimum-release-agecheck andupgrade-modules.js.