Defer descendant propagation to batch pass at end of resolve() #743

Draft
st0012 wants to merge 3 commits into main from defer-descendant-propagation

@st0012 st0012 commented Apr 15, 2026

Summary

  • Move all descendant propagation out of the recursive linearize_ancestors calls and into a single batch pass at the end of resolve()
  • Descendants are not read during the resolution loop. They are read only during invalidation (graph.rs:1118, 1181) and MCP queries (get_descendants), both of which happen after resolve() returns, so deferral is safe
  • Removes the propagate_descendants() method, the descendants field from LinearizationContext, and the per-declaration descendant insertion loop

Before / After

Measured on Shopify core (127,216 files, 1,470,515 declarations, 1,634,734 definitions).

| Metric | Before | After | Change |
|---|---|---|---|
| Resolution | 42.8s | 33.9s | -20.8% |
| Total pipeline | 52.3s | 43.0s | -17.8% |
| Declarations | 1,470,515 | 1,470,515 | identical |
| Definitions | 1,634,734 | 1,634,734 | identical |
| Orphan rate | 0.3% | 0.3% | identical |

Why this works

During recursive linearization, every cache hit triggered propagate_descendants() — an O(ancestors × descendants) nested loop that inserted the current declaration (and its transitive descendants) into every ancestor's descendant set. With 155k classes mostly inheriting from Object, this meant ~155k calls each iterating Object's full ancestor chain.
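To make the cost concrete, here is a rough model of the eager approach. It is not the removed method itself, whose body is not shown here: `DeclId`, `propagate_eagerly`, and the use of std `HashMap`/`HashSet` are all assumptions for illustration. It only shows why the work per cache hit scales with ancestors × descendants.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the project's declaration ID type.
type DeclId = u32;

/// Illustrative model of eager propagation (not the project's real code):
/// on a cache hit, push every accumulated descendant into every ancestor's
/// descendant set. Returns the number of insert attempts, which is always
/// `ancestors.len() * descendants.len()`, even when nothing new is added.
fn propagate_eagerly(
    sets: &mut HashMap<DeclId, HashSet<DeclId>>,
    ancestors: &[DeclId],
    descendants: &[DeclId],
) -> usize {
    let mut attempts = 0;
    for &ancestor in ancestors {
        let set = sets.entry(ancestor).or_default();
        for &descendant in descendants {
            set.insert(descendant);
            attempts += 1;
        }
    }
    attempts
}
```

In this model, a repeated call with the same arguments performs the same number of insert attempts while changing nothing, which is the kind of redundant work a hot path accumulates.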

Since descendants are not read during the resolution loop — only during incremental invalidation and MCP queries, both of which happen after resolve() returns — we can safely collect flat (descendant_id, ancestor_id) pairs during linearization and batch-insert them after the resolution loop completes. This eliminates the hot nested loop entirely.
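The record-then-batch idea can be sketched as follows. This is a minimal illustration under assumptions, not the code in this PR: `DeclId`, `record_pairs`, and `apply_pairs` are hypothetical names, std collections stand in for the project's own types, and the ancestor slice is assumed to exclude the declaration itself.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the project's declaration ID type.
type DeclId = u32;

/// During linearization: append one flat (descendant_id, ancestor_id) pair
/// per ancestor in an already-built chain. No descendant set is touched here.
fn record_pairs(pending: &mut Vec<(DeclId, DeclId)>, descendant: DeclId, ancestors: &[DeclId]) {
    pending.extend(ancestors.iter().map(|&ancestor| (descendant, ancestor)));
}

/// After the resolution loop: fold every recorded pair into the
/// per-ancestor descendant sets in a single pass.
fn apply_pairs(pending: Vec<(DeclId, DeclId)>) -> HashMap<DeclId, HashSet<DeclId>> {
    let mut descendants: HashMap<DeclId, HashSet<DeclId>> = HashMap::new();
    for (descendant, ancestor) in pending {
        descendants.entry(ancestor).or_default().insert(descendant);
    }
    descendants
}
```

Each pair is inserted exactly once in the batch pass, so the total work is proportional to the number of recorded pairs rather than being repeated on every cache hit.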

During linearize_ancestors, descendant tracking was the #1 performance
bottleneck: propagate_descendants ran O(ancestors × descendants) on
every cache hit, and per-declaration descendant tracking added
O(accumulated_descendants) work at each recursion level.

Since descendants are write-only during resolution (read only during
invalidation at graph.rs:1118, 1181), we can safely defer all
descendant propagation to a single batch pass after the resolution
loop completes. This eliminates:

1. The propagate_descendants() call on every cache hit
2. The per-declaration descendant insertion loop during recursion
3. The descendants field from LinearizationContext entirely

Instead, after building each ancestor chain we record flat
(descendant_id, ancestor_id) pairs, then batch-process all
relationships at the end of resolve().

Resolution on Shopify core: ~42s → ~34s (20% faster).
@st0012 st0012 requested a review from a team as a code owner April 15, 2026 20:05
@st0012 st0012 marked this pull request as draft April 15, 2026 20:08
st0012 added 2 commits April 15, 2026 21:09
- Add backticks around `descendant_id`, `ancestor_id`, `resolve()` in
  doc comment to satisfy clippy doc lint
- Sort descendant assertions in test_descendants since IdentityHashSet
  iteration order is non-deterministic
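The sorting fix in the second item follows a common pattern, sketched below. This is not the actual test: std `HashSet` stands in for the project's `IdentityHashSet`, and `sorted_ids` is a hypothetical helper.

```rust
use std::collections::HashSet;

/// Collect a set into a sorted Vec so that assertions do not depend on the
/// set's iteration order. `HashSet<u32>` is a stand-in for the project's set type.
fn sorted_ids(set: &HashSet<u32>) -> Vec<u32> {
    let mut ids: Vec<u32> = set.iter().copied().collect();
    ids.sort_unstable();
    ids
}
```

A test can then compare `sorted_ids(&set)` against a literal sorted vector instead of asserting on the raw iteration order.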