Handle race when coloring nodes concurrently as both green and red #151509
rust-bors[bot] merged 1 commit into rust-lang:main
@bors try @rust-timer queue |
Handle race when coloring nodes concurrently as both green and red
|
Looks fairly neutral locally.
|
Here's what's bugging me. I'm under the impression that this PR allows a node to change colour from green to red. It should not. A node has a single colour and should never change. So my recommendation would be:
|
Wouldn't #150156 already address these concerns? |
|
Finished benchmarking commit (f9b6979): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary -1.9%, secondary -5.7%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
This benchmark run did not return any relevant results for this metric.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 473.794s -> 471.369s (-0.51%) |
No, that's the current state of rustc. This PR prevents that. |
|
I honestly feel like @Zoxc often deals with his own meaning of compiler's code and not the code's actual behavior. It comes up when I try to convince him about anything. |
|
Would demonstrating what the code does help resolve this discussion? |
|
As discussed in Solving big #141540 issue and hunt for bad metadata files, I think the trait selection cache can leave nodes uncolored after execution. This means that we're not able to tell if dependencies are all green in all cases and #150156 would be insufficient to fix the race. |
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
r? @Zalathar |
|
@bors r=zetanumbers,petrochenkov |
Handle race when coloring nodes concurrently as both green and red

This fixes a race where a duplicate dep node gets written to the dep graph if a node was marked as green and promoted during execution, then marked as red after execution.

This can occur when a `no_hash` query A depends on a query B which cannot be forced so it was not colored when starting execution of query A. During the execution of query A it will execute query B and color it green. Before A finishes another thread tries to mark A green, this time succeeding as B is now green, and A gets promoted and written to metadata. Execution of A then finishes and because it's `no_hash` we assume the result changed and thus we color the node again, now as red and write it to metadata again. This doesn't happen with non-`no_hash` queries as they will be green if all their dependencies are green.

This changes the code coloring nodes red to also use `compare_exchange` to deal with this race ensuring that the coloring of nodes only happens once.

Fixes #150018
Fixes #142778
Fixes #141540
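The idea of coloring a node at most once with `compare_exchange` can be sketched in isolation. This is a minimal illustration, not rustc's actual code: the `NodeSlot` type, the `UNCOLORED`/`RED`/`GREEN` constants, the `try_color` method, and the `encoded` counter (standing in for "written to the dep graph") are all hypothetical names chosen for this example.

```rust
use std::sync::atomic::{AtomicU8, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical color encoding; rustc's real representation differs.
const UNCOLORED: u8 = 0;
const RED: u8 = 1;
const GREEN: u8 = 2;

/// One dep node's color slot, plus a counter that stands in for
/// "this node was written to the dep graph".
struct NodeSlot {
    color: AtomicU8,
    encoded: AtomicUsize,
}

impl NodeSlot {
    fn new() -> Self {
        NodeSlot {
            color: AtomicU8::new(UNCOLORED),
            encoded: AtomicUsize::new(0),
        }
    }

    /// Tries to color the node. Only the caller that moves the slot from
    /// UNCOLORED to `color` encodes the node; every later caller just
    /// observes the existing color. Returns the color the node ends up with.
    fn try_color(&self, color: u8) -> u8 {
        match self
            .color
            .compare_exchange(UNCOLORED, color, Ordering::AcqRel, Ordering::Acquire)
        {
            Ok(_) => {
                // This caller won the race, so it is the only one that encodes.
                self.encoded.fetch_add(1, Ordering::Relaxed);
                color
            }
            // Another caller colored the node first; do not encode again.
            Err(existing) => existing,
        }
    }
}

/// Races one "mark green" thread against one "mark red" thread and
/// returns the final color and how many times the node was encoded.
fn race_once() -> (u8, usize) {
    let slot = Arc::new(NodeSlot::new());
    let handles: Vec<_> = vec![GREEN, RED]
        .into_iter()
        .map(|c| {
            let slot = Arc::clone(&slot);
            thread::spawn(move || slot.try_color(c))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    (
        slot.color.load(Ordering::Acquire),
        slot.encoded.load(Ordering::Relaxed),
    )
}

fn main() {
    for _ in 0..100 {
        let (color, encoded) = race_once();
        // Either thread may win, but the node is encoded exactly once.
        assert!(color == GREEN || color == RED);
        assert_eq!(encoded, 1);
    }
    println!("each node was encoded exactly once");
}
```

Which color wins is left to the scheduler here; the point of the sketch is only that the loser of the `compare_exchange` sees the existing color and skips the second write.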
|
💔 Test for 0c09117 failed: CI. Failed job:
|
|
@bors retry |
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing d00ba92 (parent) -> 9e79395 (this PR)
Test differences
No test diffs found
Test dashboard
Run cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 9e79395f92bff6a8f536430e42a4beae69f60ff8 --output-dir test-dashboard
And then open
Job duration changes
How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (9e79395): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
@rustbot label: -perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (secondary -1.8%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (secondary 1.7%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 474.836s -> 476.965s (0.45%) |
Rust finally merged a fix to the "trying to encode a dep node twice" ICE we seem to most frequently run into, triggered by the rustc parallel frontend feature: rust-lang/rust#151509 - @mischnic commented on the issue: rust-lang/rust#150018 (comment) - @mischnic just updated our toolchain last week, but I'd like to get this fix in. CI Job for build-and-release that covers all platforms: https://github.com/vercel/next.js/actions/runs/22156881743 In the process of this upgrade, I found and reported rust-lang/rust#152735 upstream (fixed `nightly-2026-02-18`).