Commit Graph

1626 Commits

Author SHA1 Message Date
Matthias Krüger
3ae7ab698f
Rollup merge of #115291 - cjgillot:dest-prop-save, r=JakobDegen
Save liveness results for DestinationPropagation

`DestinationPropagation` needs to verify that merge candidates do not conflict with each other. This is done by verifying that a local is not live when its counterpart is written to.

To get the liveness information, the pass runs `MaybeLiveLocals` dataflow analysis repeatedly, once for each propagation round. This is quite costly, and the main driver for the perf impact on `ucd` and `diesel`. (See https://github.com/rust-lang/rust/pull/115105#issuecomment-1689205908)

In order to mitigate this cost, this PR proposes to save the result of the analysis into a `SparseIntervalMatrix`, and mirror merges of locals into that matrix: `liveness(destination) := liveness(destination) union liveness(source)`.

<details>
<summary>Proof</summary>

We denote by `'` all the quantities of the transformed program. Let $\varphi$ be a mapping of locals, which maps `source` to `destination`, and is identity otherwise. The exact liveness set after a statement is $out'(statement)$, and the proposed liveness set is $\varphi(out(statement))$.

Consider a statement. Suppose that the output state verifies $out' \subset phi(out)$. We want to prove that $in' \subset \varphi(in)$ where $in = (out - kill) \cup gen$, and conclude by induction.

We have 2 cases: either that statement is kept with locals renumbered by $\varphi$, or it is a tautological assignment and it removed.

1. If the statement is kept: the gen-set and the kill-set of $statement' = \varphi(statement)$ are $gen' = \varphi(gen)$ and $kill' = \varphi(kill)$ exactly.
From soundness requirement 3, $\varphi(in)$ is disjoint from $\varphi(kill)$.
This implies that $\varphi(out - kill)$ is disjoint from $\varphi(kill)$, and so $\varphi(out - kill) = \varphi(out) - \varphi(kill)$. Then $\varphi(in) = (\varphi(out) - \varphi(kill)) \cup \varphi(gen) = (\varphi(out) - kill') \cup gen'$.
We can conclude that $out' \subset \varphi(out) \implies in' \subset \varphi(in)$.

2. If the statement is removed. As $\varphi(statement)$ is a tautological assignment, we know that $\varphi(gen) = \varphi(kill) = \\{ destination \\}$, while $gen' = kill' = \emptyset$. So $\varphi(in) = \varphi(out) \cup \\{ destination \\}$. Then $in' = out' \subset out \subset \varphi(in)$.

By recursion, we can conclude by that $in' \subset \varphi(in)$ everywhere.
</details>

This approximate liveness results is only suboptimal if there are locals that fully disappear from the CFG due to an assignment cycle. These cases are quite unlikely, so we do not bother with them.

This change allows to reduce the perf impact of DestinationPropagation by half on diesel and ucd (https://github.com/rust-lang/rust/pull/115105#issuecomment-1694701904).

cc ````@JakobDegen````
2024-01-17 20:21:19 +01:00
bors
52790a98e5 Auto merge of #119670 - cjgillot:gvn-arithmetic, r=oli-obk
Fold arithmetic identities in GVN

Extracted from https://github.com/rust-lang/rust/pull/111344

This PR implements a few arithmetic folds for unary and binary operations.
This should take care of the missed optimizations introduced by https://github.com/rust-lang/rust/pull/116012.
2024-01-17 11:46:49 +00:00
bors
16f4b02dd8 Auto merge of #119922 - nnethercote:fix-Diag-code-is_lint, r=oli-obk
Rework how diagnostic lints are stored.

`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.

Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.

It's all a bit messy and confused and seems unintentional.

This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
  errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
  new type containing a lint name and a `has_future_breakage` bool, so
  all lints can have those, error code or not.

r? `@oli-obk`
2024-01-17 07:33:52 +00:00
Camille GILLOT
0cc2102c4a Expand match over binops. 2024-01-16 22:34:04 +00:00
Camille GILLOT
166fe54eba Explain side-effects from simplify_operand. 2024-01-16 22:34:04 +00:00
Camille GILLOT
22ed51e136 Do not read a scalar on a non-scalar layout. 2024-01-16 22:32:48 +00:00
Camille GILLOT
3c48243b6f Simplify Len. 2024-01-16 22:20:54 +00:00
Camille GILLOT
5fc23ad8e6 Simplify unary operations. 2024-01-16 22:20:54 +00:00
Camille GILLOT
666030c51b Simplify binary ops. 2024-01-16 22:20:53 +00:00
bors
92f2e0aa62 Auto merge of #116520 - Enselic:large-copy-into-fn, r=oli-obk
large_assignments: Lint on specific large args passed to functions

Requires lowering function call arg spans down to MIR, which is done in the second commit.

Part of #83518

Also see
* https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/arg.20Spans.20for.20TerminatorKind.3A.3ACall.3F
* https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/move_size_limit.20lint

r? `@oli-obk` (E-mentor)
2024-01-16 19:33:14 +00:00
bors
fa0dc208d0 Auto merge of #119672 - cjgillot:dse-sandwich, r=oli-obk
Sandwich MIR optimizations between DSE.

This PR reorders MIR optimization passes in an attempt to increase their efficiency.

- Stop running CopyProp before GVN, it's useless as GVN will do the same thing anyway. Instead, we perform CopyProp at the end of the pipeline, to ensure we do not emit copy/move chains.
- Run DSE before GVN, as it increases the probability to have single-assignment locals.
- Run DSE after the final CopyProp to turn copies into moves.

r? `@ghost`
2024-01-16 11:34:16 +00:00
bors
f9c2421a2a Auto merge of #119439 - cjgillot:gvn-faster, r=oli-obk
Avoid some redundant work in GVN

The first 2 commits are about reducing the perf effect.

Third commit avoids doing redundant work: is a local is SSA, it already has been simplified, and the resulting value is in `self.locals`. No need to call any code on it.

The last commit avoids removing some storage statements.

r? wg-mir-opt
2024-01-16 05:17:49 +00:00
Martin Nordholts
16ba56c242 compiler: Lower fn call arg spans down to MIR
To enable improved accuracy of diagnostics in upcoming commits.
2024-01-15 19:07:11 +01:00
Nicholas Nethercote
d71f535a6f Rework how diagnostic lints are stored.
`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.

Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.

It's all a bit messy and confused and seems unintentional.

This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
  errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
  new type containing a lint name and a `has_future_breakage` bool, so
  all lints can have those, error code or not.
2024-01-14 14:04:25 +11:00
Zalathar
5eae9452b6 coverage: Simplify computing successors in the BCB graph 2024-01-14 12:11:26 +11:00
Zalathar
867950f8c6 coverage: Move helper add_basic_coverage_block into a local closure
This also switches from `split_off(0)` to `std::mem::take` when emptying the
accumulated list of blocks, because `split_off(0)` handles capacity in a way
that is unintuitive when used in a loop.
2024-01-14 12:11:25 +11:00
Zalathar
229d0983b5 coverage: Simplify the loop that combines blocks into BCBs
The old loop had two separate places where it would flush the acumulated list
of straight-line blocks into a new BCB. One occurred at the start of the loop
body when the current block couldn't be chained into, and the other occurred at
the end of the loop body when the current block couldn't be chained from.

The latter check can be hoisted to the start of the loop body by making it
examine the previous block (which has added itself to the list) instead of the
current block. With that done, we can combine the two separate flushes into one
flush with two possible trigger conditions.
2024-01-14 12:11:25 +11:00
Zalathar
c412cd4bc2 coverage: Indicate whether a block's successors allow BCB chaining 2024-01-14 12:11:25 +11:00
Zalathar
6d1c396399 coverage: Determine a block's successors from just the terminator
Filtering out unreachable successors is only needed by the main graph traversal
loop, so we can move the filtering step into that loop instead, eliminating the
need to pass the MIR body into `bcb_filtered_successors`.
2024-01-14 12:11:25 +11:00
Matthias Krüger
8294356a5d
Rollup merge of #119842 - Zalathar:kind, r=oli-obk
coverage: Add enums to accommodate other kinds of coverage mappings

Extracted from  #118305.

LLVM supports several different kinds of coverage mapping regions, but currently we only ever emit ordinary “code” regions.  This PR performs the plumbing required to add other kinds of regions as enum variants, but does not add any specific variants other than `Code`.

The main motivation for this change is branch coverage, but it will also allow separate experimentation with gap regions and skipped regions, which might help in producing more accurate and useful coverage reports.

---

``@rustbot`` label +A-code-coverage
2024-01-11 19:42:51 +01:00
Camille GILLOT
bc35ee41fa Do not run simplify_locals inside DSE.
The full pass is run short after.
2024-01-11 09:58:22 +00:00
Camille GILLOT
0aedd6e86f Sandwich MIR optimizations between DSE. 2024-01-11 09:58:19 +00:00
Zalathar
124fff0777 coverage: Add enums to accommodate other kinds of coverage mappings 2024-01-11 16:43:12 +11:00
Zalathar
c5932182ad coverage: Store extracted spans as a flat list of mappings
This is less elegant in some ways, since we no longer visit a BCB's spans as a
batch, but will make it much easier to add support for other kinds of coverage
mapping regions (e.g. branch regions or gap regions).
2024-01-11 16:43:01 +11:00
Zalathar
8f98b54a7e coverage: Extract helper function term_for_bcb 2024-01-11 16:07:38 +11:00
bors
3a6bf351a3 Auto merge of #119677 - cjgillot:early-cfg-opt, r=oli-obk
Reorder early post-inlining passes.

`RemoveZsts`, `RemoveUnneededDrops` and `UninhabitedEnumBranching` only depend on types, so they should be executed together early after MIR inlining introduces those types.

This does not change the end-result, but this makes the pipeline a bit more consistent.
2024-01-11 04:09:07 +00:00
Guillaume Gomez
9b905417f5
Rollup merge of #119699 - cjgillot:simplify-unreachable, r=oli-obk
Merge dead bb pruning and unreachable bb deduplication.

Both routines share the same basic structure: iterate on all bbs to identify work, and then renumber bbs.

We can do both at once.
2024-01-09 13:23:18 +01:00
Guillaume Gomez
72fdaf52e0
Rollup merge of #119668 - cjgillot:transform-promote, r=oli-obk
Simplify implementation of MIR promotion

Non-functional changes.
Best read ignoring whitespace.
2024-01-09 13:23:17 +01:00
Matthias Krüger
70e3f8d240
Rollup merge of #119033 - Zalathar:unicode, r=davidtwco
coverage: `llvm-cov` expects column numbers to be bytes, not code points

Normally the compiler emits column numbers as a 1-based number of Unicode code points.

But when we embed coverage mappings for `-Cinstrument-coverage`, those mappings will ultimately be read by the `llvm-cov` tool. That tool assumes that column numbers are 1-based numbers of *bytes*, and relies on that assumption when slicing up source code to apply highlighting (in HTML reports, and in text-based reports with colour).

For the very common case of all-ASCII source code, bytes and code points are the same, so the difference isn't noticeable. But for code that contains non-ASCII characters, emitting column numbers as code points will result in `llvm-cov` slicing strings in the wrong places, producing mangled output or fatal errors.

(See https://github.com/taiki-e/cargo-llvm-cov/issues/275 as an example of what can go wrong.)
2024-01-09 00:19:33 +01:00
Camille GILLOT
5d6463c26c Make match exhaustive. 2024-01-08 22:42:07 +00:00
Camille GILLOT
cae0dc2833 Simplify code flow. 2024-01-08 22:42:07 +00:00
Camille GILLOT
8356802862 Move promote_consts back to rustc_mir_transform. 2024-01-08 22:42:07 +00:00
Zalathar
6971e9332d coverage: llvm-cov expects column numbers to be bytes, not code points 2024-01-08 21:58:46 +11:00
Zalathar
88f5759ace coverage: Allow make_code_region to fail 2024-01-08 21:43:22 +11:00
Nicholas Nethercote
2d91c6d1bf Remove DiagnosticBuilderState.
Currently it's used for two dynamic checks:
- When a diagnostic is emitted, has it been emitted before?
- When a diagnostic is dropped, has it been emitted/cancelled?

The first check is no longer need, because `emit` is consuming, so it's
impossible to emit a `DiagnosticBuilder` twice. The second check is
still needed.

This commit replaces `DiagnosticBuilderState` with a simpler
`Option<Box<Diagnostic>>`, which is enough for the second check:
functions like `emit` and `cancel` can take the `Diagnostic` and then
`drop` can check that the `Diagnostic` was taken.

The `DiagCtxt` reference from `DiagnosticBuilderState` is now stored as
its own field, removing the need for the `dcx` method.

As well as making the code shorter and simpler, the commit removes:
- One (deprecated) `ErrorGuaranteed::unchecked_claim_error_was_emitted`
  call.
- Two `FIXME(eddyb)` comments that are no longer relevant.
- The use of a dummy `Diagnostic` in `into_diagnostic`.

Nice!
2024-01-08 16:18:54 +11:00
Camille GILLOT
da235ce92d Do not recompute liveness for DestinationPropagation. 2024-01-07 20:07:35 +00:00
bors
75c68cfd2b Auto merge of #119675 - cjgillot:set-no-discriminant, r=tmiasko
Skip threading over no-op SetDiscriminant.

Fixes https://github.com/rust-lang/rust/issues/119674
2024-01-07 15:34:05 +00:00
Camille GILLOT
4071572cb4 Merge dead bb pruning and unreachable bb deduplication. 2024-01-07 15:12:10 +00:00
Camille GILLOT
7e39100586 Avoid recording no-op replacements. 2024-01-07 13:54:05 +00:00
Camille GILLOT
4ee01faaf0 Do not re-simplify SSA locals. 2024-01-07 13:54:05 +00:00
Camille GILLOT
e26c9a42c6 Cache feature unsized locals + use smallvec. 2024-01-07 13:54:05 +00:00
Camille GILLOT
05c4eef95e Make rev_locals a vec. 2024-01-07 13:54:05 +00:00
Camille GILLOT
a8c4d43cb1 Reorder early post-inlining passes. 2024-01-07 01:42:57 +00:00
Camille GILLOT
41eb9a49af Skip threading over no-op SetDiscriminant. 2024-01-07 00:28:20 +00:00
Martin Nordholts
6d8fb57d1a rustc_mir_transform: Enforce rustc::potential_query_instability lint
Stop allowing `rustc::potential_query_instability` on all of
rustc_mir_transform and instead allow it on a case-by-case basis if it
is safe to do so. In this particular crate, all instances were safe to
allow.
2024-01-06 19:09:04 +01:00
Matthias Krüger
909f2b63a3
Rollup merge of #119591 - Enselic:DestinationPropagation-stable, r=cjgillot
rustc_mir_transform: Make DestinationPropagation stable for queries

By using `FxIndexMap` instead of `FxHashMap`, so that the order of visiting of locals is deterministic.

We also need to bless
`copy_propagation_arg.foo.DestinationPropagation.panic*.diff`. Do not review the diff of the diff. Instead look at the diff files before and after this commit. Both before and after this commit, 3 statements are replaced with nop. It's just that due to change in ordering, different statements are replaced. But the net result is the same. In other words, compare this diff (before fix):
* 090d5eac72/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff

With this diff (after fix):
* f603babd63/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff

and you can see that both before and after the fix, we replace 3 statements with `nop`s.

I find it _slightly_ surprising that the test this PR affects did not previously fail spuriously due to the indeterminism of `FxHashMap`, but I guess in can be explained with the predictability of small `FxHashMap`s with `usize` (`Local`) keys, or something along those lines.

This should fix [this](https://github.com/rust-lang/rust/pull/119252#discussion_r1436101791) comment, but I wanted to make a separate PR for this fix for a simpler development and review process.

Part of https://github.com/rust-lang/rust/issues/84447 which is E-help-wanted.

r? `@cjgillot` who is reviewer for the highly related PR https://github.com/rust-lang/rust/pull/119252.
2024-01-06 16:07:47 +01:00
bors
efb3f11087 Auto merge of #119499 - cjgillot:dtm-opt, r=nnethercote
Two small bitset optimisations
2024-01-06 11:54:15 +00:00
bors
aa7e9f21e9 Auto merge of #119648 - compiler-errors:rollup-42inxd8, r=compiler-errors
Rollup of 9 pull requests

Successful merges:

 - #119208 (coverage: Hoist some complex code out of the main span refinement loop)
 - #119216 (Use diagnostic namespace in stdlib)
 - #119414 (bootstrap: Move -Clto= setting from Rustc::run to rustc_cargo)
 - #119420 (Handle ForeignItem as TAIT scope.)
 - #119468 (rustdoc-search: tighter encoding for f index)
 - #119628 (remove duplicate test)
 - #119638 (fix cyle error when suggesting to use associated function instead of constructor)
 - #119640 (library: Fix warnings in rtstartup)
 - #119642 (library: Fix a symlink test failing on Windows)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-01-06 06:00:27 +00:00
Michael Goulet
bacddd3e5d
Rollup merge of #119208 - Zalathar:hoist, r=WaffleLapkin,Swatinem
coverage: Hoist some complex code out of the main span refinement loop

The span refinement loop in `spans.rs` takes the spans that have been extracted from MIR, and modifies them to produce more helpful output in coverage reports.

It is also one of the most complicated pieces of code in the coverage instrumentor. It has an abundance of moving pieces that make it difficult to understand, and most attempts to modify it end up accidentally changing its behaviour in unacceptable ways.

This PR nevertheless tries to make a dent in it by hoisting two pieces of special-case logic out of the main loop, and into separate preprocessing passes. Coverage tests show that the resulting mappings are *almost* identical, with all known differences being unimportant.

This should hopefully unlock further simplifications to the refinement loop, since it now has fewer edge cases to worry about.
2024-01-05 23:41:41 -05:00
bors
d62f05b842 Auto merge of #119459 - cjgillot:inline-mir-utils, r=compiler-errors
Inline a few utility functions around MIR

Most of them are small enough to benefit from inlining.
2024-01-06 04:01:09 +00:00