Commit Graph

62 Commits

Author SHA1 Message Date
bors
e4a6032706 Auto merge of #85903 - bjorn3:rustc_serialize_cleanup, r=varkor
Remove unused functions and arguments from rustc_serialize
2021-06-07 14:40:26 +00:00
bjorn3
8176ab8bc1 Revert "Merge CrateDisambiguator into StableCrateId"
This reverts commit d0ec85d3fb.
2021-06-07 10:37:45 +02:00
bjorn3
a2c4affe86 Remove unused functions and arguments from rustc_serialize 2021-06-01 19:29:11 +02:00
bjorn3
312f964478 Remove unused feature gates 2021-05-31 13:55:43 +02:00
bjorn3
9de82d7611 Use allow_internal_unstable more in rustc_index 2021-05-31 12:13:47 +02:00
bjorn3
d0ec85d3fb Merge CrateDisambiguator into StableCrateId 2021-05-30 12:51:34 +02:00
Jacob Pratt
bc2f0fb5a9
Specialize implementations
Implementations in stdlib are now optimized as they were before.
2021-05-26 18:07:09 -04:00
bors
e1ff91f439 Auto merge of #83813 - cbeuw:remap-std, r=michaelwoerister
Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths

This PR fixes #73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

`RealFileName::Named` introduced in #72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.

`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.

When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by #44940) and `name_was_remapped` (introduced by #41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.

cc `@eddyb` who implemented `/rustc/...` path devirtualisation
2021-05-12 11:05:56 +00:00
Aaron Hill
f916b0474a
Implement span quoting for proc-macros
This PR implements span quoting, allowing proc-macros to produce spans
pointing *into their own crate*. This is used by the unstable
`proc_macro::quote!` macro, allowing us to get error messages like this:

```
error[E0412]: cannot find type `MissingType` in this scope
  --> $DIR/auxiliary/span-from-proc-macro.rs:37:20
   |
LL | pub fn error_from_attribute(_args: TokenStream, _input: TokenStream) -> TokenStream {
   | ----------------------------------------------------------------------------------- in this expansion of procedural macro `#[error_from_attribute]`
...
LL |             field: MissingType
   |                    ^^^^^^^^^^^ not found in this scope
   |
  ::: $DIR/span-from-proc-macro.rs:8:1
   |
LL | #[error_from_attribute]
   | ----------------------- in this macro invocation
```

Here, `MissingType` occurs inside the implementation of the proc-macro
`#[error_from_attribute]`. Previosuly, this would always result in a
span pointing at `#[error_from_attribute]`

This will make many proc-macro-related error message much more useful -
when a proc-macro generates code containing an error, users will get an
error message pointing directly at that code (within the macro
definition), instead of always getting a span pointing at the macro
invocation site.

This is implemented as follows:
* When a proc-macro crate is being *compiled*, it causes the `quote!`
  macro to get run. This saves all of the sapns in the input to `quote!`
  into the metadata of *the proc-macro-crate* (which we are currently
  compiling). The `quote!` macro then expands to a call to
  `proc_macro::Span::recover_proc_macro_span(id)`, where `id` is an
opaque identifier for the span in the crate metadata.
* When the same proc-macro crate is *run* (e.g. it is loaded from disk
  and invoked by some consumer crate), the call to
`proc_macro::Span::recover_proc_macro_span` causes us to load the span
from the proc-macro crate's metadata. The proc-macro then produces a
`TokenStream` containing a `Span` pointing into the proc-macro crate
itself.

The recursive nature of 'quote!' can be difficult to understand at
first. The file `src/test/ui/proc-macro/quote-debug.stdout` shows
the output of the `quote!` macro, which should make this eaier to
understand.

This PR also supports custom quoting spans in custom quote macros (e.g.
the `quote` crate). All span quoting goes through the
`proc_macro::quote_span` method, which can be called by a custom quote
macro to perform span quoting. An example of this usage is provided in
`src/test/ui/proc-macro/auxiliary/custom-quote.rs`

Custom quoting currently has a few limitations:

In order to quote a span, we need to generate a call to
`proc_macro::Span::recover_proc_macro_span`. However, proc-macros
support renaming the `proc_macro` crate, so we can't simply hardcode
this path. Previously, the `quote_span` method used the path
`crate::Span` - however, this only works when it is called by the
builtin `quote!` macro in the same crate. To support being called from
arbitrary crates, we need access to the name of the `proc_macro` crate
to generate a path. This PR adds an additional argument to `quote_span`
to specify the name of the `proc_macro` crate. Howver, this feels kind
of hacky, and we may want to change this before stabilizing anything
quote-related.

Additionally, using `quote_span` currently requires enabling the
`proc_macro_internals` feature. The builtin `quote!` macro
has an `#[allow_internal_unstable]` attribute, but this won't work for
custom quote implementations. This will likely require some additional
tricks to apply `allow_internal_unstable` to the span of
`proc_macro::Span::recover_proc_macro_span`.
2021-05-12 00:51:31 -04:00
Andy Wang
37dbe868c9
Split span_to_string into span_to_diagnostic/embeddable_string 2021-05-11 00:04:12 +01:00
Andy Wang
5417b45c26
Use local and remapped paths where appropriate 2021-05-05 15:31:28 +01:00
Andy Wang
fb4f6439f6
Revamp RealFileName public methods 2021-05-05 15:31:03 +01:00
Andy Wang
f8e55da6de
Remove impl Display for FileName and add FileNameDisplay wrapper type 2021-05-05 15:31:02 +01:00
Andy Wang
9e0426d784
Make local_path in RealFileName::Remapped Option to be removed in exported metadata 2021-05-05 15:10:57 +01:00
Andy Wang
6720a37042
Rename RealFileName::Named to LocalPath and Devirtualized to Remapped 2021-05-05 15:10:50 +01:00
Mark Rousskov
5065144d6d Use new thread-local const-init
Let's see if this gives us any speedup - some of the TLS state modified in this
commit *is* pretty heavily accessed, so we can hope!
2021-05-02 14:06:07 -04:00
Ralf Jung
bd9556956a fix feature use in rustc libs 2021-04-18 22:05:45 +02:00
Joshua Nelson
441dc3640a Remove (lots of) dead code
Found with https://github.com/est31/warnalyzer.

Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
  x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
  wants to use it in the future?
- Don't change rustc_serialize

  I plan to scrap most of the json module in the near future (see
  https://github.com/rust-lang/compiler-team/issues/418) and fixing the
  tests needed more work than I expected.

TODO: check if any of the comments on the deleted code should be kept.
2021-03-27 22:16:33 -04:00
Mara Bos
cfb4ad4f2a Remove unwrap_none/expect_none from compiler/. 2021-03-18 14:25:54 +01:00
Camille GILLOT
34e92bbf65 Hash SyntaxContext first. 2021-03-11 12:31:31 +01:00
Camille GILLOT
fe2d728e62 Remove useless method. 2021-03-11 12:24:58 +01:00
bors
76c500ec6c Auto merge of #81635 - michaelwoerister:structured_def_path_hash, r=pnkfelix
Let a portion of DefPathHash uniquely identify the DefPath's crate.

This allows to directly map from a `DefPathHash` to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with `DefPathHash` instead of `DefId` a lot.

It also allows to reliably and cheaply check for `DefPathHash` collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.

The following new piece of documentation describes the most interesting aspects of the changes:

```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said then done if the crates in questions are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```

Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?

r? `@pnkfelix` (feel free to re-assign)
2021-03-07 23:45:57 +00:00
Guillaume Gomez
0db8349fff
Rollup merge of #81940 - jhpratt:stabilize-str_split_once, r=m-ou-se
Stabilize str_split_once

Closes #74773
2021-02-26 15:52:29 +01:00
Vadim Petrochenkov
0038eaee6b rustc_span: Remove obsolete allow_internal_unstable_backcompat_hack 2021-02-14 19:42:55 +03:00
Jacob Pratt
c28f2a8bee
Stabilize str_split_once 2021-02-09 23:17:11 -05:00
klensy
60cca83975 faster spans 2021-02-04 04:54:23 +03:00
Michael Woerister
22d489be76 Let a portion of DefPathHash uniquely identify the DefPath's crate.
This allows to directly map from a DefPathHash to the crate it
originates from, without constructing side tables to do that mapping.

It also allows to reliably and cheaply check for DefPathHash collisions.
2021-02-02 17:40:29 +01:00
Aaron Hill
3540f9396a
Add disambiugator to ExpnData
Due to macro expansion, its possible to end up with two distinct
`ExpnId`s that have the same `ExpnData` contents. This violates the
contract of `HashStable`, since two unequal `ExpnId`s will end up with
equal `Fingerprint`s.

This commit adds a `disambiguator` field to `ExpnData`, which is used to
force two otherwise-equivalent `ExpnData`s to be distinct.
2021-01-23 15:41:17 -05:00
Aaron Hill
482a67d20f
Properly handle SyntaxContext of dummy spans in incr comp
Fixes #80336

Due to macro expansion, we may end up with spans with an invalid
location and non-root `SyntaxContext`. This commits preserves the
`SyntaxContext` of such spans in the incremental cache, and ensures
that we always hash the `SyntaxContext` when computing the `Fingerprint`
of a `Span`

Previously, we would discard the `SyntaxContext` during serialization to
the incremental cache, causing the span's `Fingerprint` to change across
compilation sessions.
2021-01-13 15:20:29 -05:00
bors
fe531d5a5f Auto merge of #79012 - tgnottingham:span_data_to_lines_and_cols, r=estebank
rustc_span: add span_data_to_lines_and_cols to caching source map view
2021-01-11 21:32:50 +00:00
Vadim Petrochenkov
3ff866ed7c resolve: Scope visiting doesn't need an Ident 2021-01-07 16:09:47 +03:00
Mara Bos
f16ef7d7ce Add edition 2021. 2020-12-31 19:06:09 +01:00
Yuki Okushi
6064be7ced
Rollup merge of #80358 - pierwill:edit_rustc_span, r=lcnr
Edit rustc_span documentation

Various changes to the `rustc_span` docs, including the following:

- Additions to top-level docs
- Edits to the source_map module docs
- Edits to documentation for `Span` and `SpanData`
- Added intra-docs links
- Documentation for Levenshtein distances
- Fixed missing punctuation
2020-12-30 18:15:11 +09:00
pierwill
a8775d44e9 Edit rustc_span documentation
Various changes to the `rustc_span` docs, including the following:

- Additions to top-level docs
- Edits to the source_map module docs
- Edits to documentation for `Span` and `SpanData`
- Added intra-docs links
- Documentation for Levenshtein distances
- Fixed missing punctuation
2020-12-25 14:02:52 -08:00
Arpad Borsos
830ceaa419 Remap instrument-coverage line numbers in doctests
This uses the `SourceMap::doctest_offset_line` method to re-map line
numbers from doctests. Remapping columns is not yet done.

Part of issue #79417.
2020-12-19 13:22:24 +01:00
Tyson Nottingham
75de8286c0 rustc_span: add span_data_to_lines_and_cols to caching source map view
Gives a performance increase over calling byte_pos_to_line_and_col
twice, partially because it decreases the function calling overhead,
potentially because it doesn't populate the line cache with lines that
turn out to belong to invalid spans, and likely because of some other
incidental improvements made possible by having more context available.
2020-12-03 18:36:34 -08:00
Joshua Nelson
0ad3dce83a Fix some clippy lints 2020-12-03 17:08:19 -05:00
Arlie Davis
5481c1bd6d Move lev_distance to rustc_ast, make non-generic
rustc_ast currently has a few dependencies on rustc_lexer. Ideally, an AST
would not have any dependency its lexer, for minimizing unnecessarily
design-time dependencies. Breaking this dependency would also have practical
benefits, since modifying rustc_lexer would not trigger a rebuild of rustc_ast.

This commit does not remove the rustc_ast --> rustc_lexer dependency,
but it does remove one of the sources of this dependency, which is the
code that handles fuzzy matching between symbol names for making suggestions
in diagnostics. Since that code depends only on Symbol, it is easy to move
it to rustc_span. It might even be best to move it to a separate crate,
since other tools such as Cargo use the same algorithm, and have simply
contain a duplicate of the code.

This changes the signature of find_best_match_for_name so that it is no
longer generic over its input. I checked the optimized binaries, and this
function was duplicated at nearly every call site, because most call sites
used short-lived iterator chains, generic over Map and such. But there's
no good reason for a function like this to be generic, since all it does
is immediately convert the generic input (the Iterator impl) to a concrete
Vec<Symbol>. This has all of the costs of generics (duplicated method bodies)
with no benefit.

Changing find_best_match_for_name to be non-generic removed about 10KB of
code from the optimized binary. I know it's a drop in the bucket, but we have
to start reducing binary size, and beginning to tame over-use of generics
is part of that.
2020-11-24 16:12:23 -08:00
bors
9722952f0b Auto merge of #76256 - tgnottingham:issue-74890, r=nikomatsakis
incr-comp: hash and serialize span end line/column

Hash both the length and the end location (line/column) of a span. If we
hash only the length, for example, then two otherwise equal spans with
different end locations will have the same hash. This can cause a
problem during incremental compilation wherein a previous result for a
query that depends on the end location of a span will be incorrectly
reused when the end location of the span it depends on has changed. A
similar analysis applies if some query depends specifically on the
length of the span, but we only hash the end location. So hash both.

Fix #46744, fix #59954, fix #63161, fix #73640, fix #73967, fix #74890, fix #75900

---

See #74890 for a more in-depth analysis.

I haven't thought about what other problems this root cause could be responsible for. Please let me know if anything springs to mind. I believe the issue has existed since the inception of incremental compilation.
2020-11-12 15:34:09 +00:00
Matthias Krüger
bcd2f2df67 fix a couple of clippy warnings:
filter_next
manual_strip
redundant_static_lifetimes
single_char_pattern
unnecessary_cast
unused_unit
op_ref
redundant_closure
useless_conversion
2020-11-04 13:48:50 +01:00
Tyson Nottingham
b71e627b26 incr-comp: hash span end line/column
Hash both the length and the end location (line/column) of a span. If we
hash only the length, for example, then two otherwise equal spans with
different end locations will have the same hash. This can cause a
problem during incremental compilation wherein a previous result for a
query that depends on the end location of a span will be incorrectly
reused when the end location of the span it depends on has changed. A
similar analysis applies if some query depends specifically on the
length of the span, but we only hash the end location. So hash both.

Fix #46744, fix #59954, fix #63161, fix #73640, fix #73967, fix #74890, fix #75900
2020-11-04 01:37:18 -08:00
Mara Bos
52405f7c0c
Rollup merge of #77950 - arlosi:sha256, r=eddyb
Add support for SHA256 source file hashing

Adds support for `-Z src-hash-algorithm sha256`, which became available in LLVM 11.

Using an older version of LLVM will cause an error `invalid checksum kind` if the hash algorithm is set to sha256.

r? `@eddyb`
cc #70401 `@est31`
2020-11-03 19:32:26 +01:00
Jonas Schievink
151db25599
Rollup merge of #78423 - tgnottingham:caching_source_map_bounds_check, r=oli-obk
rustc_span: improve bounds checks in byte_pos_to_line_and_col

The effect of this change is to consider edge-case spans that start or
end at the position one past the end of a file to be valid during span
hashing and encoding. This change means that these spans will be
preserved across incremental compilation sessions when they are part of
a serialized query result, instead of causing the dummy span to be used.
2020-10-29 17:05:17 +01:00
bors
31ee872db5 Auto merge of #78415 - tgnottingham:expn_id_tag_hash, r=Aaron1011
rustc_span: avoid hashing ExpnId tag when using cached hash
2020-10-28 20:03:55 +00:00
Tyson Nottingham
47dad31a04 rustc_span: represent line bounds with Range 2020-10-27 15:47:29 -07:00
Dániel Buga
da64d07191 Fix typo in comment 2020-10-27 17:08:14 +01:00
Tyson Nottingham
df59a44fea rustc_span: improve bounds checks in byte_pos_to_line_and_col
The effect of this change is to consider edge-case spans that start or
end at the position one past the end of a file to be valid during span
hashing and encoding. This change means that these spans will be
preserved across incremental compilation sessions when they are part of
a serialized query result, instead of causing the dummy span to be used.
2020-10-26 16:34:04 -07:00
Tyson Nottingham
a3623e0542 rustc_span: avoid hashing ExpnId tag when using cached hash 2020-10-26 13:43:48 -07:00
Arlo Siemsen
3296d5ca7b Add support for SHA256 source file hashing for LLVM 11+. 2020-10-14 15:09:51 -07:00
est31
58b3923ad3 Remove unused code from rustc_span 2020-10-14 04:14:32 +02:00