since the serialization format isn't self-describing we need a way to detect
when encoder and decoder don't match up. but that doesn't have to
be utf8 validation for strings, which does cost a few % of performance.
Instead we can use a marker byte at the end to be reasonably
sure that we're dealing with a string and it wasn't overwritten in some
way.
The PR had some unforseen perf regressions that are not as easy to find.
Revert the PR for now.
This reverts commit 6ae8912a3e, reversing
changes made to 86d6d2b738.
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
Allow for reading raw bytes from rustc_serialize::Decoder without unsafe code
The current `read_raw_bytes` method requires using `MaybeUninit` and `unsafe`. I don't think this is necessary. Let's see if a safe interface has any performance drawbacks.
This is a followup to #83273 and will make it easier to rebase #82183.
r? `@cjgillot`
The signed LEB128 decoding function used a hardcoded constant of 64
instead of the number of bits in the type of integer being decoded,
which resulted in incorrect results for some inputs. Fix this, make the
decoding more consistent with the unsigned version, and increase the
LEB128 encoding and decoding test coverage.
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.
Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.
The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.