Loadable syntax extensions don't work when cross compiling (see #12102), so the
fourcc tests all need to be ignored. They're valuable tests, so they shouldn't
be outright ignored, so they're now flagged with ignore-cross-compile
This resolves issue #12157. Does that do it already or is there something else that needs taking care of?
As a side note, there seems to be some documentation, in which the old existence of the do keyword is explained. The list of keywords is not up-to-date either. But these are certainly separate issues.
Resolves issue #12157. `do` is hereby reinstated as a keyword; no syntax is
associated with it though. Along the way, a unit test had to be adapted, since
it was using `do` as a method identifier.
Breaking changes:
- Any code using `do` as an identifier will no longer work.
This is needed for cases where we only need to know if a list item matches the given predicate (eg. in Servo, we need to know if attributes from different DOM elements are equal).
The current comment actually describes *co*-variance.
Fixing this to describe contravariance while keeping 'static in the definition was tricky so just changed to use 'short and 'long.
I found the typo in my attempt to understand the concept of variance itself and the comment confused me. I mention this to point out that I'm new to the concept so may have still got the definition wrong, so please review with care :)
This is a fairly trivial (but IMHO handy) change to implement IterBytes for IpAddr and SocketAddr.
I originally stumbled across this because I wanted to use a SocketAddr as a HashMap key and discovered that I couldn't do it directly. Had to impl IterBytes on a new intermediate type to work around it.
This is needed for cases where we only need to know if a list item
matches the given predicate (eg. in Servo, we need to know if attributes
from different DOM elements are equal).
The previous definition was actually describing covariance.
Fixing to describe contravariance while keeping 'static in the definition was tricky so just changed to use 'short and 'long.
Thinking about swap as an example of unsafe programming. This cleans it up a bit. It also removes type parametrization over `RawPtr` from the memcpy functions to make this compile.
It unsafe assumptions that any impl of RawPtr is for actual pointers,
that they can be copied by memcpy. Removing it is easy, so I don't
think it's solving a real problem.
Repair a rather embarassingly obvious hole that I created as part of #9629. In particular, prevent `&mut` borrows of data in an aliasable location. This used to be prevented through the restrictions mechanism, but in #9629 I modified those rules incorrectly.
r? @pcwalton
Fixes#11913
I was looking into #9303 and was curious if this would still be valuable. @kballard had already done 99% of the work, so I brought the branch up to date and added a feature gate. Any feedback would be appreciated; I wasn't sure if this should be set up as a syntax extension with `#[macro_registrar]`, and if so, where it should be located.
Original PR is here: #9255
TODO:
* [x] Convert to loadable syntax extension
* [x] Default to big endian
* [x] Add `target` identifier
* [x] Expand to include code points 128-255
It was decided that a consistent result across platforms would be the
most useful and least surprising. A "target" option has been added to
get the old behaviour of using the target platform's endianess.
fourcc!() allows you to embed FourCC (or OSType) values that are
evaluated as u32 literals. It takes a 4-byte ASCII string and produces
the u32 resulting in interpreting those 4 bytes as a u32, using either
the platform-native endianness, or explicitly as big or little endian.
I don't know if anything depends on MemReader::fill returning an empty slice instead of EndOfFile, but I'm pretty sure that MemReader::read_until should not go into an infinite loop.
These are ancient. I removed a bunch of questions that are less relevant - or completely unrelevant, updated other entries, and removed things that are already better expressed elsewhere.
Before:
test test::bench_nonpod_nonarena ... bench: 62 ns/iter (+/- 6)
test test::bench_pod_nonarena ... bench: 0 ns/iter (+/- 0)
After:
test test::bench_nonpod_nonarena ... bench: 158 ns/iter (+/- 11)
test test::bench_pod_nonarena ... bench: 48 ns/iter (+/- 2)
The other tests show no change, but are adjusted to use the generic
return value of `.iter` anyway so that this doesn't change in future.
benchmarking.
This allows a result to be marked as "used" by passing it to a function
LLVM cannot see inside. By making `iter` generic and using this
`black_box` on the result benchmarks can get this behaviour simply by
returning their computation.
This pull request tries to fix#12050.
I went after these wrong errors quite aggressively so it might be that I also changed some strings that are not actual errors.
Please point those out and I will update this pull request accordingly.
Error messages cleaned in librustc/middle
Error messages cleaned in libsyntax
Error messages cleaned in libsyntax more agressively
Error messages cleaned in librustc more aggressively
Fixed affected tests
Fixed other failing tests
Last failing tests fixed
The lexer and json were using `transmute(-1): char` as a sentinel value for EOF, which is invalid since `char` is strictly a unicode codepoint.
Fixing this allows for range asserts on chars since they always lie between 0 and 0x10FFFF.
Declare a `type SendStr = MaybeOwned<'static>` to ease readibility of
types that needed the old SendStr behavior.
Implement all the traits for MaybeOwned that SendStr used to implement.
- Convert the formatting traits to `&self` rather than `_: &Self`
- Rejig `syntax::ext::{format,deriving}` a little in preparation
- Implement `#[deriving(Show)]`
The transmute was unsound.
There are many instances of .unwrap_or('\x00') for "ignoring" EOF which
either do not make the situation worse than it was (well, actually make
it better, since it's easy to grep for places that don't handle EOF) or
can never ever be read.
Fixes#8971.
This also drops support for the managed pointer POISON_ON_FREE feature
as it's not worth adding back the support for it. After a snapshot, the
leftovers can be removed.
This pull request:
1) Changes the initial insertion sort to be in-place, and defers allocation of working set until merge is needed.
2) Increases the increases the maximum run length to use insertion sort for from 8 to 32 elements. This increases the size of vectors that will not allocate, and reduces the number of merge passes by two. It seemed to be the sweet spot in the benchmarks that I ran.
Here are the results of some benchmarks. Note that they are sorting u64s, so types that are more expensive to compare or copy may have different behaviors.
Before changes:
```
test vec::bench::sort_random_large bench: 719753 ns/iter (+/- 130173) = 111 MB/s
test vec::bench::sort_random_medium bench: 4726 ns/iter (+/- 742) = 169 MB/s
test vec::bench::sort_random_small bench: 344 ns/iter (+/- 76) = 116 MB/s
test vec::bench::sort_sorted bench: 437244 ns/iter (+/- 70043) = 182 MB/s
```
Deferred allocation (8 element insertion sort):
```
test vec::bench::sort_random_large bench: 702630 ns/iter (+/- 88158) = 113 MB/s
test vec::bench::sort_random_medium bench: 4529 ns/iter (+/- 497) = 176 MB/s
test vec::bench::sort_random_small bench: 185 ns/iter (+/- 49) = 216 MB/s
test vec::bench::sort_sorted bench: 425853 ns/iter (+/- 60907) = 187 MB/s
```
Deferred allocation (16 element insertion sort):
```
test vec::bench::sort_random_large bench: 692783 ns/iter (+/- 165837) = 115 MB/s
test vec::bench::sort_random_medium bench: 4434 ns/iter (+/- 722) = 180 MB/s
test vec::bench::sort_random_small bench: 187 ns/iter (+/- 38) = 213 MB/s
test vec::bench::sort_sorted bench: 393783 ns/iter (+/- 85548) = 203 MB/s
```
Deferred allocation (32 element insertion sort):
```
test vec::bench::sort_random_large bench: 682556 ns/iter (+/- 131008) = 117 MB/s
test vec::bench::sort_random_medium bench: 4370 ns/iter (+/- 1369) = 183 MB/s
test vec::bench::sort_random_small bench: 179 ns/iter (+/- 32) = 223 MB/s
test vec::bench::sort_sorted bench: 358353 ns/iter (+/- 65423) = 223 MB/s
```
Deferred allocation (64 element insertion sort):
```
test vec::bench::sort_random_large bench: 712040 ns/iter (+/- 132454) = 112 MB/s
test vec::bench::sort_random_medium bench: 4425 ns/iter (+/- 784) = 180 MB/s
test vec::bench::sort_random_small bench: 179 ns/iter (+/- 81) = 223 MB/s
test vec::bench::sort_sorted bench: 317812 ns/iter (+/- 62675) = 251 MB/s
```
This is the best I could manage with the basic merge sort while keeping the invariant that the original vector must contain each element exactly once when the comparison function is called. If one is not married to a stable sort, an in-place n*log(n) sorting algorithm may have better performance in some cases.
for #12011
cc @huonw
Added a seperate in-place insertion sort for short vectors.
Increased threshold for insertion short for 8 to 32 elements
for small types and 16 for larger types. Added benchmarks
for sorting larger types.
This PR extends the tidy formatting check to rust files in the test folder. To facilitate this, a few flags were added to tidy:
* `xfail-tidy-cr` - Disables the check for CR characters for all following lines in the file
* `xfail-tidy-tab` - Disables the check for tab characters for all following lines in the file
* `xfail-tidy-linelength` - Disables the line length check for all following lines in the file
Checks should not have to be disabled often. I disabled line length checks in `debug-info` tests that use `debugger:` checks, but aside from that, there were relatively few exclusions. Running tidy on all the tests does slow down the formatting check, so it may be worth investigating further optimization.
cc #4534
`from_utf8_lossy()` takes a byte vector and produces a `~str`, converting
any invalid UTF-8 sequence into the U+FFFD REPLACEMENT CHARACTER.
The replacement follows the guidelines in §5.22 Best Practice for U+FFFD
Substitution from the Unicode Standard (Version 6.2)[1], which also
matches the WHATWG rules for utf-8 decoding[2].
[1]: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf
[2]: http://encoding.spec.whatwg.org/#utf-8Closes#9516.
Part of #8784
Changes:
- Everything labeled under collections in libextra has been moved into a new crate 'libcollection'.
- Renamed container.rs to deque.rs, since it was no longer 'container traits for extra', just a deque trait.
- Crates that depend on the collections have been updated and dependencies sorted.
- I think I changed all the imports in the tests to make sure it works. I'm not entirely sure, as near the end of the tests there was yet another `use` that I forgot to change, and when I went to try again, it started rebuilding everything, which I don't currently have time for.
There will probably be incompatibility between this and the other pull requests that are splitting up libextra. I'm happy to rebase once those have been merged.
The tests I didn't get to run should pass. But I can redo them another time if they don't.
from_utf8_lossy() takes a byte vector and produces a ~str, converting
any invalid UTF-8 sequence into the U+FFFD REPLACEMENT CHARACTER.
The replacement follows the guidelines in §5.22 Best Practice for U+FFFD
Substitution from the Unicode Standard (Version 6.2)[1], which also
matches the WHATWG rules for utf-8 decoding[2].
[1]: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf
[2]: http://encoding.spec.whatwg.org/#utf-8
This has been a long time coming. Conditions in rust were initially envisioned
as being a good alternative to error code return pattern. The idea is that all
errors are fatal-by-default, and you can opt-in to handling the error by
registering an error handler.
While sounding nice, conditions ended up having some unforseen shortcomings:
* Actually handling an error has some very awkward syntax:
let mut result = None;
let mut answer = None;
io::io_error::cond.trap(|e| { result = Some(e) }).inside(|| {
answer = Some(some_io_operation());
});
match result {
Some(err) => { /* hit an I/O error */ }
None => {
let answer = answer.unwrap();
/* deal with the result of I/O */
}
}
This pattern can certainly use functions like io::result, but at its core
actually handling conditions is fairly difficult
* The "zero value" of a function is often confusing. One of the main ideas
behind using conditions was to change the signature of I/O functions. Instead
of read_be_u32() returning a result, it returned a u32. Errors were notified
via a condition, and if you caught the condition you understood that the "zero
value" returned is actually a garbage value. These zero values are often
difficult to understand, however.
One case of this is the read_bytes() function. The function takes an integer
length of the amount of bytes to read, and returns an array of that size. The
array may actually be shorter, however, if an error occurred.
Another case is fs::stat(). The theoretical "zero value" is a blank stat
struct, but it's a little awkward to create and return a zero'd out stat
struct on a call to stat().
In general, the return value of functions that can raise error are much more
natural when using a Result as opposed to an always-usable zero-value.
* Conditions impose a necessary runtime requirement on *all* I/O. In theory I/O
is as simple as calling read() and write(), but using conditions imposed the
restriction that a rust local task was required if you wanted to catch errors
with I/O. While certainly an surmountable difficulty, this was always a bit of
a thorn in the side of conditions.
* Functions raising conditions are not always clear that they are raising
conditions. This suffers a similar problem to exceptions where you don't
actually know whether a function raises a condition or not. The documentation
likely explains, but if someone retroactively adds a condition to a function
there's nothing forcing upstream users to acknowledge a new point of task
failure.
* Libaries using I/O are not guaranteed to correctly raise on conditions when an
error occurs. In developing various I/O libraries, it's much easier to just
return `None` from a read rather than raising an error. The silent contract of
"don't raise on EOF" was a little difficult to understand and threw a wrench
into the answer of the question "when do I raise a condition?"
Many of these difficulties can be overcome through documentation, examples, and
general practice. In the end, all of these difficulties added together ended up
being too overwhelming and improving various aspects didn't end up helping that
much.
A result-based I/O error handling strategy also has shortcomings, but the
cognitive burden is much smaller. The tooling necessary to make this strategy as
usable as conditions were is much smaller than the tooling necessary for
conditions.
Perhaps conditions may manifest themselves as a future entity, but for now
we're going to remove them from the standard library.
Closes#9795Closes#8968
This has been a long time coming. Conditions in rust were initially envisioned
as being a good alternative to error code return pattern. The idea is that all
errors are fatal-by-default, and you can opt-in to handling the error by
registering an error handler.
While sounding nice, conditions ended up having some unforseen shortcomings:
* Actually handling an error has some very awkward syntax:
let mut result = None;
let mut answer = None;
io::io_error::cond.trap(|e| { result = Some(e) }).inside(|| {
answer = Some(some_io_operation());
});
match result {
Some(err) => { /* hit an I/O error */ }
None => {
let answer = answer.unwrap();
/* deal with the result of I/O */
}
}
This pattern can certainly use functions like io::result, but at its core
actually handling conditions is fairly difficult
* The "zero value" of a function is often confusing. One of the main ideas
behind using conditions was to change the signature of I/O functions. Instead
of read_be_u32() returning a result, it returned a u32. Errors were notified
via a condition, and if you caught the condition you understood that the "zero
value" returned is actually a garbage value. These zero values are often
difficult to understand, however.
One case of this is the read_bytes() function. The function takes an integer
length of the amount of bytes to read, and returns an array of that size. The
array may actually be shorter, however, if an error occurred.
Another case is fs::stat(). The theoretical "zero value" is a blank stat
struct, but it's a little awkward to create and return a zero'd out stat
struct on a call to stat().
In general, the return value of functions that can raise error are much more
natural when using a Result as opposed to an always-usable zero-value.
* Conditions impose a necessary runtime requirement on *all* I/O. In theory I/O
is as simple as calling read() and write(), but using conditions imposed the
restriction that a rust local task was required if you wanted to catch errors
with I/O. While certainly an surmountable difficulty, this was always a bit of
a thorn in the side of conditions.
* Functions raising conditions are not always clear that they are raising
conditions. This suffers a similar problem to exceptions where you don't
actually know whether a function raises a condition or not. The documentation
likely explains, but if someone retroactively adds a condition to a function
there's nothing forcing upstream users to acknowledge a new point of task
failure.
* Libaries using I/O are not guaranteed to correctly raise on conditions when an
error occurs. In developing various I/O libraries, it's much easier to just
return `None` from a read rather than raising an error. The silent contract of
"don't raise on EOF" was a little difficult to understand and threw a wrench
into the answer of the question "when do I raise a condition?"
Many of these difficulties can be overcome through documentation, examples, and
general practice. In the end, all of these difficulties added together ended up
being too overwhelming and improving various aspects didn't end up helping that
much.
A result-based I/O error handling strategy also has shortcomings, but the
cognitive burden is much smaller. The tooling necessary to make this strategy as
usable as conditions were is much smaller than the tooling necessary for
conditions.
Perhaps conditions may manifest themselves as a future entity, but for now
we're going to remove them from the standard library.
Closes#9795Closes#8968
This commit removes the -c, --emit-llvm, -s, --rlib, --dylib, --staticlib,
--lib, and --bin flags from rustc, adding the following flags:
* --emit=[asm,ir,bc,obj,link]
* --crate-type=[dylib,rlib,staticlib,bin,lib]
The -o option has also been redefined to be used for *all* flavors of outputs.
This means that we no longer ignore it for libraries. The --out-dir remains the
same as before.
The new logic for files that rustc emits is as follows:
1. Output types are dictated by the --emit flag. The default value is
--emit=link, and this option can be passed multiple times and have all options
stacked on one another.
2. Crate types are dictated by the --crate-type flag and the #[crate_type]
attribute. The flags can be passed many times and stack with the crate
attribute.
3. If the -o flag is specified, and only one output type is specified, the
output will be emitted at this location. If more than one output type is
specified, then the filename of -o is ignored, and all output goes in the
directory that -o specifies. The -o option always ignores the --out-dir
option.
4. If the --out-dir flag is specified, all output goes in this directory.
5. If -o and --out-dir are both not present, all output goes in the directory of
the crate file.
6. When multiple output types are specified, the filestem of all output is the
same as the name of the CrateId (derived from a crate attribute or from the
filestem of the crate file).
Closes#7791Closes#11056Closes#11667
This commit removes the -c, --emit-llvm, -s, --rlib, --dylib, --staticlib,
--lib, and --bin flags from rustc, adding the following flags:
* --emit=[asm,ir,bc,obj,link]
* --crate-type=[dylib,rlib,staticlib,bin,lib]
The -o option has also been redefined to be used for *all* flavors of outputs.
This means that we no longer ignore it for libraries. The --out-dir remains the
same as before.
The new logic for files that rustc emits is as follows:
1. Output types are dictated by the --emit flag. The default value is
--emit=link, and this option can be passed multiple times and have all
options stacked on one another.
2. Crate types are dictated by the --crate-type flag and the #[crate_type]
attribute. The flags can be passed many times and stack with the crate
attribute.
3. If the -o flag is specified, and only one output type is specified, the
output will be emitted at this location. If more than one output type is
specified, then the filename of -o is ignored, and all output goes in the
directory that -o specifies. The -o option always ignores the --out-dir
option.
4. If the --out-dir flag is specified, all output goes in this directory.
5. If -o and --out-dir are both not present, all output goes in the current
directory of the process.
6. When multiple output types are specified, the filestem of all output is the
same as the name of the CrateId (derived from a crate attribute or from the
filestem of the crate file).
Closes#7791Closes#11056Closes#11667
A weak pointer inside itself will have its destructor run when the last
strong pointer to that data disappears, so we need to make sure that the
Weak and Rc destructors don't duplicate work (i.e. freeing).
By making the Rcs effectively take a weak pointer, we ensure that no
Weak destructor will free the pointer while still ensuring that Weak
pointers can't be upgraded to strong ones as the destructors run.
This approach of starting weak at 1 is what libstdc++ does.
Fixes#12046.
I have a hunch this just deadlocked the windows bots. Due to UDP being a lossy
protocol, I don't think we can guarantee that the server can receive both
packets, so just listen for one of them.
I have a hunch this just deadlocked the windows bots. Due to UDP being a lossy
protocol, I don't think we can guarantee that the server can receive both
packets, so just listen for one of them.
A weak pointer inside itself will have its destructor run when the last
strong pointer to that data disappears, so we need to make sure that the
Weak and Rc destructors don't duplicate work (i.e. freeing).
By making the Rcs effectively take a weak pointer, we ensure that no
Weak destructor will free the pointer while still ensuring that Weak
pointers can't be upgraded to strong ones as the destructors run.
This approach of starting weak at 1 is what libstdc++ does.
Fixes#12046.
This is part of the overall strategy I would like to take when approaching
issue #11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.
The initial idea I had was the literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnnecssary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.
The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.
The cloning support is pretty specific per platform/lib combination:
* native/win32 - uses some specific WSA apis to clone the SOCKET handle
* native/unix - uses dup() to get another file descriptor
* green/all - This is where things get interesting. When we support full clones
of a handle, this implies that we're allowing simultaneous writes
and reads to happen. It turns out that libuv doesn't support two
simultaneous reads or writes of the same object. It does support
*one* read and *one* write at the same time, however. Some extra
infrastructure was added to just block concurrent writers/readers
until the previous read/write operation was completed.
I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.
This is part of the overall strategy I would like to take when approaching
issue #11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.
The initial idea I had was the literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnnecssary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.
The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.
The cloning support is pretty specific per platform/lib combination:
* native/win32 - uses some specific WSA apis to clone the SOCKET handle
* native/unix - uses dup() to get another file descriptor
* green/all - This is where things get interesting. When we support full clones
of a handle, this implies that we're allowing simultaneous writes
and reads to happen. It turns out that libuv doesn't support two
simultaneous reads or writes of the same object. It does support
*one* read and *one* write at the same time, however. Some extra
infrastructure was added to just block concurrent writers/readers
until the previous read/write operation was completed.
I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.
- `extra::json` didn't make the cut, because of `extra::json` required
dep on `extra::TreeMap`. If/when `extra::TreeMap` moves out of `extra`,
then `extra::json` could move into `serialize`
- `libextra`, `libsyntax` and `librustc` depend on the newly created
`libserialize`
- The extensions to various `extra` types like `DList`, `RingBuf`, `TreeMap`
and `TreeSet` for `Encodable`/`Decodable` were moved into the respective
modules in `extra`
- There is some trickery, evident in `src/libextra/lib.rs` where a stub
of `extra::serialize` is set up (in `src/libextra/serialize.rs`) for
use in the stage0 build, where the snapshot rustc is still making
deriving for `Encodable` and `Decodable` point at extra. Big props to
@huonw for help working out the re-export solution for this
extra: inline extra::serialize stub
fix stuff clobbered in rebase + don't reexport serialize::serialize
no more globs in libserialize
syntax: fix import of libserialize traits
librustc: fix bad imports in encoder/decoder
add serialize dep to librustdoc
fix failing run-pass tests w/ serialize dep
adjust uuid dep
more rebase de-clobbering for libserialize
fixing tests, pushing libextra dep into cfg(test)
fix doc code in extra::json
adjust index.md links to serialize and uuid library
This time everything should be okay, No break due to a failed merge or rebase...
Sorry for the abuse of pull request.
So this move extra::sync, extra::arc, extra::future, extra::comm and extra::task_pool to libsync.
This allows patch adds a new arc type that allows for creation of copy-on-write data structures. The idea is that it is safe to mutate any data structure as long as it has only one reference to it. If there are multiple, it requires cloning of the data structure before mutation is possible.