nordic-dev.net/rust - rust

mirror of https://github.com/rust-lang/rust.git synced 2025-04-30 12:07:40 +00:00

Author	SHA1	Message	Date
Simonas Kazlauskas	ee69cd7925	Calculate discriminant bounds within 64 bits Since discriminants do not support i128 yet, lets just calculate the boundaries within the 64 bits that are supported. This also avoids an issue with bootstrapping on 32 bit systems due to #38727.	2016-12-31 04:55:29 +02:00
Simonas Kazlauskas	9aad2d551e	Add a way to retrieve constant value in 128 bits Fixes rebase fallout, makes code correct in presence of 128-bit constants. This commit includes manual merge conflict resolution changes from a rebase by @est31.	2016-12-30 15:17:26 +01:00
Simonas Kazlauskas	d9eb756cbf	Wrapping<i128> and attempt at LLVM 3.7 compat This commit includes manual merge conflict resolution changes from a rebase by @est31.	2016-12-30 15:17:26 +01:00
Simonas Kazlauskas	b0e55a83a8	Such large. Very 128. Much bits. This commit introduces 128-bit integers. Stage 2 builds and produces a working compiler which understands and supports 128-bit integers throughout. The general strategy used is to have rustc_i128 module which provides aliases for iu128, equal to iu64 in stage9 and iu128 later. Since nowhere in rustc we rely on large numbers being supported, this strategy is good enough to get past the first bootstrap stages to end up with a fully working 128-bit capable compiler. In order for this strategy to work, number of locations had to be changed to use associated max_value/min_value instead of MAX/MIN constants as well as the min_value (or was it max_value?) had to be changed to use xor instead of shift so both 64-bit and 128-bit based consteval works (former not necessarily producing the right results in stage1). This commit includes manual merge conflict resolution changes from a rebase by @est31.	2016-12-30 15:15:44 +01:00
Alex Crichton	bcfd504744	Rollup merge of #38559 - japaric:ptx2, r=alexcrichton PTX support, take 2 - You can generate PTX using `--emit=asm` and the right (custom) target. Which then you can run on a NVIDIA GPU. - You can compile `core` to PTX. [Xargo] also works and it can compile some other crates like `collections` (but I doubt all of those make sense on a GPU) [Xargo]: https://github.com/japaric/xargo - You can create "global" functions, which can be "called" by the host, using the `"ptx-kernel"` ABI, e.g. `extern "ptx-kernel" fn kernel() { .. }`. Every other function is a "device" function and can only be called by the GPU. - Intrinsics like `__syncthreads()` and `blockIdx.x` are available as `"platform-intrinsics"`. These intrinsics are not in the `core` crate but any Rust user can create "bindings" to them using an `extern "platform-intrinsics"` block. See example at the end. - Trying to emit PTX with `-g` (debuginfo); you get an LLVM error. But I don't think PTX can contain debuginfo anyway so `-g` should be ignored and a warning should be printed ("`-g` doesn't work with this target" or something). - "Single source" support. You can't write a single source file that contains both host and device code. I think that should be possible to implement that outside the compiler using compiler plugins / build scripts. - The equivalent to CUDA `__shared__` which it's used to declare memory that's shared between the threads of the same block. This could be implemented using attributes: `#[shared] static mut SCRATCH_MEMORY: [f32; 64]` but hasn't been implemented yet. - Built-in targets. This PR doesn't add targets to the compiler just yet but one can create custom targets to be able to emit PTX code (see the example at the end). The idea is to have people experiment with this feature before committing to it (built-in targets are "insta-stable") - All functions must be "inlined". IOW, the `.rlib` must always contain the LLVM bitcode of all the functions of the crate it was produced from. Otherwise, you end with "undefined references" in the final PTX code but you won't get any linker error because no linker is involved. IOW, you'll hit a runtime error when loading the PTX into the GPU. The workaround is to use `#[inline]` on non-generic functions and to never use `#[inline(never)]` but this may not always be possible because e.g. you could be relying on third party code. - Should `--emit=asm` generate a `.ptx` file instead of a `.s` file? TL;DR Use Xargo to turn a crate into a PTX module (a `.s` file). Then pass that PTX module, as a string, to the GPU and run it. The full code is in [this repository]. This section gives an overview of how to run Rust code on a NVIDIA GPU. [this repository]: https://github.com/japaric/cuda - Create a custom target. Here's the 64-bit NVPTX target (NOTE: the comments are not valid because this is supposed to be a JSON file; remove them before you use this file): ``` js // nvptx64-nvidia-cuda.json { "arch": "nvptx64", // matches LLVM "cpu": "sm_20", // "oldest" compute capability supported by LLVM "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64", "llvm-target": "nvptx64-nvidia-cuda", "max-atomic-width": 0, // LLVM errors with any other value :-( "os": "cuda", // matches LLVM "panic-strategy": "abort", "target-endian": "little", "target-pointer-width": "64", "target-vendor": "nvidia", // matches LLVM -- not required } ``` (There's a 32-bit target specification in the linked repository) - Write a kernel ``` rust extern "platform-intrinsic" { fn nvptx_block_dim_x() -> i32; fn nvptx_block_idx_x() -> i32; fn nvptx_thread_idx_x() -> i32; } /// Copies an array of `n` floating point numbers from `src` to `dst` pub unsafe extern "ptx-kernel" fn memcpy(dst: mut f32, src: const f32, n: usize) { let i = (nvptx_block_dim_x() as isize) .wrapping_mul(nvptx_block_idx_x() as isize) .wrapping_add(nvptx_thread_idx_x() as isize); if (i as usize) < n { dst.offset(i) = src.offset(i); } } ``` - Emit PTX code ``` $ xargo rustc --target nvptx64-nvidia-cuda --release -- --emit=asm Compiling core v0.0.0 (file://..) (..) Compiling nvptx-builtins v0.1.0 (https://github.com/japaric/nvptx-builtins) Compiling kernel v0.1.0 $ cat target/nvptx64-nvidia-cuda/release/deps/kernel-.s // // Generated by LLVM NVPTX Back-End // .version 3.2 .target sm_20 .address_size 64 // .globl memcpy .visible .entry memcpy( .param .u64 memcpy_param_0, .param .u64 memcpy_param_1, .param .u64 memcpy_param_2 ) { .reg .pred %p<2>; .reg .s32 %r<5>; .reg .s64 %rd<12>; ld.param.u64 %rd7, [memcpy_param_2]; mov.u32 %r1, %ntid.x; mov.u32 %r2, %ctaid.x; mul.wide.s32 %rd8, %r2, %r1; mov.u32 %r3, %tid.x; cvt.s64.s32 %rd9, %r3; add.s64 %rd10, %rd9, %rd8; setp.ge.u64 %p1, %rd10, %rd7; @%p1 bra LBB0_2; ld.param.u64 %rd3, [memcpy_param_0]; ld.param.u64 %rd4, [memcpy_param_1]; cvta.to.global.u64 %rd5, %rd4; cvta.to.global.u64 %rd6, %rd3; shl.b64 %rd11, %rd10, 2; add.s64 %rd1, %rd6, %rd11; add.s64 %rd2, %rd5, %rd11; ld.global.u32 %r4, [%rd2]; st.global.u32 [%rd1], %r4; LBB0_2: ret; } ``` - Run it on the GPU ``` rust // `kernel.ptx` is the `.s` file we got in the previous step const KERNEL: &'static str = include_str!("kernel.ptx"); driver::initialize()?; let device = Device(0)?; let ctx = device.create_context()?; let module = ctx.load_module(KERNEL)?; let kernel = module.function("memcpy")?; let h_a: Vec<f32> = /* create some random data /; let h_b = vec![0.; N]; let d_a = driver::allocate(bytes)?; let d_b = driver::allocate(bytes)?; // Copy from host to GPU driver::copy(h_a, d_a)?; // Run `memcpy` on the GPU kernel.launch(d_b, d_a, N)?; // Copy from GPU to host driver::copy(d_b, h_b)?; // Verify assert_eq!(h_a, h_b); // `d_a`, `d_b`, `h_a`, `h_b` are dropped/freed here ``` --- cc @alexcrichton @brson @rkruppe > What has changed since #34195? - `core` now can be compiled into PTX. Which makes it very easy to turn `no_std` crates into "kernels" with the help of Xargo. - There's now a way, the `"ptx-kernel"` ABI, to generate "global" functions. The old PR required a manual step (it was hack) to "convert" "device" functions into "global" functions. (Only "global" functions can be launched by the host) - Everything is unstable. There are not "insta stable" built-in targets this time (\). The users have to use a custom target to experiment with this feature. Also, PTX instrinsics, like `__syncthreads` and `blockIdx.x`, are now implemented as `"platform-intrinsics"` so they no longer live in the `core` crate. (\*) I'd actually like to have in-tree targets because that makes this target more discoverable, removes the need to lug around .json files, etc. However, bundling a target with the compiler immediately puts it in the path towards stabilization. Which gives us just two cycles to find and fix any problem with the target specification. Afterwards, it becomes hard to tweak the specification because that could be a breaking change. A possible solution could be "unstable built-in targets". Basically, to use an unstable target, you'll have to also pass `-Z unstable-options` to the compiler. And unstable targets, being unstable, wouldn't be available on stable. > Why should this be merged? - To let people experiment with the feature out of tree. Having easy access to the feature (in every nightly) allows this. I also think that, as it is, it should be possible to start prototyping type-safe single source support using build scripts, macros and/or plugins. - It's a straightforward implementation. No different that adding support for any other architecture.	2016-12-29 17:26:15 -08:00
Alex Crichton	03bc2cf35a	Fallout from updating bootstrap Cargo	2016-12-29 08:47:26 -08:00
Jorge Aparicio	18d49288d5	PTX support - `--emit=asm --target=nvptx64-nvidia-cuda` can be used to turn a crate into a PTX module (a `.s` file). - intrinsics like `__syncthreads` and `blockIdx.x` are exposed as `"platform-intrinsics"`. - "cabi" has been implemented for the nvptx and nvptx64 architectures. i.e. `extern "C"` works. - a new ABI, `"ptx-kernel"`. That can be used to generate "global" functions. Example: `extern "ptx-kernel" fn kernel() { .. }`. All other functions are "device" functions.	2016-12-26 21:06:23 -05:00
Ivan Molodetskikh	c461cdfdf6	Fixed fastcall not applying inreg attributes to arguments like the C/C++ fastcall.	2016-12-21 21:44:40 +03:00
Mark-Simulacrum	cbbdb73eb0	Remove FunctionContext::cleanup, replacing it with a Drop impl. Move alloca and initial entry block creation into FunctionContext::new.	2016-12-20 20:03:27 -07:00
Jake Goulding	5bce12c95f	[LLVM 4.0] Move debuginfo alignment argument Alignment was removed from createBasicType and moved to - createGlobalVariable - createAutoVariable - createStaticMemberType (unused in Rust) - createTempGlobalVariableFwdDecl (unused in Rust) `e69c459a6e`	2016-12-12 09:00:04 -05:00
bors	1692c0b587	Auto merge of #37973 - vadimcn:dllimport, r=alexcrichton Implement RFC 1717 Implement the first two points from #37403. r? @alexcrichton	2016-12-06 10:54:45 +00:00
Michael Woerister	d1a6d47f94	Make LLVM symbol visibility FFI types more stable.	2016-12-05 11:05:25 -05:00
bors	125474de07	Auto merge of #37857 - shepmaster:llvm-4.0-dinodes, r=michaelwoerister [LLVM 4.0] Handle new DIFlags enum	2016-12-04 02:30:23 +00:00
bors	08faff49c3	Auto merge of #38055 - rkruppe:rm-unused-llvm-ffi, r=alexcrichton Remove unused functions from rustc_llvm	2016-12-03 03:57:57 +00:00
Jake Goulding	dbdd60e6d7	[LLVM] Introduce a stable representation of DIFlags In LLVM 4.0, this enum becomes an actual type-safe enum, which breaks all of the interfaces. Introduce our own copy of the bitflags that we can then safely convert to the LLVM one.	2016-12-02 21:13:31 -05:00
Vadim Chugunov	a9a6f8c8ed	Remove the "linked_from" feature.	2016-12-01 16:56:49 -08:00
Robin Kruppe	6e35cc94de	Remove unused functions from rustc_llvm	2016-11-28 21:59:26 +01:00
Robin Kruppe	85dc08e525	Don't assume llvm::StringRef is null terminated StringRefs have a length and their contents are not usually null-terminated. The solution is to either copy the string data (in rustc_llvm::diagnostic) or take the size into account (in LLVMRustPrintPasses). I couldn't trigger a bug caused by this (apparently all the strings returned in practice are actually null-terminated) but this is more correct and more future-proof.	2016-11-28 17:33:13 +01:00
Robin Kruppe	730400167a	Support LLVM 4.0 in OptimizationDiagnostic FFI - getMsg() changed to return std::string by-value. Fix: copy the data to a rust String during unpacking. - getPassName() changed to return StringRef	2016-11-24 17:33:47 +01:00
Seo Sanghyeon	c45f3dee10	Restore compatibility with LLVM 3.7 and 3.8	2016-11-21 20:30:05 +09:00
bors	0bd2ce62b2	Auto merge of #37831 - rkruppe:llvm-attr-fwdcompat, r=eddyb [LLVM 4.0] Use llvm::Attribute APIs instead of "raw value" APIs The latter will be removed in LLVM 4.0 (see `4a6fc8bacf`). The librustc_llvm API remains mostly unchanged, except that llvm::Attribute is no longer a bitflag but represents only a single attribute. The ability to store many attributes in a small number of bits and modify them without interacting with LLVM is only used in rustc_trans::abi and closely related modules, and only attributes for function arguments are considered there. Thus rustc_trans::abi now has its own bit-packed representation of argument attributes, which are translated to rustc_llvm::Attribute when applying the attributes. cc #37609	2016-11-19 16:39:25 -06:00
Robin Kruppe	30daedf603	Use llvm::Attribute API instead of "raw value" APIs, which will be removed in LLVM 4.0. The librustc_llvm API remains mostly unchanged, except that llvm::Attribute is no longer a bitflag but represents only a single attribute. The ability to store many attributes in a small number of bits and modify them without interacting with LLVM is only used in rustc_trans::abi and closely related modules, and only attributes for function arguments are considered there. Thus rustc_trans::abi now has its own bit-packed representation of argument attributes, which are translated to rustc_llvm::Attribute when applying the attributes.	2016-11-17 21:12:26 +01:00
Jorge Aparicio	456ceba137	fix `extern "aapcs" fn` to actually use the AAPCS calling convention closes #37810 This is technically a [breaking-change] because it changes the ABI of `extern "aapcs"` functions that (a) involve `f32`/`f64` arguments/return values and (b) are compiled for arm-eabihf targets from "aapcs-vfp" (wrong) to "aapcs" (correct). Appendix: What these ABIs mean? - In the "aapcs-vfp" ABI or "hard float" calling convention: Floating point values are passed/returned through FPU registers (s0, s1, d0, etc.) - Whereas, in the "aapcs" ABI or "soft float" calling convention: Floating point values are passed/returned through general purpose registers (r0, r1, etc.) Mixing these ABIs can cause problems if the caller assumes that the routine is using one of these ABIs but it's actually using the other one.	2016-11-16 18:38:32 -05:00
Alex Crichton	88b46460fa	rustc: Flag all builtins functions as hidden When compiling compiler-rt you typically compile with `-fvisibility=hidden` which to ensure that all symbols are hidden in shared objects and don't show up in symbol tables. This is important for these intrinsics being linked in every crate to ensure that we're not unnecessarily bloating the public ABI of Rust crates. This should help allow the compiler-builtins project with Rust-defined builtins start landing in-tree as well.	2016-11-12 10:46:15 -08:00
Srinivas Reddy Thatiparthy	9972d17ecf	run rustfmt on librustc_llvm folder	2016-10-22 18:37:35 +05:30
Michael Woerister	db4a9b3446	debuginfo: Remove some outdated stuff from LLVM DIBuilder binding.	2016-10-14 14:56:33 -04:00
Matt Ickstadt	b9a8c1a063	Fix incorrect LLVM Linkage enum The `Linkage` enum in librustc_llvm got out of sync with the version in LLVM and it caused two variants of the #[linkage=""] attribute to break. This adds the functions `LLVMRustGetLinkage` and `LLVMRustSetLinkage` which convert between the Rust Linkage enum and the LLVM one, which should stop this from breaking every time LLVM changes it. Fixes #33992	2016-09-04 16:12:01 -05:00
CensoredUsername	516519ee9a	Allow specification of the system V AMD64 ABI constraint. This can be specified using `extern sysV64 fn` on all platforms	2016-08-30 16:01:40 +02:00
Vadim Chugunov	cf6461168f	Fix debug line info for macro expansions. Macro expansions produce code tagged with debug locations that are completely different from the surrounding expressions. This wrecks havoc on debugger's ability the step over source lines. In order to have a good line stepping behavior in debugger, we overwrite debug locations of macro expansions with that of the outermost expansion site.	2016-08-25 00:40:42 -07:00
Cameron Hart	cbb88faad7	Merge branch 'master' into issue-30961	2016-08-06 15:50:48 +10:00
Ariel Ben-Yehuda	3041a97b1a	finish type-auditing rustllvm	2016-08-03 15:08:47 +03:00
Ariel Ben-Yehuda	24874170b4	split the FFI part of rustc_llvm to rustc_llvm::ffi	2016-08-03 15:08:47 +03:00

1 2

82 Commits