Mirror of https://github.com/rust-lang/rust.git (synced 2025-06-04 19:29:07 +00:00)
Copyedit FFI tutorial
parent 4b3be853af
commit cd6f24f9d1
@@ -2,17 +2,15 @@
 
 # Introduction
 
-One of Rust's aims, as a system programming language, is to
+Because Rust is a systems programming language, one of its goals is to
 interoperate well with C code.
 
-We'll start with an example. It's a bit bigger than usual, and
-contains a number of new concepts. We'll go over it one piece at a
-time.
-
-This is a program that uses OpenSSL's `SHA1` function to compute the
-hash of its first command-line argument, which it then converts to a
-hexadecimal string and prints to standard output. If you have the
-OpenSSL libraries installed, it should 'just work'.
+We'll start with an example, which is a bit bigger than usual. We'll
+go over it one piece at a time. This is a program that uses OpenSSL's
+`SHA1` function to compute the hash of its first command-line
+argument, which it then converts to a hexadecimal string and prints to
+standard output. If you have the OpenSSL libraries installed, it
+should compile and run without any extra effort.
 
 ~~~~ {.xfail-test}
 extern mod std;
@@ -32,7 +30,7 @@ fn sha1(data: ~str) -> ~str unsafe {
 let bytes = str::to_bytes(data);
 let hash = crypto::SHA1(vec::raw::to_ptr(bytes),
 vec::len(bytes) as c_uint, ptr::null());
-return as_hex(vec::raw::from_buf(hash, 20u));
+return as_hex(vec::raw::from_buf(hash, 20));
 }
 
 fn main(args: ~[~str]) {
@@ -42,26 +40,27 @@ fn main(args: ~[~str]) {
 
 # Foreign modules
 
-Before we can call `SHA1`, we have to declare it. That is what this
-part of the program is responsible for:
+Before we can call the `SHA1` function defined in the OpenSSL library, we have
+to declare it. That is what this part of the program does:
 
 ~~~~ {.xfail-test}
 extern mod crypto {
-fn SHA1(src: *u8, sz: uint, out: *u8) -> *u8;
-}
+fn SHA1(src: *u8, sz: uint, out: *u8) -> *u8; }
 ~~~~
 
-An `extern` module declaration containing function signatures introduces
-the functions listed as _foreign functions_, that are implemented in some
-other language (usually C) and accessed through Rust's foreign function
-interface (FFI). An extern module like this is called a foreign module, and
-implicitly tells the compiler to link with a library with the same name as
-the module, and that it will find the foreign functions in that library.
+An `extern` module declaration containing function signatures introduces the
+functions listed as _foreign functions_. Foreign functions differ from regular
+Rust functions in that they are implemented in some other language (usually C)
+and called through Rust's foreign function interface (FFI). An extern module
+like this is called a foreign module, and implicitly tells the compiler to
+link with a library that contains the listed foreign functions, and has the
+same name as the module.
 
-In this case, it'll change the name `crypto` to a shared library name
-in a platform-specific way (`libcrypto.so` on Linux, for example), and
-link that in. If you want the module to have a different name from the
-actual library, you can use the `"link_name"` attribute, like:
+In this case, the Rust compiler changes the name `crypto` to a shared library
+name in a platform-specific way (`libcrypto.so` on Linux, for example),
+searches for the shared library with that name, and links the library into the
+program. If you want the module to have a different name from the actual
+library, you can use the `"link_name"` attribute, like:
 
 ~~~~ {.xfail-test}
 #[link_name = "crypto"]
@@ -72,11 +71,11 @@ extern mod something {
 
 # Foreign calling conventions
 
-Most foreign code will be C code, which usually uses the `cdecl` calling
+Most foreign code is C code, which usually uses the `cdecl` calling
 convention, so that is what Rust uses by default when calling foreign
 functions. Some foreign functions, most notably the Windows API, use other
-calling conventions, so Rust provides a way to hint to the compiler which
-is expected by using the `"abi"` attribute:
+calling conventions. Rust provides the `"abi"` attribute as a way to hint to
+the compiler which calling convention to use:
 
 ~~~~
 #[cfg(target_os = "win32")]
@@ -86,14 +85,14 @@ extern mod kernel32 {
 }
 ~~~~
 
-The `"abi"` attribute applies to a foreign module (it can not be applied
+The `"abi"` attribute applies to a foreign module (it cannot be applied
 to a single function within a module), and must be either `"cdecl"`
-or `"stdcall"`. Other conventions may be defined in the future.
+or `"stdcall"`. We may extend the compiler in the future to support other
+calling conventions.
 
 # Unsafe pointers
 
-The foreign `SHA1` function is declared to take three arguments, and
-return a pointer.
+The foreign `SHA1` function takes three arguments, and returns a pointer.
 
 ~~~~ {.xfail-test}
 # extern mod crypto {
@@ -104,21 +103,20 @@ fn SHA1(src: *u8, sz: libc::c_uint, out: *u8) -> *u8;
 When declaring the argument types to a foreign function, the Rust
 compiler has no way to check whether your declaration is correct, so
 you have to be careful. If you get the number or types of the
-arguments wrong, you're likely to get a segmentation fault. Or,
+arguments wrong, you're likely to cause a segmentation fault. Or,
 probably even worse, your code will work on one platform, but break on
 another.
 
-In this case, `SHA1` is defined as taking two `unsigned char*`
-arguments and one `unsigned long`. The rust equivalents are `*u8`
+In this case, we declare that `SHA1` takes two `unsigned char*`
+arguments and one `unsigned long`. The Rust equivalents are `*u8`
 unsafe pointers and an `uint` (which, like `unsigned long`, is a
 machine-word-sized type).
 
-Unsafe pointers can be created through various functions in the
-standard lib, usually with `unsafe` somewhere in their name. You can
-dereference an unsafe pointer with `*` operator, but use
-caution—unlike Rust's other pointer types, unsafe pointers are
-completely unmanaged, so they might point at invalid memory, or be
-null pointers.
+The standard library provides various functions to create unsafe pointers,
+such as those in `core::cast`. Most of these functions have `unsafe` in their
+name. You can dereference an unsafe pointer with the `*` operator, but use
+caution: unlike Rust's other pointer types, unsafe pointers are completely
+unmanaged, so they might point at invalid memory, or be null pointers.
 
 # Unsafe blocks
 
@@ -134,12 +132,12 @@ fn sha1(data: ~str) -> ~str {
 let bytes = str::to_bytes(data);
 let hash = crypto::SHA1(vec::raw::to_ptr(bytes),
 vec::len(bytes), ptr::null());
-return as_hex(vec::raw::from_buf(hash, 20u));
+return as_hex(vec::raw::from_buf(hash, 20));
 }
 }
 ~~~~
 
-Firstly, what does the `unsafe` keyword at the top of the function
+First, what does the `unsafe` keyword at the top of the function
 mean? `unsafe` is a block modifier—it declares the block following it
 to be known to be unsafe.
 
@@ -158,8 +156,8 @@ advertise it to the world. An unsafe function is written like this:
 unsafe fn kaboom() { ~"I'm harmless!"; }
 ~~~~
 
-This function can only be called from an unsafe block or another
-unsafe function.
+This function can only be called from an `unsafe` block or another
+`unsafe` function.
 
 # Pointer fiddling
 
@@ -179,35 +177,36 @@ Let's look at our `sha1` function again.
 let bytes = str::to_bytes(data);
 let hash = crypto::SHA1(vec::raw::to_ptr(bytes),
 vec::len(bytes), ptr::null());
-return as_hex(vec::raw::from_buf(hash, 20u));
+return as_hex(vec::raw::from_buf(hash, 20));
 # }
 # }
 ~~~~
 
-The `str::to_bytes` function is perfectly safe: it converts a string to
-a `[u8]`. This byte array is then fed to `vec::raw::to_ptr`, which
+The `str::to_bytes` function is perfectly safe: it converts a string to a
+`~[u8]`. The program then feeds this byte array to `vec::raw::to_ptr`, which
 returns an unsafe pointer to its contents.
 
-This pointer will become invalid as soon as the vector it points into
-is cleaned up, so you should be very careful how you use it. In this
-case, the local variable `bytes` outlives the pointer, so we're good.
+This pointer will become invalid at the end of the scope in which the vector
+it points to (`bytes`) is valid, so you should be very careful how you use
+it. In this case, the local variable `bytes` outlives the pointer, so we're
+good.
 
 Passing a null pointer as the third argument to `SHA1` makes it use a
 static buffer, and thus save us the effort of allocating memory
-ourselves. `ptr::null` is a generic function that will return an
-unsafe null pointer of the correct type (Rust generics are awesome
-like that—they can take the right form depending on the type that they
-are expected to return).
+ourselves. `ptr::null` is a generic function that, in this case, returns an
+unsafe null pointer of type `*u8`. (Rust generics are awesome
+like that: they can take the right form depending on the type that they
+are expected to return.)
 
-Finally, `vec::raw::from_buf` builds up a new `[u8]` from the
-unsafe pointer that was returned by `SHA1`. SHA1 digests are always
-twenty bytes long, so we can pass `20u` for the length of the new
+Finally, `vec::raw::from_buf` builds up a new `~[u8]` from the
+unsafe pointer that `SHA1` returned. SHA1 digests are always
+twenty bytes long, so we can pass `20` for the length of the new
 vector.
 
 # Passing structures
 
 C functions often take pointers to structs as arguments. Since Rust
-structs are binary-compatible with C structs, Rust programs can call
+`struct`s are binary-compatible with C structs, Rust programs can call
 such functions directly.
 
 This program uses the POSIX function `gettimeofday` to get a
@@ -241,12 +240,12 @@ fn unix_time_in_microseconds() -> u64 unsafe {
 The `#[nolink]` attribute indicates that there's no foreign library to
 link in. The standard C library is already linked with Rust programs.
 
-A `timeval`, in C, is a struct with two 32-bit integers. Thus, we
-define a struct type with the same contents, and declare
-`gettimeofday` to take a pointer to such a struct.
+In C, a `timeval` is a struct with two 32-bit integer fields. Thus, we
+define a `struct` type with the same contents, and declare
+`gettimeofday` to take a pointer to such a `struct`.
 
-The second argument to `gettimeofday` (the time zone) is not used by
-this program, so it simply declares it to be a pointer to the nil
-type. Since all null pointers have the same representation regardless of
-their referent type, this is safe.
+This program does not use the second argument to `gettimeofday` (the time
+zone), so the `extern mod` declaration for it simply declares this argument
+to be a pointer to the unit type (written `()`). Since all null pointers have
+the same representation regardless of their referent type, this is safe.
 
|
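For readers on a current toolchain, the `gettimeofday` walkthrough in the last hunk can be approximated in present-day Rust. This is a minimal sketch, not the tutorial's code. It assumes a 64-bit Linux target, where both `timeval` fields are C `long`s rather than the 32-bit integers the tutorial text describes. The `Timeval` struct and the hand-written `extern "C"` declaration are illustrative names, and the 2024 edition would additionally require writing `unsafe extern "C"`.

```rust
use std::os::raw::{c_int, c_long, c_void};
use std::ptr;

// Assumption: layout of C's `struct timeval` on 64-bit Linux (two `long`s).
// `#[repr(C)]` asks for C-compatible field order and padding.
#[repr(C)]
struct Timeval {
    tv_sec: c_long,
    tv_usec: c_long,
}

extern "C" {
    // Declared by hand: the compiler cannot check this against the C header,
    // so a wrong field type or argument count is undefined behavior.
    fn gettimeofday(tv: *mut Timeval, tz: *mut c_void) -> c_int;
}

fn unix_time_in_microseconds() -> u64 {
    let mut tv = Timeval { tv_sec: 0, tv_usec: 0 };
    // The time zone argument is unused, so a null pointer is passed,
    // mirroring the tutorial's version.
    let rc = unsafe { gettimeofday(&mut tv, ptr::null_mut()) };
    assert_eq!(rc, 0, "gettimeofday failed");
    (tv.tv_sec as u64) * 1_000_000 + (tv.tv_usec as u64)
}
```

Calling `unix_time_in_microseconds()` from `main` and printing the result should give the current Unix time in microseconds. On other platforms the field types may differ, which is the kind of mismatch the tutorial's warning about unchecked declarations refers to.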