Rust is becoming more widespread among developers who want to create fast and safe software. Apriorit works with the Rust programming language, and our experience is the basis for this tutorial. This article is the last part of our Rust programming language tutorial, which is useful for anyone who wants to get familiar with the basics of Rust.
In the first part of our tutorial, we describe such features as zero-cost abstractions and move semantics. The second part is dedicated to Rust features that guarantee memory safety, and the third part covers such features as threads without data races and trait-based generics.
In this last part, we'll tell you about pattern matching, automatic deduction of types through type inference, and the minimal runtime of the Rust language. We'll also explain how you can easily call C from Rust.
Pattern Matching
Similar to C++, Rust has enumeration types:
enum Month {
January, February, March, April, May, June, July,
August, September, October, November, December,
}
It also has a multiple-choice construction to operate on them:
match month {
    Month::December | Month::January | Month::February
        => println!("It's winter!"),
    Month::March | Month::April | Month::May
        => println!("It's spring!"),
    Month::June | Month::July | Month::August
        => println!("It's summer!"),
    Month::September | Month::October | Month::November
        => println!("It's autumn!"),
}
However, match has more features than a simple switch. The most crucial difference is that matching must be exhaustive: the match clause must handle all possible values of the expression being matched. This eliminates a typical error in which switch statements break when an enumeration is extended later with new values. Of course, there's also a default catch-all option that matches any value:
match number {
    // `..=` denotes an inclusive range
    0..=9 => println!("small number"),
    10..=100 if number % 2 == 0 => {
        println!("big even number");
    }
    _ => println!("some other number"),
}
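To make the exhaustiveness rule concrete, here is a minimal sketch that reuses the Month enumeration from above. The season() function name is an assumption made for illustration. The match has no catch-all arm, so the compiler itself checks that every variant is covered.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Month {
    January, February, March, April, May, June, July,
    August, September, October, November, December,
}

/// Hypothetical helper: maps a month to the season used in the text above.
fn season(month: Month) -> &'static str {
    // There is no `_` arm on purpose: if a new variant is ever added to
    // `Month`, this match stops compiling until the new case is handled.
    match month {
        Month::December | Month::January | Month::February => "winter",
        Month::March | Month::April | Month::May => "spring",
        Month::June | Month::July | Month::August => "summer",
        Month::September | Month::October | Month::November => "autumn",
    }
}
```

Deleting any one of the four arms turns this into a compile-time error rather than a silent fall-through at run time. Note also that match is an expression here: its value is returned directly from the function.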
Another important feature of Rust enumerations is that they can carry values, implementing discriminated unions safely.
enum Color {
Red, Green, Blue,
RGB(u32, u32, u32),
CMYK(u32, u32, u32, u32),
}
Pattern matching can be used to match against possible options and extract values stored in a union:
match some_color {
    Color::Red => println!("Pure red"),
    Color::Green => println!("Pure green"),
    Color::Blue => println!("Pure blue"),
    Color::RGB(red, green, blue) => {
        println!("Red: {}, green: {}, blue: {}", red, green, blue);
    }
    Color::CMYK(cyan, magenta, yellow, black) => {
        println!("Cyan: {}, magenta: {}, yellow: {}, black: {}",
                 cyan, magenta, yellow, black);
    }
}
Unlike C and C++ unions, Rust makes it impossible to choose an incorrect branch when unpacking a union.
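The following sketch shows why an incorrect branch can't be chosen: the stored values are only reachable through the pattern of the variant that actually holds them. The describe() function is a hypothetical helper, not part of any library.

```rust
enum Color {
    Red, Green, Blue,
    RGB(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

/// Hypothetical helper: builds a textual description of a color.
fn describe(color: &Color) -> String {
    match color {
        Color::Red => String::from("pure red"),
        Color::Green => String::from("pure green"),
        Color::Blue => String::from("pure blue"),
        // The three components exist only inside this arm; there is no way
        // to read them from a value that is actually `CMYK` or `Red`.
        Color::RGB(r, g, b) => format!("rgb({}, {}, {})", r, g, b),
        Color::CMYK(c, m, y, k) => format!("cmyk({}, {}, {}, {})", c, m, y, k),
    }
}
```

For example, describe(&Color::RGB(255, 128, 0)) yields the string "rgb(255, 128, 0)".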
Type Inference
Rust uses a static type system, which means that types of variables, function arguments, structure fields, and so on must be known at compile time; the compiler will check that correct types are used everywhere. However, Rust also uses type inference, which allows the compiler to automatically deduce types based on how variables are used.
This is very convenient because you no longer need to explicitly state types, which in some cases may be cumbersome (or impossible) to write. The auto keyword in C++ serves the same purpose:
std::vector<std::map<std::string, std::vector<Object>>> some_map;
// Iterator types can easily become a mess:
for (const auto &it : some_map)
{
/* ... */
}
// Lambda functions can only be used with auto;
// their exact type cannot be expressed in C++
auto compare_by_cost = [](const Foo &lhs, const Foo &rhs) { return lhs.cost < rhs.cost; };
However, Rust also considers future uses of a variable to deduce its type (not only the initializer), allowing programmers to write code like this:
let v = 10;               // v's type is some integer (based on the constant),
                          // but the exact type (i32, u8, etc.) is not yet known
let mut vec = Vec::new(); // vec's type is some Vec<T>, where T may be anything
vec.push(v);              // after this line, the compiler knows that T == v's type
let s = v + vec.len();    // vec.len() returns `usize`, so this must be the type
                          // of v (as another addend) and s (as a sum), and vec
                          // is now also known to have type Vec<usize>
println!("{}: {:?}", s, vec); // prints 11: [10]
Rust's type inference is based on the widely known and thoroughly researched Hindley-Milner algorithm, which is most commonly used in functional programming languages. This algorithm can handle global type inference (inferring all types in an entire program, even the types of function arguments, return values, structure fields, etc.). But global type inference can be slow in large projects, and it allows unrelated changes elsewhere in the code base to alter the inferred types. Thus, Rust uses inference only inside function bodies, mainly for local variables. You must explicitly write types for function arguments, return values, and structure fields. This strikes a good balance between expressiveness, speed, and robustness. Types also make good documentation for functions, methods, and structures.
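As a small illustration of this split, the sketch below spells out the types in the function signature and leaves every local variable to inference. The total_length() function is a hypothetical example.

```rust
/// The signature is fully annotated: argument and return types are mandatory.
fn total_length(words: &[&str]) -> usize {
    // No annotation here: the element type of `lengths` is inferred as
    // `usize` from the `push` call below.
    let mut lengths = Vec::new();
    for word in words {
        lengths.push(word.len());
    }
    // The result type of `sum()` is inferred from the declared return type.
    lengths.iter().sum()
}
```

Calling total_length(&["Rust", "is", "fast"]) returns 10. If the return type were changed to another integer type, the compiler would report a mismatch at the sum() call instead of silently changing the meaning of the callers.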
Minimal Runtime
A runtime is the language support library that's embedded into every program and provides essential features of the programming language. The Java Virtual Machine can be thought of as the runtime of the Java language, for example, as it provides features like class loading and garbage collection. The size and complexity of the runtime contribute significantly to start-up and run-time overhead. For example, the JVM requires a non-negligible amount of time to load classes, warm up the JIT compiler, collect garbage, and so on.
Rust doesn't have a garbage collector, a virtual machine, a bytecode interpreter, or a lightweight thread scheduler running in the background. The code you write is compiled directly to native machine code that's executed by the CPU. Some parts of the Rust standard library can be considered the "runtime," providing support for heap allocation, backtraces, stack unwinding, and stack overflow guards. The standard library also has a small amount of global initialization code, similar to the initialization code of a C runtime library that sets up the stack, calls global constructors, and so on before control is transferred to the main() function. (You can compile Rust programs without the standard library if you don't need it, thus avoiding this overhead.)
In short, Rust can be used for really low-level work like bare-metal programming, device drivers, and operating system kernels:
- Rust Embedded (https://github.com/rust-embedded)
- rust.ko (https://github.com/tsgates/rust.ko)
- Windows KMD (https://github.com/pravic/winapi-kmd-rs)
- Redox OS (https://www.redox-os.org/)
Furthermore, the absence of a complex runtime simplifies embedding Rust modules into programs written in other languages. For example, you can easily write JNI code for Java or extensions for dynamic languages like Python, Ruby, or Lua.
Efficient C Bindings
There's more than one programming language in the world, so it's not surprising that you might want to use libraries written in languages other than Rust. Conventionally, libraries provide a C API because C is a ubiquitous language, the common denominator of programming languages. Rust is able to easily communicate with C APIs, without any overhead, and use its ownership system to provide significantly stronger safety guarantees for them.
Calling C from Rust
Let's look at a simple example. Consider the following C library for adding numbers (here we take it easy and use a regular for loop, but we could do something clever with AVX instructions):
#include <stddef.h> /* size_t */
/**
 * Sum some numbers.
 *
 * @param numbers [in] pointer to the numbers to be summed
 *                     must not be NULL and must point to at least
 *                     `count` elements
 * @param count   [in] number of numbers to be summed
 *
 * @returns sum of the provided numbers.
 */
int sum_numbers(const int *numbers, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
    {
        sum += numbers[i];
    }
    return sum;
}
Note that some parts of this function API are described formally by the argument types, but some things are only specified in the documentation. For example, only the documentation tells us that we can't pass NULL for the numbers argument and that there must be at least count numbers available. And only the common sense of a C programmer tells us that the function won't call free() for the numbers array.
Here's how we can call this function from Rust:
extern crate libc;
// Prototype of the C function, spelled out with C-compatible types
extern "C" {
    fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
        -> libc::c_int;
}
fn main() {
    let array = [1, 2, 3, 4, 5];
    let sum = unsafe { sum_numbers(array.as_ptr(), array.len()) };
    println!("Sum: {}", sum); // ===> prints "Sum: 15"
}
As you can see, there's no syntactical overhead in calling an external function written in C (other than spelling out the prototype of the function). It's just like calling a native Rust function. If you look at the generated assembly code, you can see that this function call has no runtime overhead either:
leaq 32(%rsp), %rdi
movl $5, %esi
callq sum_numbers@PLT
movl %eax, 12(%rsp)
There's no hidden boxing and unboxing, re-allocating of the array, obligatory safety checks, or other things. We see exactly the same machine code that a C compiler would have generated for the same library function call.
The Libc Crate and Unsafe Blocks
However, there are some details in the above code that require further explanation, first of all the libc crate. This is a wrapper library that provides types and functions of the C standard library to Rust. Here you can find all the usual types, constants, and functions:
- libc::c_uint (unsigned int type)
- libc::stat (struct stat structure)
- libc::pthread_mutex_t (pthread_mutex_t typedef)
- libc::open (open(2) system call)
- libc::reboot (reboot(2) system call)
- libc::EINVAL (EINVAL constant)
- libc::SIGSEGV (SIGSEGV constant)
- and many more, depending on the platform you compile on
Not only can you use "normal" C libraries via the Rust Foreign Function Interface; you can also readily use the system API via the libc crate.
Another catch lies in the unsafe block:
let sum = unsafe { sum_numbers(array.as_ptr(), array.len()) };
As the sum_numbers() function is external, it doesn't automatically provide the degree of safety provided by native Rust functions. For example, Rust will allow you to pass a NULL pointer as the first argument, and this will cause undefined behavior (just as it would in C). The function call isn't safe, so it must be wrapped in an unsafe block, which effectively says "Compiler, you have my word that this function call is safe. I have verified that the arguments are okay, that the function won't compromise Rust safety guarantees, and that it won't cause undefined behavior."
Just as in C, the programmer is ultimately responsible for guaranteeing that the program doesn't cause undefined behavior. The difference here is that with C you must manually do this at all times, in all parts of the code, for every library you use. In Rust, on the other hand, you must manually verify safety only inside unsafe blocks. All other Rust code (outside unsafe blocks) is automatically safe, as routinely verified by the Rust compiler.
Herein lies the power of Rust: you can provide safe wrappers for unsafe code and thus avoid tedious, manual safety verifications in the consumer code. For example, the sum_numbers function can be wrapped like this:
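// The wrapper needs its own name (`sum_slice` is chosen here for
// illustration) so that it doesn't clash with the extern declaration.
fn sum_slice(numbers: &[libc::c_int]) -> libc::c_int {
    // This is safe because Rust slices are always non-NULL
    // and are guaranteed to be long enough
    unsafe { sum_numbers(numbers.as_ptr(), numbers.len()) }
}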
Now the external function has a safe interface. It can be readily used by idiomatic Rust code without unsafe blocks. Callers of the function don't need to be aware of the actual safety requirements of its native C implementation. And it's still as fast as the original!
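The wrapper pattern can also be tried out without linking a C library. In the sketch below, the foreign function is replaced by a stand-in written in Rust that has the same raw-pointer signature and the same contract; this stand-in and the wrapper name sum_slice() are assumptions made for illustration only.

```rust
use std::os::raw::c_int;
use std::slice;

/// Stand-in for the C library function, written in Rust so that this sketch
/// builds on its own. Same contract as the C version: `numbers` must be
/// non-NULL and must point to at least `count` elements.
unsafe extern "C" fn sum_numbers(numbers: *const c_int, count: usize) -> c_int {
    let values = unsafe { slice::from_raw_parts(numbers, count) };
    values.iter().sum()
}

/// Safe wrapper: the slice type encodes the requirements that the raw
/// function only states in its documentation.
fn sum_slice(numbers: &[c_int]) -> c_int {
    // A slice is never NULL, and `len()` is exactly the number of valid
    // elements, so the contract of `sum_numbers` is always upheld here.
    unsafe { sum_numbers(numbers.as_ptr(), numbers.len()) }
}
```

Code that calls sum_slice(&[1, 2, 3, 4, 5]) contains no unsafe block at all; the single unsafe call is reviewed once, inside the wrapper.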
Beyond Primitive Types
Aside from primitive types like libc::c_int and pointers, Rust can use other C types as well.
Rust structs can be made compatible with C structs via a #[repr] annotation:
// Rust
#[repr(C)]
struct UUID {
    time_low: u32,
    time_mid: u16,
    time_high: u16,
    sequence: u16,
    node: [u8; 6],
}
/* C */
struct UUID
{
    uint32_t time_low;
    uint16_t time_mid;
    uint16_t time_high;
    uint16_t sequence;
    uint8_t node[6];
};
Such structures can be passed by value or by pointer to C code, as they'll have the same memory layout as their C counterparts used by a C compiler. (Obviously, the fields can only have types that C can understand.)
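The layout of such a structure can be checked from Rust itself. This is a minimal sketch; the uuid_layout() helper is hypothetical, and the expected numbers assume a mainstream platform where u32 is aligned to 4 bytes.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Uuid {
    time_low: u32,   // offset 0, 4 bytes
    time_mid: u16,   // offset 4, 2 bytes
    time_high: u16,  // offset 6, 2 bytes
    sequence: u16,   // offset 8, 2 bytes
    node: [u8; 6],   // offset 10, 6 bytes
}

/// Hypothetical helper: returns (size, alignment) of `Uuid` in bytes.
fn uuid_layout() -> (usize, usize) {
    (size_of::<Uuid>(), align_of::<Uuid>())
}
```

With #[repr(C)], the fields are laid out in declaration order, which gives 16 bytes in total with 4-byte alignment here, the same as sizeof(struct UUID) in C. Without the attribute, the Rust compiler is free to reorder fields.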
C unions can also be directly represented in Rust:
// Rust
#[repr(C)]
union TypePun {
    f: f32,
    i: i32,
}
/* C */
union TypePun
{
    float f;
    int i;
};
As in C, unions in Rust are untagged. That is, they don't store the runtime type of the value inside them. The programmer is responsible for accessing union fields correctly. The compiler can't check this automatically, so reading a union field requires an explicit unsafe block. Initializing a union or assigning to one of its plain-data fields doesn't, because a write alone can't produce an invalid value.
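Here is a minimal sketch of accessing such a union from Rust. The float_bits() function is a hypothetical helper; in everyday code, f32::to_bits() from the standard library does the same job without a union.

```rust
#[repr(C)]
union TypePun {
    f: f32,
    i: i32,
}

/// Hypothetical helper: reinterprets the bits of a float as an integer.
fn float_bits(value: f32) -> i32 {
    // Initializing the union and assigning to a plain-data field are safe.
    let mut pun = TypePun { i: 0 };
    pun.f = value;
    // Reading is unsafe: the compiler can't know which field was written
    // last, so the programmer vouches that this read is valid.
    unsafe { pun.i }
}
```

For example, float_bits(1.0) returns 0x3f80_0000, the IEEE 754 single-precision encoding of 1.0.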
Simple enumerations are also compatible with C:
// Rust
#[repr(C)]
enum Options {
    ONE = 0,
    TWO,
    THREE,
}
/* C */
enum Options
{
    ONE,
    TWO,
    THREE,
};
However, you can't use advanced features of Rust enum types when calling C code. For instance, you can't directly pass arbitrary Option<T> or Result<T, E> values to C.
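A common way around this limitation is to translate a Result into a C-style status code plus an out-parameter at the boundary. The sketch below shows one possible shape; the parse_int() function, its signature, and the 0 / -1 status convention are assumptions made for illustration, not an established API.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Hypothetical C-facing function: parses a NUL-terminated decimal string.
/// Returns 0 and stores the value in `*out` on success, returns -1 on failure.
/// The caller must pass either NULL or valid pointers.
pub unsafe extern "C" fn parse_int(text: *const c_char, out: *mut c_int) -> c_int {
    if text.is_null() || out.is_null() {
        return -1;
    }
    // Inside Rust we work with an ordinary Result...
    let parsed: Result<c_int, ()> = unsafe { CStr::from_ptr(text) }
        .to_str()
        .map_err(|_| ())
        .and_then(|s| s.trim().parse::<c_int>().map_err(|_| ()));
    // ...and flatten it into plain C types only at the boundary.
    match parsed {
        Ok(value) => {
            unsafe { *out = value };
            0
        }
        Err(()) => -1,
    }
}
```

On the C side this would look like an ordinary function returning int, so the caller never sees a Rust enum.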
Rust functions can be converted into C function pointers given that the argument types are actually compatible and the C ABI is used:
use std::{mem, ptr};
fn launch_native_thread() {
    let name = "Ferris";
    // We're going to launch a native thread via pthread_create() from libc.
    // This is an external function, so calling it is unsafe in Rust (think
    // about exception boundaries, for example).
    unsafe {
        let mut thread: libc::pthread_t = mem::zeroed();
        libc::pthread_create(&mut thread,  // out-argument for pthread_t
            ptr::null(),                   // in-argument of pthread_attr_t
            thread_body,                   // thread body (as a C callback)
            &name as *const &str as *mut libc::c_void, // thread argument (requires a cast)
        );
        // `name` must outlive the thread, so we join before returning.
        libc::pthread_join(thread, ptr::null_mut());
    }
}
// Here's our thread body with C ABI written in Rust
extern "C" fn thread_body(arg: *mut libc::c_void) -> *mut libc::c_void {
    // We need to cast the argument back to the original reference to &str.
    // This is unsafe (from the Rust compiler's point of view), but we know
    // what kind of data we have put into this void*
    let name: &&str = unsafe { &*(arg as *const &str) };
    println!("Hello {} from Rust thread!", name);
    ptr::null_mut()
}
Calling Rust from C
Native Rust functions and types can be made available to C code just as easily as you can call C from Rust. Let's reverse the example with the sum_numbers() function and implement it in Rust instead:
use std::slice;
#[no_mangle]
pub extern "C" fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
    -> libc::c_int
{
    // Convert the C pointer-to-array into a native Rust slice of an array.
    // This is not safe per se because the "numbers" pointer may be NULL
    // and the "count" value may not match the actual array length.
    //
    // As with C, we'll require the caller of this function to ensure
    // that these safety requirements are observed and will not check
    // them explicitly here.
    let rust_slice = unsafe { slice::from_raw_parts(numbers, count) };
    // Rust iterators already have a handy method for summing their
    // elements. Let's use it here.
    rust_slice.iter().sum()
}
And that's it. The #[no_mangle] attribute prevents symbol mangling (so that the function is exported with the exact name "sum_numbers"). The extern "C" specifier states that the function should use the C ABI instead of the native Rust ABI. With this, any C program can link to a library written in Rust and can easily use our function:
#include <stddef.h>
#include <stdio.h>
// Declare the function prototype for C
int sum_numbers(const int *numbers, size_t count);
int main(void)
{
    int numbers[] = { 1, 2, 3, 4, 5 };
    int sum = sum_numbers(numbers, 5);
    printf("Sum is %d\n", sum);
    return 0;
}
Calling a Rust library in C is as easy as calling a native C library. There are no required conversions, no Rust VM context needs to be initialized and passed as an additional argument, and there's no overhead aside from the regular function call.
Conclusion
As you can see, Rust is designed to provide better safety, concurrency, and speed than many other popular languages. This is achieved through the absence of garbage collection and runtime overhead, freedom from data races, efficient bindings to other languages, and other useful features.
In our next article dedicated to the Rust language, we'll compare Rust with another popular programming language: C++.
This Rust programming tutorial is based on the experience of our Apriorit software development team. We would be glad to assist you with Rust development services. Get in touch with us!