Rust is becoming more widespread among developers who want to create fast and safe software. Apriorit works with the Rust programming language, and our experience is the basis for this tutorial. This article is the last part of our Rust programming language tutorial, which is useful for anyone who wants to get familiar with the basics of Rust.
In the first part of our tutorial, we describe such features as zero-cost abstractions and move semantics. The second part is dedicated to Rust features that guarantee memory safety, and the third part covers such features as threads without data races and trait-based generics.
In this last part, we’ll tell you about pattern matching, automatically deducing types using type inference, and ensuring minimal runtime in the Rust language. We’ll also explain how you can easily call C from Rust.
Similar to C++, Rust has enumeration types:
It also has a multiple-choice construction to operate on them:
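The original listings are not reproduced here, so the following is only a minimal sketch of both constructs. The `Direction` enumeration and the `turn_right` function are hypothetical names chosen for illustration.

```rust
// Hypothetical enumeration, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    North,
    East,
    South,
    West,
}

// `match` selects one arm depending on the enumeration value.
fn turn_right(direction: Direction) -> Direction {
    match direction {
        Direction::North => Direction::East,
        Direction::East => Direction::South,
        Direction::South => Direction::West,
        Direction::West => Direction::North,
    }
}

fn main() {
    println!("{:?}", turn_right(Direction::North));
}
```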
However, match has more features than a simple switch. The most crucial difference is that matching must be exhaustive: the match clause must handle all possible values of the expression being matched. This eliminates a typical error in which switch statements break when an enumeration is later extended with new values. Of course, there’s also a default catch-all option that matches any value:
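A catch-all arm can be sketched as follows. The `Weekday` enumeration and the `is_weekend` function are hypothetical and serve only to show the `_` pattern.

```rust
// Hypothetical enumeration, used only for illustration.
#[derive(Debug, Clone, Copy)]
enum Weekday {
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday,
}

fn is_weekend(day: Weekday) -> bool {
    match day {
        // Several values can share one arm.
        Weekday::Saturday | Weekday::Sunday => true,
        // `_` matches every value not handled above.
        _ => false,
    }
}

fn main() {
    println!("{:?}: {}", Weekday::Monday, is_weekend(Weekday::Monday));
    println!("{:?}: {}", Weekday::Sunday, is_weekend(Weekday::Sunday));
}
```

Note that a catch-all arm gives up part of the exhaustiveness check: a variant added later falls silently into `_` instead of producing a compile error.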
Another important feature of Rust enumerations is that they can carry values, implementing discriminated unions safely.
Pattern matching can be used to match against possible options and extract values stored in a union:
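As a rough sketch of an enumeration that carries values, consider the hypothetical `Shape` type below. Each arm of the `match` both checks which variant is stored and binds the fields of that variant only.

```rust
// Hypothetical discriminated union: every variant carries its own data.
#[derive(Debug)]
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Point,
}

fn area(shape: &Shape) -> f64 {
    match shape {
        // `radius` is only accessible in the arm where it actually exists.
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
        Shape::Point => 0.0,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
        Shape::Point,
    ];
    for shape in &shapes {
        println!("{:?} has area {}", shape, area(shape));
    }
}
```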
Unlike C and C++ unions, Rust makes it impossible to choose an incorrect branch when unpacking a union.
Rust uses a static type system, which means that types of variables, function arguments, structure fields, and so on must be known at compile time; the compiler will check that correct types are used everywhere. However, Rust also uses type inference, which allows the compiler to automatically deduce types based on how variables are used.
This is very convenient because you no longer need to explicitly state types, which in some cases may be cumbersome (or impossible) to write. The auto keyword in C++ serves the same purpose:
However, Rust also considers future uses of a variable to deduce its type – not only the initializer – allowing programmers to write code like this:
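A minimal sketch of inference from later use, with a hypothetical `total_letters` function: the vector is created without any element type, and the compiler settles on one only when it sees what is pushed into it.

```rust
/// Sums the lengths of all whitespace-separated words in `text`.
fn total_letters(text: &str) -> usize {
    // No element type is written here; at this point it is still unknown.
    let mut lengths = Vec::new();
    for word in text.split_whitespace() {
        // `len()` returns `usize`, so this later use fixes the type as `Vec<usize>`.
        lengths.push(word.len());
    }
    lengths.iter().sum()
}

fn main() {
    // Deduced from the initializer alone, much like `auto` in C++.
    let phrase = "fast and safe";
    println!("{}", total_letters(phrase));
}
```

The function signature itself still spells out its argument and return types.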
Rust’s inference is based on the widely known and thoroughly researched Hindley-Milner algorithm, which is most commonly used in functional programming languages. The algorithm can handle global type inference (inferring all types in an entire program, even the types of function arguments, return values, structure fields, etc.). But global type inference can be slow in large projects, and it lets unrelated changes elsewhere in the code base alter inferred types. Thus, Rust uses inference only for local variables. You must explicitly write types for function arguments, return values, and structure fields. This strikes a good balance between expressibility, speed, and robustness. Types also make good documentation for functions, methods, and structures.
A runtime is the language support library that’s embedded into every program and provides essential features of the programming language. The Java Virtual Machine can be thought of as the runtime of the Java language, for example, as it provides features like class loading and garbage collection. The size and complexity of the runtime contribute significantly to start-up and run-time overhead. For example, the JVM requires a non-negligible amount of time to load classes, warm up the JIT compiler, collect garbage, and so on.
Rust doesn’t have any garbage collection, virtual machine bytecode interpreter, or lightweight thread scheduler running in the background. Your code is compiled directly to native machine code, with no interpreter between it and the CPU. Some parts of the Rust standard library can be considered the “runtime,” providing support for heap allocation, backtraces, stack unwinding, and stack overflow guards. The standard library also has a small amount of global initialization code, similar to the initialization code of a C runtime library that sets up the stack, calls global constructors, and so on before control is transferred to the main() function. (You can compile Rust programs without the standard library if you don’t need it, thus avoiding this overhead.)
In short, Rust can be used for really low-level work like bare-metal programming, device drivers, and operating system kernels:
- Rust Embedded (https://github.com/rust-embedded)
- rust.ko (https://github.com/tsgates/rust.ko)
- Windows KMD (https://github.com/pravic/winapi-kmd-rs)
- Redox OS (https://www.redox-os.org/)
Furthermore, the absence of a complex runtime simplifies embedding Rust modules into programs written in other languages. For example, you can easily write JNI code for Java or extensions for dynamic languages like Python, Ruby, or Lua.
There’s more than one programming language in the world, so it’s not surprising that you might want to use libraries written in languages other than Rust. Conventionally, libraries provide a C API because C is a ubiquitous language, the common denominator of programming languages. Rust is able to easily communicate with C APIs, without any overhead, and use its ownership system to provide significantly stronger safety guarantees for them.
Let’s look at a simple example. Consider the following C library for adding numbers (here we take it easy and use a regular for loop, but we could do something clever with AVX instructions):
Note that some parts of this function API are described formally by the argument types, but some things are only specified in the documentation. For example, we can only infer that we can’t pass NULL for the numbers argument and that there must be at least count numbers available. And only the common sense of a C programmer tells us that the function won’t call free() for the numbers array.
Here’s how we can call this function from Rust:
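The sum_numbers library itself isn’t available here, so the sketch below shows the same mechanism on two functions from the C standard library, `abs` and `strlen`, which ordinary Rust programs on mainstream hosted platforms already link against. Two assumptions to keep in mind: the C types come from `std::os::raw` rather than the libc crate to keep the example dependency-free, and the block is written as `unsafe extern "C"`, a spelling that needs Rust 1.82 or newer (older toolchains write plain `extern "C"`).

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

// Hand-written prototypes of two C standard library functions.
// (The libc crate provides such declarations ready-made.)
unsafe extern "C" {
    fn abs(value: c_int) -> c_int;
    fn strlen(text: *const c_char) -> usize;
}

fn main() {
    // C expects a NUL-terminated string; CString provides one.
    let text = CString::new("hello").expect("no interior NUL bytes");

    // Foreign calls are not checked by the compiler, hence the unsafe block.
    let magnitude = unsafe { abs(-42) };
    let length = unsafe { strlen(text.as_ptr()) };

    println!("abs = {}, strlen = {}", magnitude, length);
}
```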
As you can see, there’s no syntactical overhead in calling an external function written in C (other than spelling out the prototype of the function). It’s just like calling a native Rust function. If you look at the generated assembly code, you can see that this function call has no runtime overhead either:
There’s no hidden boxing and unboxing, re-allocating of the array, obligatory safety checks, or other things. We see exactly the same machine code that a C compiler would have generated for the same library function call.
However, there are some details in the above code that require further explanation – first of all, the libc crate. This is a wrapper library that provides types and functions of the C standard library to Rust. Here you can find all the usual types, constants, and functions:
- libc::c_uint (unsigned int type)
- libc::stat (struct stat structure)
- libc::pthread_mutex_t (pthread_mutex_t typedef)
- libc::open (open(2) system call)
- libc::reboot (reboot(2) system call)
- libc::EINVAL (EINVAL constant)
- libc::SIGSEGV (SIGSEGV constant)
- and many more, depending on the platform you compile on
Not only can you use “normal” C libraries via the Rust Foreign Function Interface – you can also readily use the system API via the libc crate.
Another catch lies in the unsafe block:
As the sum_numbers() function is external, it doesn’t automatically provide the degree of safety provided by native Rust functions. For example, Rust will allow you to pass a NULL pointer as the first argument, and this will cause undefined behavior (just as it would in C). The function call isn’t safe, so it must be wrapped in an unsafe block, which effectively says “Compiler, you have my word that this function call is safe. I have verified that the arguments are okay, that the function won’t compromise Rust safety guarantees, and that it won’t cause undefined behavior.”
Just as in C, the programmer is ultimately responsible for guaranteeing that the program doesn’t cause undefined behavior. The difference here is that with C you must manually do this at all times, in all parts of the code, for every library you use. On the other hand, in Rust you must manually verify safety only inside unsafe blocks. All other Rust code (outside unsafe blocks) is automatically safe, as routinely verified by the Rust compiler.
Herein lies the power of Rust: you can provide safe wrappers for unsafe code and thus avoid tedious, manual safety verifications in the consumer code. For example, the sum_numbers function can be wrapped like this:
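One way such a wrapper might look is sketched below. To keep the example buildable without a C toolchain, `sum_numbers` is a stand-in written in Rust with the C ABI; its signature is an assumption based on the description above, and `sum_numbers_safe` is a hypothetical name.

```rust
use std::convert::TryFrom;
use std::os::raw::{c_int, c_uint};

// Stand-in for the C function so that this sketch links on its own.
// The signature is assumed; a real project would declare the C symbol instead.
unsafe extern "C" fn sum_numbers(numbers: *const c_int, count: c_uint) -> c_int {
    let values = unsafe { std::slice::from_raw_parts(numbers, count as usize) };
    values.iter().fold(0, |total: c_int, value| total.wrapping_add(*value))
}

/// Safe wrapper: the slice type encodes the requirements that the C API
/// only states in its documentation.
pub fn sum_numbers_safe(numbers: &[c_int]) -> c_int {
    let count = c_uint::try_from(numbers.len()).expect("slice too long for the C API");
    // SAFETY: a slice pointer is never NULL and is valid for `count` elements,
    // and the callee only reads the data for the duration of the call.
    unsafe { sum_numbers(numbers.as_ptr(), count) }
}

fn main() {
    // No unsafe block is needed at the call site.
    println!("{}", sum_numbers_safe(&[1, 2, 3, 4]));
}
```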
Now the external function has a safe interface. It can be readily used by idiomatic Rust code without unsafe blocks. Callers of the function don’t need to be aware of the actual safety requirements of its native C implementation. And it’s still as fast as the original!
Aside from primitive types like libc::c_int and pointers, Rust can use other C types as well.
Rust structs can be made compatible with C structs via a #[repr] annotation:
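A minimal sketch of a C-compatible struct follows. The `Point` type, its assumed C counterpart in the comment, and the `point_translate` function are all hypothetical; the function is written in Rust with the C ABI so the example runs on its own.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

// Assumed C counterpart, for illustration:
//     struct point { int x; int y; };
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: c_int,
    pub y: c_int,
}

// A function with the C ABI that receives the struct through a pointer,
// the way a C library function would.
pub unsafe extern "C" fn point_translate(point: *mut Point, dx: c_int, dy: c_int) {
    // SAFETY: the caller guarantees that `point` is valid and not aliased.
    let point = unsafe { &mut *point };
    point.x += dx;
    point.y += dy;
}

fn main() {
    let mut point = Point { x: 1, y: 2 };
    unsafe { point_translate(&mut point, 10, 20) };
    println!("{:?} occupies {} bytes", point, size_of::<Point>());
}
```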
Such structures can be passed by value or by pointer to C code, as they’ll have the same memory layout as their C counterparts used by a C compiler. (Obviously, the fields can only have types that C can understand.)
C unions can also be directly represented in Rust:
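As an illustration, here is a hypothetical union that overlays a 32-bit integer and a 32-bit float, together with a helper that reads the float’s bit pattern through the integer field.

```rust
// Assumed C counterpart, for illustration:
//     union int_or_float { uint32_t as_int; float as_float; };
#[repr(C)]
#[derive(Clone, Copy)]
pub union IntOrFloat {
    pub as_int: u32,
    pub as_float: f32,
}

fn float_bits(value: f32) -> u32 {
    let overlay = IntOrFloat { as_float: value };
    // SAFETY: both fields are 4 bytes of plain data, and every bit pattern
    // is a valid u32, so reading the other field is sound here.
    unsafe { overlay.as_int }
}

fn main() {
    println!("{:#010x}", float_bits(1.0));
}
```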
As in C, unions in Rust are untagged. That is, they don’t store the runtime type of the value inside them. The programmer is responsible for accessing union fields correctly. The compiler can’t check this automatically, so reading a union field requires an explicit unsafe block. Assigning to a field is safe, because a write doesn’t interpret the data already stored in the union.
Simple enumerations are also compatible with C:
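A sketch of a C-compatible enumeration is shown below. The `Status` type and the C declaration in the comment are hypothetical. Converting to an integer is a plain cast, but an integer coming back from C should be validated, because not every value names a variant.

```rust
use std::os::raw::c_int;

// Assumed C counterpart, for illustration:
//     enum status { STATUS_OK = 0, STATUS_NOT_FOUND = 1, STATUS_DENIED = 2 };
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Ok = 0,
    NotFound = 1,
    Denied = 2,
}

impl Status {
    /// Converts a raw value received from C, rejecting unknown codes.
    pub fn from_raw(raw: c_int) -> Option<Status> {
        match raw {
            0 => Some(Status::Ok),
            1 => Some(Status::NotFound),
            2 => Some(Status::Denied),
            _ => None,
        }
    }
}

fn main() {
    println!("{}", Status::Denied as c_int);
    println!("{:?}", Status::from_raw(1));
    println!("{:?}", Status::from_raw(7));
}
```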
However, you can’t use advanced features of Rust enum types when calling C code. For instance, you can’t directly pass arbitrary Option<T> or Result<T, E> values to C.
Rust functions can be converted into C function pointers given that the argument types are actually compatible and the C ABI is used:
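To illustrate, the sketch below hands a Rust comparison function to `qsort` from the C standard library. The `qsort` prototype is declared by hand, `compare_ints` and `sort_with_qsort` are hypothetical names, and the `unsafe extern` spelling assumes Rust 1.82 or newer.

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_void};

unsafe extern "C" {
    // void qsort(void *base, size_t count, size_t size,
    //            int (*compare)(const void *, const void *));
    fn qsort(
        base: *mut c_void,
        count: usize,
        size: usize,
        compare: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// A Rust function with the C ABI; its name can be used as a C function pointer.
extern "C" fn compare_ints(left: *const c_void, right: *const c_void) -> c_int {
    // SAFETY: qsort passes pointers to two elements of the array being sorted.
    let (left, right) = unsafe { (*(left as *const c_int), *(right as *const c_int)) };
    // Ordering converts to -1, 0, or 1, which is what qsort expects.
    left.cmp(&right) as c_int
}

fn sort_with_qsort(values: &mut [c_int]) {
    // SAFETY: the pointer, element count, and element size all describe `values`.
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            size_of::<c_int>(),
            compare_ints,
        );
    }
}

fn main() {
    let mut values = [3, -1, 2];
    sort_with_qsort(&mut values);
    println!("{:?}", values);
}
```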
Native Rust functions and types can be made available to C code just as easily as you can call C from Rust. Let’s reverse the example with the sum_numbers() function and implement it in Rust instead:
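A possible Rust implementation is sketched below; the signature is an assumption based on the earlier description of the C function, and the NULL check is an extra precaution rather than part of the original API. The attribute is written as `#[unsafe(no_mangle)]`, the spelling accepted since Rust 1.82 and required by the 2024 edition; older code uses plain `#[no_mangle]`.

```rust
use std::os::raw::{c_int, c_uint};

/// Adds up `count` integers starting at `numbers`.
///
/// Callers must pass a pointer that is valid for `count` elements.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn sum_numbers(numbers: *const c_int, count: c_uint) -> c_int {
    // Extra precaution (not part of the assumed C contract): treat NULL as empty.
    if numbers.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees `count` readable elements at `numbers`.
    let values = unsafe { std::slice::from_raw_parts(numbers, count as usize) };
    // Wrapping addition avoids a panic crossing the C boundary on overflow.
    values.iter().fold(0, |total: c_int, value| total.wrapping_add(*value))
}

fn main() {
    let data = [1, 2, 3];
    let total = unsafe { sum_numbers(data.as_ptr(), data.len() as c_uint) };
    println!("{}", total);
}
```

To be linked from C, such code would typically be built as a static or dynamic library rather than as an executable.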
And that’s it. The #[no_mangle] attribute (spelled #[unsafe(no_mangle)] in the 2024 edition) prevents symbol mangling, so that the function is exported with the exact name “sum_numbers”. The extern directive specifies that the function should have the C ABI instead of the native Rust ABI. With this, any C program can link to a library written in Rust and can easily use our function:
Calling a Rust library in C is as easy as calling a native C library. There are no required conversions, no Rust VM context needs to be initialized and passed as an additional argument, and there’s no overhead aside from the regular function call.
As you can see, Rust is designed for safety, concurrency, and speed. It achieves this without garbage collection or a heavyweight runtime, rules out data races at compile time, and binds efficiently to code written in other languages.
In our next article dedicated to the Rust language, we’ll compare Rust with another popular programming language: C++.
This Rust programming tutorial is based on the experience of our Apriorit software development team. We would be glad to assist you with software programming in Rust. Get in touch with us!