Though it’s quite difficult to create a programming language better than C, C++, or Java, Mozilla has managed to develop a language that can ensure better security and privacy on the internet. Rust, which only appeared in 2010, has already become one of the languages most loved by programmers. Thanks to its innovative features, the language allows even novice developers to mitigate security vulnerabilities and benefit from faster software performance.

This Rust programming language tutorial based on our experience at Apriorit will provide you with a deep look into Rust features and their practical application. This four-article series will be useful for programmers who wish to know more about the options that the Rust language provides.

 

Written by:

Alexey Lozovsky,

Software Designer in System Programming Team

 

Contents:

Introduction

Summary of Features

Rust Language Features

    Zero-Cost Abstractions

    Move Semantics

Conclusion

Introduction

Rust is focused on safety, speed, and concurrency. Its design allows you to develop software with great performance by controlling a low-level language using the powerful abstractions of a high-level language. This makes Rust both a safer alternative to languages like C and C++ and a faster alternative to languages like Python and Ruby.

The majority of safety checks and memory management decisions are performed by the Rust compiler, so they don’t slow down the program at runtime. This makes Rust a great choice for use cases where other memory-safe languages like Java aren’t a good fit:

  • Programs with predictable resource requirements
  • Embedded software
  • Low-level code like device drivers

Rust can be used for web applications as well as for backend operations thanks to the many libraries available through crates.io, the package registry used by Cargo.

Summary of Features

Before describing the features of Rust, we’d like to mention some issues that the language successfully manages.

Issue: Preferring code duplication to abstraction due to the high cost of virtual method calls
Rust’s solution:
  • Zero-cost abstraction mechanisms

Issue: Use-after-free, double-free bugs, dangling pointers
Rust’s solution:
  • Smart pointers and references avoid these issues by design
  • Compile-time restrictions on raw pointer usage

Issue: Null dereference errors
Rust’s solution:
  • Optional types as a safe alternative to nullable pointers

Issue: Buffer overflow errors
Rust’s solution:
  • Range checks performed at runtime
  • Checks are avoided where the compiler can prove they’re unnecessary

Issue: Data races
Rust’s solution:
  • Built-in static analysis detects and prevents possible data races at compilation time

Issue: Uninitialized variables
Rust’s solution:
  • Compiler requires all variables to be initialized before first use
  • Types can define explicit default values through the Default trait

Issue: Legacy design of utility types heavily used by the standard library
Rust’s solution:
  • Built-in, composable, structured types: tuples, structures, enumerations
  • Pattern matching allows convenient use of structured types
  • The standard library fully embraces available pattern matching to provide easy-to-use interfaces

Issue: Embedded and bare-metal programming place tight restrictions on the runtime environment
Rust’s solution:
  • Minimal runtime size (which can be reduced even further)
  • Absence of built-in garbage collector, thread scheduler, or virtual machine

Issue: Using existing libraries written in C and other languages
Rust’s solution:
  • Only header declarations are needed to call C functions from Rust, or vice versa
  • No overhead in calling C functions from Rust or calling Rust functions from C
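Two of the solutions summarized above, optional types and runtime range checks, can be shown in a few lines. This is only a minimal sketch: the function find_first_even is a hypothetical helper invented for illustration.

```rust
// Hypothetical helper: returns the first even number in a slice, if any.
// Option<i32> makes the "no result" case explicit instead of relying on null.
fn find_first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

fn main() {
    let values = [3, 1, 4, 1, 5];

    // Pattern matching makes us handle both the Some and the None case.
    match find_first_even(&values) {
        Some(v) => println!("first even: {}", v),
        None => println!("no even numbers"),
    }

    // Checked access returns None instead of reading past the end.
    assert_eq!(values.get(10), None);
    assert_eq!(values.get(2), Some(&4));
}
```

Plain indexing such as values[10] is also range-checked and would panic rather than read out-of-bounds memory.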

Now let’s look more closely at the features provided by the Rust programming language and see how they’re useful for developing system software.

Rust Language Features

In the first article of this Rust programming language tutorial, we’ll describe two key features: zero-cost abstractions and move semantics.

Zero-Cost Abstractions

Zero-cost (or zero-overhead) abstractions are one of the most important features explored by C++. Bjarne Stroustrup, the creator of C++, describes them as follows:

“What you don’t use, you don’t pay for.” And further: “What you do use, you couldn’t hand code any better.”

Abstraction is a great tool used by Rust developers to deal with complex code. Generally, abstraction comes with runtime costs because abstracted code is less efficient than specific code. However, with clever language design and compiler optimizations, some abstractions can be made to have effectively zero runtime cost. The usual sources of these optimizations are static polymorphism (templates in C++, generics in Rust) and aggressive inlining, both of which Rust embraces fully.
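To make the difference between static and dynamic polymorphism concrete, here is a minimal sketch. The Shape trait and the Square and Rect types are hypothetical and exist only for this illustration.

```rust
// Hypothetical trait and types, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Static polymorphism: the compiler generates a separate copy of this
// function for each concrete type S, so calls to area() can be inlined.
fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Dynamic polymorphism: a single copy of the function, with each area()
// call dispatched through a vtable at runtime.
fn total_area_dynamic(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let squares = [Square(1.0), Square(2.0)];
    println!("{}", total_area_static(&squares));

    let mixed: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("{}", total_area_dynamic(&mixed));
}
```

The generic version can't mix shape types in one slice, but it gives the optimizer full knowledge of the concrete type. The dyn version trades that for flexibility.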

Iterators are an example of a commonly used (and thus heavily optimized) abstraction: they decouple algorithms that operate on sequences of values from the concrete containers holding those values. Rust iterators provide many built-in combinators for manipulating data sequences, enabling concise expression of a programmer’s intent. Consider the following code:

// Here we have two sequences of data. These could be stored in vectors
// or linked lists or whatever. Here we have _slices_ (references to arrays):
let data1 = &[3, 1, 4, 1, 5, 9, 2, 6];
let data2 = &[2, 7, 1, 8, 2, 8, 1, 8];
 
// Let’s compute some valuable results from them!
let numbers =
    // By iterating over the first array:
    data1.iter()            // {3,      1,      4,      ...}
    // Then zipping this iterator with an iterator over another array,
    // resulting in an iterator over pairs of numbers:
    .zip(data2.iter())      // {(3, 2), (1, 7), (4, 1), ...}
    // After that we map each pair into the product of its elements
    // via a lambda function and get an iterator over products:
    .map(|(a, b)| a * b)    // {6,      7,      4,      ...}
    // Given that, we filter some of the results with a predicate:
    .filter(|n| *n > 5)     // {6,      7,              ...}
    // And take no more than 4 of the entire sequence which is produced
    // by the iterator constructed to this point:
    .take(4)
    // Finally, we collect the results into a vector. This is
    // the point where the iteration is actually performed:
    .collect::<Vec<_>>();
 
// And here is what we can see if we print out the resulting vector:
println!("{:?}", numbers);  // ===> [6, 7, 8, 10]

Combinators use high-level concepts such as closures (lambda functions) that would have significant costs if compiled naively. However, due to optimizations powered by LLVM, this code compiles as efficiently as the explicit hand-coded version shown here:

use std::cmp::min;
 
let mut numbers = Vec::new();
 
for i in 0..min(data1.len(), data2.len()) {
    let n = data1[i] * data2[i];
 
    if n > 5 {
        numbers.push(n);
    }
 
    if numbers.len() == 4 {
        break;
    }
}

While this version is more explicit in what it does, the code using combinators is easier to understand and maintain. Switching the type of container where values are collected requires changes in only one line with combinators versus three in the expanded version. Adding new conditions and transformations is also less error-prone.
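The point about switching containers can be sketched as follows. The pipeline function is a hypothetical wrapper around the combinator chain from the example, introduced so that only the final collect call decides which container is built.

```rust
use std::collections::{BTreeSet, VecDeque};

// Hypothetical wrapper around the combinator chain from the article.
// It returns a lazy iterator; nothing is computed until it is consumed.
fn pipeline<'a>(data1: &'a [i32], data2: &'a [i32]) -> impl Iterator<Item = i32> + 'a {
    data1.iter()
        .zip(data2.iter())
        .map(|(a, b)| a * b)
        .filter(|n| *n > 5)
        .take(4)
}

fn main() {
    let data1 = [3, 1, 4, 1, 5, 9, 2, 6];
    let data2 = [2, 7, 1, 8, 2, 8, 1, 8];

    // Only the type named in collect() changes between these lines.
    let as_vec = pipeline(&data1, &data2).collect::<Vec<_>>();
    let as_deque = pipeline(&data1, &data2).collect::<VecDeque<_>>();
    let as_set = pipeline(&data1, &data2).collect::<BTreeSet<_>>();

    println!("{:?}", as_vec);
    println!("{:?}", as_deque);
    println!("{:?}", as_set);
}
```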

Iterators illustrate the “couldn’t hand code any better” part of the principle in Rust. Smart pointers illustrate the “what you don’t use, you don’t pay for” part.

The C++ standard library has a shared_ptr template class that’s used to express shared ownership of an object. Internally, it uses reference counting to keep track of an object’s lifetime. An object is destroyed when its last shared_ptr is destroyed and the count drops to zero.

Note that objects may be shared between threads, so we need to avoid data races in reference count updates. One thread must not destroy an object while it’s still in use by another thread. And two threads must not concurrently destroy the same object. Thread safety can be ensured by using atomic operations to update the reference counter.

However, some objects (e.g. tree nodes) may need shared ownership but may not need to be shared between threads. Atomic operations are unnecessary overhead in this case. It may be possible to implement some non_atomic_shared_ptr class, but accidentally sharing it between threads (for example, as part of some larger data structure) can lead to hard-to-track bugs. Therefore, the designers of the C++ standard library chose not to provide a single-threaded option.

On the other hand, Rust is able to distinguish these use cases safely and provides two reference-counted wrappers: Rc for single-threaded use and Arc with an atomic counter. The cherry on top is the ability of the Rust compiler to ensure at compilation time that Rcs are never shared between threads (more on this later). Therefore, it’s not possible to accidentally share data that isn’t meant to be shared and we can be freed from the unnecessary overhead of atomic operations.
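The difference between the two wrappers can be sketched like this. The sum_on_threads function is a hypothetical helper written for this illustration, not part of any library.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Hypothetical helper: sums a shared vector on several threads.
fn sum_on_threads(data: Arc<Vec<i32>>, threads: usize) -> Vec<i32> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Cloning an Arc atomically increments the reference count.
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // Rc: non-atomic counter, restricted to a single thread.
    let local = Rc::new(vec![1, 2, 3]);
    let local2 = Rc::clone(&local);
    println!("Rc count: {}", Rc::strong_count(&local));
    drop(local2);

    // Uncommenting the next line is a compile-time error, because
    // Rc<Vec<i32>> does not implement the Send trait:
    // thread::spawn(move || println!("{:?}", local));

    // Arc: atomic counter, safe to hand to other threads.
    let shared = Arc::new(vec![1, 2, 3]);
    println!("{:?}", sum_on_threads(shared, 2));
}
```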

Move Semantics

C++11 brought move semantics into the language. This is a source of countless optimizations and safety improvements in libraries and programs: it avoids unnecessary copying of temporary values, enables safe storage of non-copyable objects like std::unique_ptr in containers, and more.

Rust recognizes the success of move semantics and embraces them by default. That is, all values are in fact moved when they’re assigned to a different variable:

let foo = Foo::new();
let bar = foo;          // the Foo is now in bar

The punchline here is that after the move, you generally can’t use the previous location of the value (foo in our case) because no value remains there. But C++ doesn’t make this an error. Instead, it declares foo to have an unspecified value (defined by the move constructor). In some cases, you can still safely use the variable (like with primitive types). In other cases, you shouldn’t (like dereferencing a moved-from std::unique_ptr, which is left null).

Some compilers may issue a diagnostic warning if you do something wrong. But the standard doesn’t require C++ compilers to do so, as use-after-move may be perfectly safe. Or it may not be, and might instead lead to undefined behavior. It’s the programmer’s responsibility to know when use-after-move breaks and to avoid writing programs that break.

On the other hand, Rust has a more advanced type system and it’s a compilation error to use a value after it has been moved, no matter how complex the control flow or data structure:

error[E0382]: use of moved value: `foo`
  --> src/main.rs:13:1
   |
11 | let bar = foo;
   |     --- value moved here
12 |
13 | foo.some_method();
   | ^^^ value used here after move
   |

Thus, use-after-move errors aren’t possible in Rust.
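The claim about control flow can be illustrated with a small sketch in which a value is moved on only one branch. The consume function is hypothetical, and the offending line is left commented out so that the example compiles.

```rust
// Hypothetical function that takes ownership of its argument.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let foo = vec![1, 2, 3];
    // A runtime condition that the compiler can't predict.
    let has_args = std::env::args().count() > 1;

    let consumed = if has_args {
        // foo is moved into consume() only on this branch.
        consume(foo)
    } else {
        0
    };
    println!("consumed {} elements", consumed);

    // foo *may* have been moved above, so any use here is rejected
    // at compile time with error E0382:
    // println!("{:?}", foo);
}
```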

In fact, the Rust type system allows programmers to safely encode more use cases than they can with C++. Consider converting between various value representations. Let’s say you have a string in UTF-8 and you want to convert it to a corresponding vector of bytes for further processing. You don’t need the original string afterwards. In C++, the only safe option is to copy the whole string using the vector range constructor:

std::string string = "Hello, world!";
std::vector<uint8_t> bytes(string.begin(), string.end());

However, Rust allows you to move the internal buffer of the string into a new vector, making the conversion efficient and disallowing use of the original string afterwards:

let string = String::from("Hello, world!");
let bytes = string.into_bytes();        // string may not be used now
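One way to observe that no copy takes place is to compare the address of the buffer before and after the conversion. This is a minimal sketch: the reuses_buffer helper is hypothetical, and the pointer comparison relies on the current behavior of the standard library, where String is a wrapper around a byte vector.

```rust
// Hypothetical helper: converts a String into bytes and reports whether
// the resulting vector points at the same heap buffer as the string did.
fn reuses_buffer(string: String) -> bool {
    // Remember where the string's heap buffer lives.
    let buffer_before = string.as_ptr();

    // into_bytes() takes the string by value, so it is moved here.
    let bytes: Vec<u8> = string.into_bytes();

    // Same address means the buffer was handed over, not copied.
    buffer_before == bytes.as_ptr()
}

fn main() {
    let string = String::from("Hello, world!");
    println!("buffer reused: {}", reuses_buffer(string));
    // string can no longer be used here: it was moved into reuses_buffer().
}
```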

Now, you may think that it’s dumb to move all values by default. For example, when doing arithmetic we expect that we can reuse the results of intermediate calculations and that an individual constant may be used more than once in the program. Rust makes it possible to copy a value implicitly when it’s assigned to a new variable, based on its type. Numbers are an example of such a copyable type, and a user-defined type can also be marked as copyable with the #[derive(Clone, Copy)] attribute, provided that all of its fields are copyable too.
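Here is a minimal sketch of the difference between copyable and move-only types. Point and Label are hypothetical types made up for this example.

```rust
// Hypothetical copyable type: all of its fields are Copy, so it can
// derive Copy (which also requires Clone).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical move-only type: String is not Copy, so Label can't be either.
#[derive(Debug)]
struct Label {
    text: String,
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a; // a is copied, so both a and b stay usable
    println!("{:?} {:?}", a, b);

    let first = Label { text: String::from("origin") };
    let second = first; // first is moved into second
    // println!("{:?}", first); // compile-time error: value used after move
    println!("{}", second.text);
}
```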

Conclusion

Considering the increasing popularity of the Rust programming language, we wanted to share our experience in software development using Rust. In the second article of this series, we’ll describe the language’s memory safety features including ownership, borrowing, mutability, null pointer alternatives, and the absence of uninitialized variables.
