Rust Programming Language Tutorial (Basics), Part 4

Rust is becoming more widespread among developers who want to create fast and safe software. Apriorit works with the Rust programming language, and our experience is the basis for this tutorial. This article is the last part of our Rust programming language tutorial, which is useful for anyone who wants to get familiar with the basics of Rust.

In the first part of our tutorial, we describe such features as zero-cost abstractions and move semantics. The second part is dedicated to Rust features that guarantee memory safety, and the third part covers such features as threads without data races and trait-based generics.

In this last part, we'll tell you about pattern matching, automatic type deduction through type inference, and Rust's minimal runtime. We'll also explain how you can easily call C from Rust.

Pattern Matching

Similar to C++, Rust has enumeration types:

Rust
enum Month {
    January, February, March, April, May, June, July,
    August, September, October, November, December,
}

It also has a multiple-choice construction to operate on them:

Rust
match month {
    Month::December | Month::January | Month::February
        => println!("It's winter!"),
    Month::March | Month::April | Month::May
        => println!("It's spring!"),
    Month::June | Month::July | Month::August
        => println!("It's summer!"),
    Month::September | Month::October | Month::November
        => println!("It's autumn!"),
}
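To try this out, the fragment above can be wrapped into a complete program. The following is a minimal sketch: the `season` helper and the derived traits are our own additions for illustration, not part of the original snippet.

```rust
#[derive(Debug, Clone, Copy)]
enum Month {
    January, February, March, April, May, June, July,
    August, September, October, November, December,
}

// Hypothetical helper: returns the season name instead of printing it,
// so the result can be reused or tested.
fn season(month: Month) -> &'static str {
    match month {
        Month::December | Month::January | Month::February => "winter",
        Month::March | Month::April | Month::May => "spring",
        Month::June | Month::July | Month::August => "summer",
        Month::September | Month::October | Month::November => "autumn",
    }
}

fn main() {
    let month = Month::October;
    println!("{:?} is in {}", month, season(month)); // prints "October is in autumn"
}
```

Removing any one of the four arms makes the program fail to compile, because some months would no longer be covered.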

However, match has more features than a simple switch. The most crucial difference is that matching must be exhaustive: the match expression must handle all possible values of the expression being matched. This eliminates a typical error in which switch statements break when an enumeration is later extended with new values. Patterns can also be ranges and can carry additional if conditions. Of course, there's also a default catch-all option that matches any value:

Rust
match number {
    0..=9 => println!("small number"),
    10..=100 if number % 2 == 0 => {
        println!("big even number");
    }
    _ => println!("some other number"),
}
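Here is a runnable sketch of the same idea. The `classify` function name and the `u32` argument type are assumptions made for the example.

```rust
// Hypothetical helper that classifies a number using range patterns,
// a match guard, and the catch-all `_` pattern.
fn classify(number: u32) -> &'static str {
    match number {
        0..=9 => "small number",
        // The guard is checked only after the range pattern has matched.
        10..=100 if number % 2 == 0 => "big even number",
        // Everything else ends up here, including odd numbers in 10..=100.
        _ => "some other number",
    }
}

fn main() {
    for n in [7, 42, 43, 1000] {
        println!("{}: {}", n, classify(n));
    }
}
```

Note that 43 falls through to the catch-all arm: it matches the range, but the guard rejects it.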

Another important feature of Rust enumerations is that they can carry values, implementing discriminated unions safely.

Rust
enum Color {
    Red, Green, Blue,
    RGB(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

Pattern matching can be used to match against possible options and extract values stored in a union:

Rust
match some_color {
    Color::Red => println!("Pure red"),
    Color::Green => println!("Pure green"),
    Color::Blue => println!("Pure blue"),
    Color::RGB(red, green, blue) => {
        println!("Red: {}, green: {}, blue: {}", red, green, blue);
    }
    Color::CMYK(cyan, magenta, yellow, black) => {
        println!("Cyan: {}, magenta: {}, yellow: {}, black: {}",
            cyan, magenta, yellow, black);
    }
}

Unlike C and C++ unions, Rust makes it impossible to choose an incorrect branch when unpacking a union.
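As a self-contained sketch, the enum and the match can be combined as follows. The `describe` function is a hypothetical helper that returns the text rather than printing it.

```rust
#[allow(dead_code)] // not every variant is constructed in this small demo
enum Color {
    Red, Green, Blue,
    RGB(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

// Hypothetical helper: the values carried by a variant can only be
// reached through the pattern that names that variant.
fn describe(color: &Color) -> String {
    match color {
        Color::Red => String::from("Pure red"),
        Color::Green => String::from("Pure green"),
        Color::Blue => String::from("Pure blue"),
        Color::RGB(red, green, blue) => {
            format!("Red: {}, green: {}, blue: {}", red, green, blue)
        }
        Color::CMYK(cyan, magenta, yellow, black) => {
            format!("Cyan: {}, magenta: {}, yellow: {}, black: {}",
                cyan, magenta, yellow, black)
        }
    }
}

fn main() {
    println!("{}", describe(&Color::RGB(255, 128, 0)));
    println!("{}", describe(&Color::Red));
}
```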

Type Inference

Rust uses a static type system, which means that types of variables, function arguments, structure fields, and so on must be known at compile time; the compiler will check that correct types are used everywhere. However, Rust also uses type inference, which allows the compiler to automatically deduce types based on how variables are used.

This is very convenient because you no longer need to explicitly state types, which in some cases may be cumbersome (or impossible) to write. The auto keyword in C++ serves the same purpose:

C++
std::vector<std::map<std::string, std::vector<Object>>> some_map;

// Element and iterator types can easily become a mess:
for (const auto &it : some_map)
{
    /* ... */
}

// Lambda functions are normally stored with auto;
// their exact type cannot be named in C++
auto compare_by_cost = [](const Foo &lhs, const Foo &rhs) { return lhs.cost < rhs.cost; };

However, Rust also considers future uses of a variable to deduce its type, not only the initializer, allowing programmers to write code like this:

Rust
let v = 10;               // v's type is some integer (based on the constant),
                          // but the exact type (i32, u8, etc.) is not yet known

let mut vec = Vec::new(); // vec's type is some Vec<T>, where T may be anything

vec.push(v);              // after this line, the compiler knows that T == v's type

let s = v + vec.len();    // vec.len() returns "usize", so this must be the type
                          // of v (as another addend) and s (as a sum), and vec
                          // is now also known to have type Vec<usize>

println!("{}: {:?}", s, vec); // prints 11: [10]

Rust's type inference is based on the widely known and thoroughly researched Hindley-Milner algorithm, which is most commonly used in functional programming languages. The algorithm can handle global type inference, that is, inferring all types in an entire program, including function arguments, return values, and structure fields. But global type inference can be slow in large projects and can cause types to change because of unrelated changes elsewhere in the code base. Thus, Rust uses inference only inside function bodies, for local variables and closures. You must explicitly write types for function arguments, return values, and structure fields. This strikes a good balance between expressiveness, speed, and robustness. Explicit types also make good documentation for functions, methods, and structures.
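The boundary between explicit signatures and inferred locals can be illustrated with a short sketch. The function names `total_cost` and `inferred_total` are made up for this example.

```rust
// Signatures are always explicit: this is the "documentation" part.
fn total_cost(costs: &[u32]) -> u32 {
    // No annotation needed: the return type tells `sum` what to produce.
    costs.iter().sum()
}

// Inside the body, nothing is annotated.
fn inferred_total() -> u32 {
    let mut costs = Vec::new();     // Vec<_>: element type not yet known
    costs.push(5);                  // still just "some integer"
    costs.push(10);
    let total = total_cost(&costs); // this call fixes the element type to u32
    let double = |x| x * 2;         // closure types are inferred from use
    double(total)
}

fn main() {
    println!("{}", inferred_total()); // prints 30
}
```

Changing the parameter of `total_cost` to `&[u64]` would change the inferred element type of `costs`, but only within this function body, since the signature of `inferred_total` is fixed.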

Minimal Runtime

A runtime is the language support library that's embedded into every program and provides essential features of the language. The Java Virtual Machine can be thought of as the runtime of the Java language, for example, as it provides features like class loading and garbage collection. The size and complexity of the runtime contribute significantly to start-up and run-time overhead. For example, the JVM requires a non-negligible amount of time to load classes, warm up the JIT compiler, collect garbage, and so on.

Rust doesn't have a garbage collector, virtual machine, bytecode interpreter, or lightweight thread scheduler running in the background. The code you write is compiled directly to machine code, with nothing standing between it and the CPU. Some parts of the Rust standard library can be considered the "runtime," providing support for heap allocation, backtraces, stack unwinding, and stack overflow guards. The standard library also has a small amount of global initialization code, similar to the initialization code of a C runtime library that sets up the stack, calls global constructors, and so on before control is transferred to the main() function. (You can compile Rust programs without the standard library if you don't need it, thus avoiding this overhead.)

In short, Rust can be used for really low-level work like bare-metal programming, device drivers, and operating system kernels.

Furthermore, the absence of a complex runtime simplifies embedding Rust modules into programs written in other languages. For example, you can easily write JNI code for Java or extensions for dynamic languages like Python, Ruby, or Lua.

Efficient C Bindings

There's more than one programming language in the world, so it's not surprising that you might want to use libraries written in languages other than Rust. Conventionally, libraries provide a C API because C is a ubiquitous language, the common denominator of programming languages. Rust is able to easily communicate with C APIs, without any overhead, and use its ownership system to provide significantly stronger safety guarantees for them.

Calling C from Rust

Let's look at a simple example. Consider the following C library for adding numbers (here we take it easy and use a regular for loop, but we could do something clever with AVX instructions):

C
#include <stddef.h> /* size_t */

/**
 * Sum some numbers.
 *
 * @param numbers [in] pointer to the numbers to be summed
 *                      must not be NULL and must point to at least
 *                      `count` elements
 * @param count [in]    number of numbers to be summed
 *
 * @returns sum of the provided numbers.
 */
int sum_numbers(const int *numbers, size_t count)
{
    int sum = 0;
  
    for (size_t i = 0; i < count; i++)
    {
        sum += numbers[i];
    }
  
    return sum;
}

Note that some parts of this function's API are described formally by the argument types, but others are specified only in the documentation. For example, only the doc comment tells us that we can't pass NULL for the numbers argument and that there must be at least count numbers available. And only the common sense of a C programmer tells us that the function won't call free() on the numbers array.

Here's how we can call this function from Rust:

Rust
extern crate libc;

extern "C" {
    fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
        -> libc::c_int;
}

fn main() {
    let array = [1, 2, 3, 4, 5];
    let sum = unsafe { sum_numbers(array.as_ptr(), array.len()) };
    println!("Sum: {}", sum); // ===> prints "Sum: 15"
}

As you can see, there's no syntactical overhead in calling an external function written in C (other than spelling out the prototype of the function). It's just like calling a native Rust function. If you look at the generated assembly code, you can see that this function call has no runtime overhead either:

Assembly
leaq    32(%rsp), %rdi
movl    $5, %esi
callq   sum_numbers@PLT
movl    %eax, 12(%rsp)

There's no hidden boxing and unboxing, re-allocation of the array, obligatory safety checks, or anything else. We see exactly the same machine code that a C compiler would have generated for the same library function call.
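If you want to experiment without building a separate C library, the same mechanism works with functions from the C standard library, which is normally already linked into Rust programs on mainstream platforms. The sketch below uses `abs` as a stand-in for `sum_numbers`; that substitution is our assumption for illustration. The block is spelled `unsafe extern`, which Rust 1.82 and later accept in every edition and which the 2024 edition requires.

```rust
use std::os::raw::c_int;

// Declaration of `int abs(int)` from the C standard library.
// Assumption: the platform's C library is linked in, as it is by
// default on the usual desktop and server targets.
unsafe extern "C" {
    fn abs(value: c_int) -> c_int;
}

fn main() {
    // The call itself is still unsafe: the compiler cannot check
    // that our declaration matches the real C function.
    let distance = unsafe { abs(-42) };
    println!("abs(-42) = {}", distance); // prints "abs(-42) = 42"
}
```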

The Libc Crate and Unsafe Blocks

However, some details in the above code require further explanation. First of all, there's the libc crate. This is a library that provides Rust declarations for the types, constants, and functions of the platform's C library. Here you can find all the usual items:

  • libc::c_uint (unsigned int type)
  • libc::stat (struct stat structure)
  • libc::pthread_mutex_t (pthread_mutex_t typedef)
  • libc::open (open(2) system call)
  • libc::reboot (reboot(2) system call)
  • libc::EINVAL (EINVAL constant)
  • libc::SIGSEGV (SIGSEGV constant)
  • and many more, depending on the platform you compile on

Not only can you use "normal" C libraries via the Rust Foreign Function Interface, you can also readily use the system API via the libc crate.

Another catch lies in the unsafe block:

Rust
let sum = unsafe { sum_numbers(array.as_ptr(), array.len()) };

As the sum_numbers() function is external, it doesn't automatically provide the degree of safety provided by native Rust functions. For example, Rust will allow you to pass a NULL pointer as the first argument, and this will cause undefined behavior (just as it would in C). The compiler can't verify that the call is safe, so it must be wrapped in an unsafe block, which effectively says: "Compiler, you have my word that this function call is safe. I have verified that the arguments are okay, that the function won't compromise Rust safety guarantees, and that it won't cause undefined behavior."

Just as in C, the programmer is ultimately responsible for guaranteeing that the program doesn't cause undefined behavior. The difference here is that with C you must manually do this at all times, in all parts of the code, for every library you use. On the other hand, in Rust you must manually verify safety only inside unsafe blocks. All other Rust code (outside unsafe blocks) is automatically safe, as routinely verified by the Rust compiler.

Herein lies the power of Rust: you can provide safe wrappers for unsafe code and thus avoid tedious, manual safety verifications in the consumer code. For example, the sum_numbers function can be wrapped like this:

Rust
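// The wrapper needs its own name so that it doesn't clash
// with the extern declaration of sum_numbers()
fn safe_sum_numbers(numbers: &[libc::c_int]) -> libc::c_int {
    // This is safe because Rust slices are always non-NULL
    // and are guaranteed to be long enough
    unsafe { sum_numbers(numbers.as_ptr(), numbers.len()) }
}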

Now the external function has a safe interface. It can be readily used by idiomatic Rust code without unsafe blocks. Callers of the function don't need to be aware of the actual safety requirements of its native C implementation. And it's still as fast as the original!
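The same wrapping pattern can be tried on a function that is available everywhere. Below is a minimal sketch that wraps `strlen` from the C standard library; the wrapper name `c_string_length` is hypothetical, and we assume that C's `size_t` corresponds to Rust's `usize`, which holds on mainstream platforms.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

unsafe extern "C" {
    // size_t strlen(const char *s);
    // Requires a non-NULL pointer to a NUL-terminated string.
    fn strlen(s: *const c_char) -> usize;
}

// Safe wrapper: the &CStr type guarantees a valid, non-NULL,
// NUL-terminated string, which is exactly what strlen() requires.
fn c_string_length(text: &CStr) -> usize {
    unsafe { strlen(text.as_ptr()) }
}

fn main() {
    let name = CStr::from_bytes_with_nul(b"Ferris\0").expect("valid C string");
    println!("{}", c_string_length(name)); // prints 6
}
```

The safety requirements from the C documentation are now encoded in the argument type, so callers can't violate them without writing unsafe code themselves.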

Beyond Primitive Types

Aside from primitive types like libc::c_int and pointers, Rust can use other C types as well.

Rust structs can be made compatible with C structs via a #[repr] annotation:

Rust
#[repr(C)]
struct UUID
{
    time_low:  u32,
    time_mid:  u16,
    time_high: u16,
    sequence:  u16,
    node:      [u8; 6],
}

C
struct UUID
{
    uint32_t time_low;
    uint16_t time_mid;
    uint16_t time_high;
    uint16_t sequence;
    uint8_t node[6];
};

Such structures can be passed by value or by pointer to C code, as they'll have the same memory layout as their C counterparts used by a C compiler. (Obviously, the fields can only have types that C can understand.)
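One way to convince yourself of the layout is to check the size and alignment of the structure. This sketch uses `size_of` and `align_of` from the standard library; the expected numbers assume a typical platform where `u32` is aligned to 4 bytes.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)] // the fields are only inspected through their layout
#[repr(C)]
struct Uuid {
    time_low:  u32,     // offset 0, 4 bytes
    time_mid:  u16,     // offset 4, 2 bytes
    time_high: u16,     // offset 6, 2 bytes
    sequence:  u16,     // offset 8, 2 bytes
    node:      [u8; 6], // offset 10, 6 bytes
}

fn main() {
    // With #[repr(C)] the fields are laid out in declaration order,
    // using the same padding rules as a C compiler.
    println!("size = {}, align = {}", size_of::<Uuid>(), align_of::<Uuid>());
}
```

Under these assumptions the structure occupies 16 bytes with no padding. Without #[repr(C)], the Rust compiler is free to reorder fields, so the layout can't be relied on across the language boundary.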

C unions can also be directly represented in Rust:

Rust
#[repr(C)]
union TypePun
{
    f: f32,
    i: i32,
}

C
union TypePun
{
    float f;
    int   i;
};

As in C, unions in Rust are untagged. That is, they don't store the runtime type of the value inside them. The programmer is responsible for accessing union fields correctly. The compiler can't check this automatically, so reading a union field requires an explicit unsafe block. (Writing to a field of a Copy type is safe, because a write can't misinterpret the bytes already stored.)
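Here is a short sketch of reading a union field. The `float_bits` helper is hypothetical and reinterprets the bytes of a float as an integer.

```rust
#[repr(C)]
union TypePun {
    f: f32,
    i: i32,
}

// Hypothetical helper: stores a float and reads the same bytes back
// as an integer.
fn float_bits(value: f32) -> i32 {
    let pun = TypePun { f: value };
    // Reading a field is unsafe: only the programmer knows which
    // field was written last. Here both fields are 4 bytes wide and
    // every bit pattern is a valid i32, so the read is sound.
    unsafe { pun.i }
}

fn main() {
    println!("{:#x}", float_bits(1.0)); // prints 0x3f800000
}
```

In everyday Rust code you would call `f32::to_bits` for this particular conversion; the union is mainly useful when the type comes from a C header.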

Simple enumerations are also compatible with C:

Rust
#[repr(C)]
enum Options
{
    ONE = 0,
    TWO,
    THREE,
}

C
enum Options
{
    ONE,
    TWO,
    THREE,
};

However, you can't use advanced features of Rust enum types when calling C code. For instance, you can't directly pass general Option<T> or Result<T, E> values to C.
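A common way around this limitation is to flatten a Result into a C-style status code plus an out-parameter. The following is only a sketch of that convention: the function names and the 0 / -1 status values are our assumptions, not an established API.

```rust
// Hypothetical Rust-side function with an idiomatic Result return type.
fn checked_divide(a: i32, b: i32) -> Result<i32, &'static str> {
    a.checked_div(b).ok_or("division by zero or overflow")
}

// C-compatible wrapper (i32 corresponds to int32_t in C).
// Returns 0 on success and -1 on failure; the quotient is written
// through `out` only on success.
pub extern "C" fn divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return -1;
    }
    match checked_divide(a, b) {
        Ok(quotient) => {
            // Safety: `out` was checked for NULL above; the caller must
            // still guarantee that it points to writable memory.
            unsafe { *out = quotient };
            0
        }
        Err(_) => -1,
    }
}

fn main() {
    let mut quotient = 0;
    let status = divide(7, 2, &mut quotient);
    println!("status = {}, quotient = {}", status, quotient); // status = 0, quotient = 3
}
```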

Rust functions can be converted into C function pointers given that the argument types are actually compatible and the C ABI is used:

Rust
use std::{mem, ptr};

fn launch_native_thread() {
    let name = "Ferris";
    // We're going to launch a native thread via pthread_create() from libc.
    // This is an external function, so calling it is unsafe in Rust (think
    // about exception boundaries, for example).
    unsafe {
        // pthread_t is an integer on some platforms and a pointer on others,
        // so zero-initialize it instead of assigning a literal 0.
        let mut thread: libc::pthread_t = mem::zeroed();
        libc::pthread_create(&mut thread, // out-argument for pthread_t
            ptr::null(),                  // in-argument of pthread_attr_t
            thread_body,                  // thread body (as a C callback)
            &name as *const &str as *mut libc::c_void // thread argument (requires a cast)
        );
        // `name` lives on this stack frame, so wait for the thread to finish
        libc::pthread_join(thread, ptr::null_mut());
    }
}

// Here's our thread body with C ABI written in Rust
extern "C" fn thread_body(arg: *mut libc::c_void) -> *mut libc::c_void {
    // We need to cast the argument back to the original reference to &str.
    // This is unsafe (from the Rust compiler's point of view), but we know
    // what kind of data we have put into this void*
    let name: &&str = unsafe { &*(arg as *const &str) };
    println!("Hello {} from Rust thread!", name);
    ptr::null_mut()
}
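Callbacks can also be tried without the libc crate. The sketch below passes a Rust comparison function to `qsort` from the C standard library. The declaration is written by hand, assuming that `size_t` corresponds to `usize`, and the helper names `compare_i32` and `sort_with_qsort` are made up for the example.

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_void};

unsafe extern "C" {
    // void qsort(void *base, size_t nmemb, size_t size,
    //            int (*compar)(const void *, const void *));
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// Comparison callback with the C ABI, written in Rust.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // Safety: qsort() passes pointers to elements of the array being
    // sorted, and sort_with_qsort() only ever sorts arrays of i32.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    // Ordering converts to -1, 0, or 1, which is what qsort() expects.
    a.cmp(&b) as c_int
}

// Safe wrapper: the slice provides a valid pointer and element count.
fn sort_with_qsort(values: &mut [i32]) {
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            size_of::<i32>(),
            compare_i32,
        );
    }
}

fn main() {
    let mut values = [3, 1, 2];
    sort_with_qsort(&mut values);
    println!("{:?}", values); // prints [1, 2, 3]
}
```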

Calling Rust from C

Native Rust functions and types can be made available to C code just as easily as you can call C from Rust. Let's reverse the example with the sum_numbers() function and implement it in Rust instead:

Rust
use std::slice;

#[no_mangle]
pub extern "C" fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
    -> libc::c_int
{
    // Convert the C pointer-to-array into a native Rust slice.
    // This is not safe per se because the "numbers" pointer may be NULL
    // and the "count" value may not match the actual array length.
    //
    // As with C, we'll require the caller of this function to ensure
    // that these safety requirements are observed and will not check
    // them explicitly here.
    let rust_slice = unsafe { slice::from_raw_parts(numbers, count) };

    // Rust iterators already have a handy method for summing their
    // elements. Let's use it here.
    rust_slice.iter().sum()
}

And that's it. The #[no_mangle] attribute prevents symbol mangling (so that the function is exported with the exact name "sum_numbers"). The extern "C" specifier means that the function uses the C ABI instead of the native Rust ABI. With this, any C program can link to a library written in Rust and easily use our function:

C
#include <stdio.h>

// Declare the function prototype for C
int sum_numbers(const int *numbers, size_t count);

int main()
{
    int numbers[] = { 1, 2, 3, 4, 5 };
    int sum = sum_numbers(numbers, 5);
    printf("Sum is %d\n", sum);
}

Calling a Rust library in C is as easy as calling a native C library. There are no required conversions, no Rust VM context needs to be initialized and passed as an additional argument, and there's no overhead aside from the regular function call.
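If you'd rather not rely on every C caller observing the documented requirements, the exported function can check what it is able to check. The sketch below is a more defensive variant under our own assumptions: the name `sum_numbers_checked` is hypothetical, a NULL pointer or zero count yields 0, and overflow wraps around instead of panicking. The attribute is spelled `#[unsafe(no_mangle)]`, which Rust 1.82 and later accept in every edition and which the 2024 edition requires.

```rust
use std::os::raw::c_int;
use std::slice;

// Hypothetical defensive variant of the exported function.
#[unsafe(no_mangle)]
pub extern "C" fn sum_numbers_checked(numbers: *const c_int, count: usize) -> c_int {
    if numbers.is_null() || count == 0 {
        return 0;
    }
    // Safety: the pointer is non-NULL; the caller must still guarantee
    // that it points to at least `count` readable elements.
    let values = unsafe { slice::from_raw_parts(numbers, count) };
    // A panic should not unwind into C code, so overflow wraps instead.
    values.iter().fold(0, |acc: c_int, &n| acc.wrapping_add(n))
}

fn main() {
    // An exported function can still be called from Rust like any other.
    let numbers = [1, 2, 3, 4, 5];
    let sum = sum_numbers_checked(numbers.as_ptr(), numbers.len());
    println!("Sum is {}", sum); // prints "Sum is 15"
}
```

The length of the array can't be verified from the pointer alone, so that part of the contract remains the C caller's responsibility.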

Conclusion

As you can see, Rust is designed to provide stronger safety and concurrency guarantees than many other popular languages without sacrificing speed. This is achieved through the absence of garbage collection and heavy runtime overhead, compile-time prevention of data races, efficient bindings to other languages, and other useful features.

In our next article dedicated to the Rust language, we'll compare Rust with another popular programming language: C++.

This Rust programming tutorial is based on the experience of our Apriorit software development team. We would be glad to assist you with Rust development services. Get in touch with us!
