Whenever Rust is used for systems software, someone inevitably argues along the lines of "there is so much unsafe everywhere, what even is the point of using Rust?"
This, of course, implies that the alternative, C++, is a good and sane language that we should continue to use.
Take this example of some C++ code. There are no scary raw pointers or anything here. Since C++ is a good and sane language, we would expect this code to simply loop forever.
#include <stdio.h>
[[gnu::noinline]]
void
ub(void)
{
asm volatile ("" : : ); // so the call isn't optimized away
while (1) { }
}
void
unreachable(void)
{
puts("oh no!");
}
int
main(void)
{
ub();
return 0;
}
However, when we compile it with Clang++ 14.0.6 and run the output:
$ ./ub
oh no!
Segmentation fault
Naturally, we should've known that spinning in an infinite loop like that is Undefined Behavior. The same loop is well-defined in C, but C++ introduces this safety regression.
It's not ideal that systems libraries have lots of unsafe everywhere, but that is the reality of a lot of systems programming. I would rather have clearly signposted sketchy parts of the program than have the whole program be the sketchy part of the program.
It is certainly the case that Modern C++ is safer than older C++, but it still has some sketchy bits. Creating objects, destroying objects, indexing vectors, concurrency, popping from queues, iterators, loops, std::optional, and the regular addition operator can all trivially cause Undefined Behavior if used slightly incorrectly.
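For contrast, here is a minimal sketch of how three of those operations (indexing, addition, and popping from a queue) can be written in safe Rust so that the failure case shows up as a value rather than as Undefined Behavior. The function names are made up for illustration.

```rust
use std::collections::VecDeque;

/// Indexing: `get` returns `None` for an out-of-range index instead of
/// reading past the end of the slice. (Plain `values[2]` would panic.)
pub fn third_element(values: &[i32]) -> Option<i32> {
    values.get(2).copied()
}

/// Addition: `checked_add` reports signed overflow as `None`.
pub fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Popping: an empty queue yields `None` rather than touching invalid memory.
pub fn pop_next(queue: &mut VecDeque<i32>) -> Option<i32> {
    queue.pop_front()
}
```

Calling third_element(&[1, 2]) or add_checked(i32::MAX, 1) gives back None, and the caller has to decide what to do about it.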
Optimizations
C++ and Rust share the principle of Zero-Cost Abstractions. Depending on who you ask, this means at least one of the following.
- Not using an abstraction does not impose any runtime cost
- Using an abstraction is just as fast as implementing it by hand
I believe both languages do a pretty good job of meeting the first goal (we'll get to some failures later), but the second is where Rust's semantics provide a notable advantage.
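To make the second definition concrete, here is a small sketch of the same computation written once with iterator adapters and once by hand. Both function names are hypothetical. The expectation is that an optimizing build produces comparable machine code for the two, but I haven't checked the assembly for this exact snippet.

```rust
/// Sum of the squares of the even numbers, using iterator adapters.
pub fn sum_even_squares_iter(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

/// The same computation written by hand with an index loop.
pub fn sum_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    let mut i = 0;
    while i < values.len() {
        let v = values[i];
        if v % 2 == 0 {
            total += v * v;
        }
        i += 1;
    }
    total
}
```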
std::unique_ptr vs Box
RAII enables resource reclamation when a binding goes out of scope. The most obvious way to do this is with a heap allocation that is automatically freed at the end of its lifetime.
Let's look at an example of this in C++ and Rust, with a C implementation for good measure.
// C++ implementation
#include <memory>
#include <utility>
extern void library_function(std::unique_ptr<int>) noexcept;
void func() {
auto var = std::make_unique<int>(5);
library_function(std::move(var));
}
// Rust implementation
extern "Rust" {
fn library_function(x: Box<i32>);
}
#[no_mangle]
pub fn func() {
let var = Box::new(5);
// this is unsafe because the compiler can't verify extern functions
unsafe { library_function(var) };
}
// C implementation
#include<stdlib.h>
extern void library_function(int *);
void
func(void)
{
int *var = malloc(sizeof(int));
if (!var) {
abort();
}
*var = 5;
library_function(var);
}
These functions each create an owned heap allocation and call an external library function. Since this is as simple as it gets, we'd expect effectively identical assembly. I compiled each of these with the latest stable LLVM-based compiler on godbolt.org: Rustc 1.77.0 for Rust and Clang 18.1.0 for C and C++.
Let's start with the assembly for the C implementation to establish our expectations.
func:
push rax
mov edi, 4
call malloc
test rax, rax
je .LBB0_1
mov dword ptr [rax], 5
mov rdi, rax
pop rax
jmp library_function
.LBB0_1:
call abort
This looks exactly like what we would expect. We call malloc, test for failure, store our value, and call our library function.
Next, let's look at the Rust assembly.
func:
push rax
movzx eax, byte ptr [rip + __rust_no_alloc_shim_is_unstable]
mov edi, 4
mov esi, 4
call __rust_alloc
test rax, rax
je .LBB0_1
mov dword ptr [rax], 5
mov rdi, rax
pop rax
jmp library_function
.LBB0_1:
mov edi, 4
mov esi, 4
call alloc::handle_alloc_error
I've cleaned up some symbol names for clarity, but there still seems to be a little bit more going on here.
First, there's this load from __rust_no_alloc_shim_is_unstable. This exists to force the linker to error if you try to use the corresponding unstable feature. It's implemented as a volatile read, so the compiler will never optimize it out. There's no data dependency and it will be removed eventually, but for now there is a cost there.
We then call __rust_alloc with two arguments: one is the size and the other is the alignment. We then test for null, store our value, and call the library function identically to the C version.
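The size and alignment pair corresponds to std::alloc::Layout in the standard library. The following is a rough sketch, not what Box::new literally expands to, of doing the same steps by hand: build the layout for an i32, allocate, check for null, store the value, and free with the same layout. The function names are hypothetical.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Returns the (size, alignment) pair the allocator is asked for when
/// allocating a single `i32`.
pub fn i32_layout() -> (usize, usize) {
    let layout = Layout::new::<i32>();
    (layout.size(), layout.align())
}

/// Allocates an `i32` by hand, stores 5 in it, reads it back, and frees it.
pub fn alloc_five_by_hand() -> i32 {
    let layout = Layout::new::<i32>();
    // SAFETY: the layout has a non-zero size, the pointer is checked for
    // null before use, and it is freed with the layout it was allocated with.
    unsafe {
        let ptr = alloc(layout) as *mut i32;
        if ptr.is_null() {
            // Counterpart of the `alloc::handle_alloc_error` call in the assembly.
            handle_alloc_error(layout);
        }
        ptr.write(5);
        let value = ptr.read();
        dealloc(ptr as *mut u8, layout);
        value
    }
}
```

On x86-64, i32_layout() should give a size of 4 and an alignment of 4, which matches the two mov instructions before the call.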
Let's look at the C++ assembly:
func():
push rax
mov edi, 4
call operator new(unsigned long)
mov dword ptr [rax], 5
mov qword ptr [rsp], rax
mov rdi, rsp
call library_function(std::unique_ptr<int>)
mov rdi, qword ptr [rsp]
test rdi, rdi
je .LBB0_2
call operator delete(void*)
.LBB0_2:
pop rax
ret
Again, I've cleaned up the symbols a little bit.
There isn't an obvious null check, but that's done within new. Importantly, look at how we set up the call to the library function:
mov qword ptr [rsp], rax
mov rdi, rsp
call library_function(std::unique_ptr<int>)
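We're passing the pointer on the stack: it is stored at rsp, and the callee receives the address of that stack slot in rdi rather than the pointer itself.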
There are fundamental issues with C++ that prevent std::unique_ptr from being passed by register. There's the Itanium C++ ABI, used by GCC and Clang, which dictates that any type with a non-trivial destructor cannot be passed by register, but there are also more subtle issues with regard to destructor order that I do not fully understand.
Additionally, since lifetimes in C++ always extend to the end of the variable's scope, changing this is almost certain to cause use-after-frees in current codebases.
There was at least one proposal to consider breaking ABI in C++23 with std::unique_ptr stated as motivation, but it doesn't appear to be implemented. There is also a non-standard Clang extension that allows passing the unique_ptr by register, but it highlights the potential hazards and miscompilations it can cause. Naturally, according to the extension docs, the solution to the Undefined Behavior this change will cause in "safe" C++ is to run address sanitizer and hope it catches everything.
Rust has a far from ideal ABI, but improving it certainly won't cause UB in safe Rust. After all, all of the runtime failures the Clang extension highlights would've already been caught at compile time in Rust.
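As a rough illustration of why Rust has more room here, the sketch below moves a Box into a function and counts destructor runs. Tracked, consume, and drops_after_move are hypothetical names. Once the box is moved, the caller has nothing left to destroy, unlike the C++ caller above, which still checks the moved-from pointer and calls operator delete after the library call returns.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how many times it has been dropped.
pub struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Takes ownership of the box. The value is dropped when this function
/// returns, not in the caller.
pub fn consume(_owned: Box<Tracked<'_>>) {}

/// Returns the drop count before and after the box is moved into `consume`.
pub fn drops_after_move() -> (u32, u32) {
    let drops = Cell::new(0);
    let boxed = Box::new(Tracked { drops: &drops });
    let before = drops.get();
    consume(boxed);
    // `boxed` has been moved out of; touching it here is a compile error,
    // so there is no moved-from state for the caller to clean up.
    let after = drops.get();
    (before, after)
}
```

The destructor runs exactly once, inside the callee.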
Optional Unique Pointers
Continuing on the inadequacies of C++'s std::unique_ptr, let's consider the case where we have a function that returns an optional owned heap allocation. In C, we typically return a null pointer to represent the failure case.
We'll again compare C++ and Rust with C to assess the cost of the abstraction.
// C++ implementation
#include <memory>
#include <optional>
extern std::optional<std::unique_ptr<int>> library_function() noexcept;
int func() {
auto var = library_function();
// if we forget this check, that's instant Undefined Behavior if var is
// nullopt
if (var) {
return **var;
}
return 0;
}
// Rust implementation
extern "Rust" {
fn library_function() -> Option<Box<i32>>;
}
#[no_mangle]
pub fn func() -> i32 {
// this is unsafe because the compiler can't verify extern functions
let var = unsafe { library_function() };
if let Some(var) = var {
return *var
}
0
}
// C implementation
#include<stdlib.h>
// returns NULL to represent None
extern int* library_function(void);
int
func(void)
{
int *var = library_function();
if (var) {
int ret = *var;
free(var);
return ret;
}
return 0;
}
Let's start by looking at the disassembly for the C program:
func:
push rbx
call library_function
test rax, rax
je .LBB0_1
mov ebx, dword ptr [rax]
mov rdi, rax
call free
mov eax, ebx
pop rbx
ret
.LBB0_1:
xor ebx, ebx
mov eax, ebx
pop rbx
ret
It's pretty much a direct translation of what we wrote. It is worth noting that LLVM missed out on an optimization opportunity that I included in the source. If library_function returns null then eax is zero, so there's no reason to set it again. I also don't understand why LLVM chooses to zero ebx and then move it to eax. I think this is a bug since gcc has far better codegen here.
Next, the Rust assembly:
func:
push rbx
call library_function
test rax, rax
je .LBB0_1
mov ebx, dword ptr [rax]
mov esi, 4
mov edx, 4
mov rdi, rax
call __rust_dealloc
mov eax, ebx
pop rbx
ret
.LBB0_1:
xor ebx, ebx
mov eax, ebx
pop rbx
ret
We see basically the same assembly as the C version, complete with the same missed optimization (though rustc's gcc backend does perform it). The only other difference is that __rust_dealloc takes the size and alignment in addition to the pointer that free takes, but that's it.
Next, let's look at C++'s codegen:
func():
push rbx
sub rsp, 16
mov rdi, rsp
call library_function()
cmp byte ptr [rsp + 8], 0
je .LBB0_1
mov rdi, qword ptr [rsp]
mov ebx, dword ptr [rdi]
mov byte ptr [rsp + 8], 0
call operator delete(void*)
mov eax, ebx
add rsp, 16
pop rbx
ret
.LBB0_1:
xor ebx, ebx
mov eax, ebx
add rsp, 16
pop rbx
ret
Unlike the Rust codegen, this is clearly worse than the C version (i.e. this is not a zero-cost abstraction). We see that the library function returns its data on the stack. std::optional appears to use the byte at rsp + 8 as its discriminant instead of using null to represent nullopt. C++ has no language-level way of telling std::optional about an invalid value it could use for nullopt, and so using the abstraction incurs a runtime cost.
Rust, of course, can do that because of its niche optimizations.
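The effect of the niche is easy to observe from the type sizes. Here is a minimal sketch with hypothetical function names. Because a Box can never be null, Option<Box<i32>> is the same size as a plain pointer, whereas Option<usize> has no spare value to use and needs extra room for a discriminant, much like the std::optional above.

```rust
use std::mem::size_of;

/// Sizes of `Box<i32>`, `Option<Box<i32>>`, and `Option<usize>`, in that order.
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Box<i32>>(),
        size_of::<Option<Box<i32>>>(),
        size_of::<Option<usize>>(),
    )
}

/// Mirrors `func` above: unwrap the optional box, or fall back to 0.
pub fn unwrap_or_zero(var: Option<Box<i32>>) -> i32 {
    match var {
        Some(boxed) => *boxed,
        None => 0,
    }
}
```

The first two sizes are equal to each other and to the size of a pointer, while the third is strictly larger than a usize.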
I will continue to update this post with more annoyances when I come across them.