Gavin's Site

Whenever Rust is used for systems software, there is always an argument that goes along the lines of "there is so much unsafe everywhere, what even is the point of using Rust?"

This, of course, implies that the alternative, C++, is a good and sane language that we should continue to use.

Take this example of some C++ code. There are no scary raw pointers or anything here. Since C++ is a good and sane language, we would expect this code to simply loop forever.

#include <stdio.h>


[[gnu::noinline]]
void
ub(void)
{
	asm volatile ("" : : ); // so the call isn't optimized away
	while (1) { }
}

void
unreachable(void)
{
	puts("oh no!");
}

int 
main(void)
{
	ub();
	return 0;
}

However, when we compile it with Clang++ 14.0.6 and run the output:

$ ./ub
oh no!
Segmentation fault

Naturally, we should've known that spinning forever in a loop with no side effects is Undefined Behavior in C++. The same loop is well-defined in C, because its controlling expression is a constant, but C++ introduces this safety regression.

It's not ideal that systems libraries have lots of unsafe everywhere, but that is the reality of a lot of systems programming. I would rather have clearly signposted sketchy parts of the program than have the whole program be the sketchy part of the program.

It is certainly the case that Modern C++ is safer than older C++, but it still has some sketchy bits. Creating objects, destroying objects, indexing vectors, concurrency, popping from queues, iterators, loops, std::optional, and the regular addition operator all can trivially cause Undefined Behavior if used slightly incorrectly.
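
For contrast, here is a minimal sketch of how safe Rust surfaces a few of those same operations. The function name checked_operations is my own for illustration; the standard library calls are real. Each failure case comes back as a value you have to handle rather than as Undefined Behavior.

```rust
use std::collections::VecDeque;

/// Returns the results of (overflowing addition, out-of-range index,
/// pop from an empty queue). The function name is hypothetical.
pub fn checked_operations() -> (Option<i32>, Option<i32>, Option<i32>) {
    // Signed overflow is reported as None instead of being UB.
    let sum = i32::MAX.checked_add(1);

    // Out-of-range access through `get` yields None.
    let v = vec![1, 2, 3];
    let elem = v.get(10).copied();

    // Popping from an empty queue yields None.
    let mut queue: VecDeque<i32> = VecDeque::new();
    let popped = queue.pop_front();

    (sum, elem, popped)
}
```

Calling checked_operations() returns None for all three cases. The plain + operator and v[10] are also defined in safe Rust: they panic (or, for + in release builds, wrap) rather than invoke UB.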

Optimizations

C++ and Rust share the principle of Zero-Cost Abstractions. Depending on who you ask, this means at least one of the following.

  1. Not using an abstraction does not impose any runtime cost
  2. Using an abstraction is just as fast as implementing it by hand

I believe both languages do a pretty good job of meeting the first goal (we'll get to some failures later), but the second is where Rust's semantics provide a notable advantage.

std::unique_ptr vs Box

RAII enables resource reclamation when a binding goes out of scope. The most obvious way to do this is with a heap allocation that is automatically freed at the end of its lifetime.
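
As a quick illustration of the scoping rule, here is a small sketch in Rust. The Resource type and raii_demo function are made up for this example; the point is only that each destructor runs at the closing brace of the scope that owns the binding.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource that records its own cleanup in a shared log.
pub struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        // Runs automatically when the owning binding goes out of scope.
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

/// Returns the recorded events in the order they happened.
pub fn raii_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Resource { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Resource { name: "inner", log: Rc::clone(&log) };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}
```

The returned log interleaves the scope markers with the releases: the inner resource is released before the outer scope continues, and the outer resource is released last.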

Let's look at an example of this in C++ and Rust, with a C implementation for good measure.

// C++ implementation
#include <memory>
#include <utility>

extern void library_function(std::unique_ptr<int>) noexcept;

void func() {
	auto var = std::make_unique<int>(5);
	library_function(std::move(var));
}

// Rust implementation
extern "Rust" {
	fn library_function(x: Box<i32>);
}


#[no_mangle]
pub fn func() {
	let var = Box::new(5);
	// this is unsafe because the compiler can't verify extern functions
	unsafe { library_function(var) };
}

// C implementation
#include <stdlib.h>

extern void library_function(int *);

void 
func(void)
{
	int *var = malloc(sizeof(int));
	if (!var) {
		abort();
	}
	*var = 5;
	library_function(var);
}

These functions each create an owned heap allocation and call an external library function. Since this is as simple as it gets, we'd expect effectively identical assembly. I compiled each of these with the latest stable LLVM-based compiler on godbolt.org: Rustc 1.77.0 for Rust and Clang 18.1.0 for C and C++.

Let's start with the assembly for the C implementation to establish our expectations.

func:
	push    rax
	mov     edi, 4
	call    malloc
	test    rax, rax
	je      .LBB0_1
	mov     dword ptr [rax], 5
	mov     rdi, rax
	pop     rax
	jmp     library_function
.LBB0_1:
	call    abort

This looks exactly like what we would expect. We call malloc, test for failure, store our value, and tail-call our library function.

Next, let's look at the Rust assembly.

func:
		push    rax
		movzx   eax, byte ptr [rip + __rust_no_alloc_shim_is_unstable]
		mov     edi, 4
		mov     esi, 4
		call    __rust_alloc
		test    rax, rax
		je      .LBB0_1
		mov     dword ptr [rax], 5
		mov     rdi, rax
		pop     rax
		jmp     library_function
.LBB0_1:
		mov     edi, 4
		mov     esi, 4
		call    alloc::handle_alloc_error

I've cleaned up some symbol names for clarity, but there still seems to be a little more going on here.

First, there's this load from __rust_no_alloc_shim_is_unstable. This exists to force the linker to error if you try to use the corresponding unstable feature. It's implemented as a volatile read, so the compiler will never optimize it out. There's no data dependency and it will be removed eventually, but for now there is a cost there.

We then call __rust_alloc with two arguments: the size in edi and the alignment in esi, both of which are 4 for an i32. We then test for null, store our value, and call the library function identically to the C version.
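
To see where those two 4s come from, here is a small sketch using std::alloc::Layout, which is how Rust describes an allocation request. The helper name layout_of is my own; I'm assuming an x86-64 target like the one in the assembly above.

```rust
use std::alloc::Layout;

/// Returns the (size, alignment) pair describing a heap allocation for `T`.
/// The helper name is hypothetical; `Layout` is the standard library type.
pub fn layout_of<T>() -> (usize, usize) {
    let layout = Layout::new::<T>();
    (layout.size(), layout.align())
}
```

On x86-64, layout_of::<i32>() gives a size of 4 and an alignment of 4, matching the two constants loaded before the call to __rust_alloc.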

Let's look at the C++ assembly:

func():
		push    rax
		mov     edi, 4
		call    operator new(unsigned long)
		mov     dword ptr [rax], 5
		mov     qword ptr [rsp], rax
		mov     rdi, rsp
		call    library_function(std::unique_ptr<int>)
		mov     rdi, qword ptr [rsp]
		test    rdi, rdi
		je      .LBB0_2
		call    operator delete(void*)
.LBB0_2:
		pop     rax
		ret

Again, I've cleaned up the symbols a little bit.

There isn't an obvious null check, but that's handled within operator new, which throws on failure instead of returning null. Importantly, look at how we set up the call to the library function:

		mov     qword ptr [rsp], rax
		mov     rdi, rsp
		call    library_function(std::unique_ptr<int>)

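We're passing the unique_ptr indirectly: the pointer is stored on the stack and the callee receives the address of that stack slot in rdi. The caller is also left to run the destructor after the call returns.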

There are fundamental issues with C++ that prevent std::unique_ptr from being passed by register. The Itanium C++ ABI, which gcc and Clang use, dictates that a type with a non-trivial destructor cannot be passed in registers, but there are also more subtle issues regarding destructor order that I do not fully understand.

Additionally, since lifetimes in C++ always extend to the end of the variable's scope, changing this is almost certain to cause use-after-frees in current codebases.

There was at least one proposal to consider breaking ABI in C++23 with std::unique_ptr stated as motivation, but it doesn't appear to have been adopted. There is also a non-standard Clang extension that allows passing the unique_ptr by register, but its documentation highlights the potential hazards and miscompilations it can cause. Naturally, according to the extension docs, the solution to the Undefined Behavior this change will cause in "safe" C++ is to run address sanitizer and hope it catches everything.

Rust has a far from ideal ABI, but improving it certainly won't cause UB in safe Rust. After all, all of the runtime failures the Clang extension highlights would've already been caught at compile time in Rust.
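
Here is a minimal sketch of why handing a Box straight to the callee is unproblematic in Rust. The Tracked type and the consume function (a stand-in for library_function) are invented for this example. Ownership moves into the callee, the destructor runs there, and the caller is statically prevented from touching the moved-from binding.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical payload that records when its destructor runs.
pub struct Tracked {
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push("destructor");
    }
}

/// Stand-in for `library_function`: takes ownership of the box.
fn consume(value: Box<Tracked>) {
    value.log.borrow_mut().push("callee running");
} // `value` is dropped here, inside the callee

/// Returns the recorded events in the order they happened.
pub fn move_demo() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let var = Box::new(Tracked { log: Rc::clone(&log) });
    consume(var);
    // `var` has been moved; using it here would be a compile error (E0382).
    log.borrow_mut().push("caller resumes");
    let events = log.borrow().clone();
    events
}
```

The log shows the destructor running before the caller resumes, so there is nothing left for the caller to clean up and no moved-from object to keep alive until the end of the scope.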

Optional Unique Pointers

Continuing on the inadequacies of C++'s std::unique_ptr, let's consider the case where we have a function that returns an optional owned heap allocation. In C, we typically return a null pointer to represent the failure case.

We'll again compare C++ and Rust with C to assess the cost of the abstraction.

// C++ implementation
#include <memory>
#include <optional>

extern std::optional<std::unique_ptr<int>> library_function() noexcept;

int func() {
	auto var = library_function();
	// if we forget this check, that's instant Undefined Behavior if var is
	// nullopt
	if (var) {
		return **var;
	}
	return 0;
}

// Rust implementation
extern "Rust" {
	fn library_function() -> Option<Box<i32>>;
}


#[no_mangle]
pub fn func() -> i32 {
	// this is unsafe because the compiler can't verify extern functions
	let var = unsafe { library_function() };
	if let Some(var) = var {
		return *var
	}
	0
}

// C implementation
#include <stdlib.h>

// returns NULL to represent None
extern int* library_function(void);

int 
func(void)
{
	int *var = library_function();
	if (var) {
		int ret = *var;
		free(var);
		return ret;
	}
	return 0;
}

Let's start with the disassembly for the C program:

func:
	push    rbx
	call    library_function
	test    rax, rax
	je      .LBB0_1
	mov     ebx, dword ptr [rax]
	mov     rdi, rax
	call    free
	mov     eax, ebx
	pop     rbx
	ret
.LBB0_1:
	xor     ebx, ebx
	mov     eax, ebx
	pop     rbx
	ret

It's pretty much a direct translation of what we wrote. It is worth noting that LLVM missed an optimization opportunity that I left in the source: if library_function returns null then eax is already zero, so there's no reason to set it again. I also don't understand why LLVM chooses to zero ebx and then move it to eax. I think this is a bug, since gcc has far better codegen here.

Next, the Rust assembly:

func:
	push    rbx
	call    library_function
	test    rax, rax
	je      .LBB0_1
	mov     ebx, dword ptr [rax]
	mov     esi, 4
	mov     edx, 4
	mov     rdi, rax
	call    __rust_dealloc
	mov     eax, ebx
	pop     rbx
	ret
.LBB0_1:
	xor     ebx, ebx
	mov     eax, ebx
	pop     rbx
	ret

We see basically the same assembly as the C version, complete with the same missed optimization (rustc's gcc backend does perform it). The only difference is in the deallocation call: __rust_dealloc takes the size and alignment in addition to the pointer.

Next, let's look at C++'s codegen:

func():
	push    rbx
	sub     rsp, 16
	mov     rdi, rsp
	call    library_function()
	cmp     byte ptr [rsp + 8], 0
	je      .LBB0_1
	mov     rdi, qword ptr [rsp]
	mov     ebx, dword ptr [rdi]
	mov     byte ptr [rsp + 8], 0
	call    operator delete(void*)
	mov     eax, ebx
	add     rsp, 16
	pop     rbx
	ret
.LBB0_1:
	xor     ebx, ebx
	mov     eax, ebx
	add     rsp, 16
	pop     rbx
	ret

Unlike the Rust codegen, this is clearly worse than the C version (i.e. this is not a zero-cost abstraction). The library function returns its data through memory on the caller's stack, and std::optional uses a separate byte at rsp + 8 as its discriminant instead of using null to represent nullopt. C++ has no language-level way for std::optional to take advantage of an invalid value like this, and so using the abstraction incurs a runtime cost.

Rust, of course, can do that because of its niche optimizations.
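
You can observe the niche directly by comparing sizes. This is a small sketch with a helper name (option_sizes) of my own choosing. Box<i32> can never be null, so Option<Box<i32>> reuses the null bit pattern for None and needs no extra discriminant; i32 has no spare bit pattern, so Option<i32> has to grow.

```rust
use std::mem::size_of;

/// Sizes in bytes of (Box<i32>, Option<Box<i32>>, i32, Option<i32>).
/// The helper name is hypothetical.
pub fn option_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<Box<i32>>(),
        size_of::<Option<Box<i32>>>(),
        size_of::<i32>(),
        size_of::<Option<i32>>(),
    )
}
```

The first two sizes are equal, which is why the Rust function above can return its Option<Box<i32>> in rax and test it with a plain null check, while the last two differ.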

I will continue to update this post with more annoyances when I come across them.