Discussion: New `unchecked` keyword
A brief note
Rust is a language that aims to empower everyone to build reliable and efficient software. To do so,
several things are built into the language’s semantics, most notably lifetimes and the `unsafe` keyword.
To put it in one stroke: through lifetime subtyping and variance, the compiler ensures that when a safe
reference is created (by using `&ident`), it is valid throughout its usage (in the region where the
lifetime is valid).
Hedgy: Wait, did you just say that the `unsafe` keyword is a significant part of building reliable software with Rust? Now you're just contradicting yourself :/
Sayan: Hold on, I'm not done yet!
The `unsafe` keyword marks regions of code where the person writing the code has to uphold the
guarantees that the compiler would otherwise enforce. But how does this help in writing safe code?
Well, an `unsafe` block signals that this specific section of code is gnarly, and that if something is
going wrong, this block is the first place to check.
Hedgy: Ah! Now I see.
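As a small illustration of that division of labour, here is a minimal sketch. The function name `first_or_zero` is made up for this example; `get_unchecked` is the standard library's slice method that skips the bounds check. The emptiness check before the `unsafe` block is the guarantee that the programmer, rather than the compiler, has to uphold.

```rust
/// Returns the first element, or 0 for an empty slice.
/// `first_or_zero` is a hypothetical helper used only for illustration.
fn first_or_zero(values: &[u32]) -> u32 {
    if values.is_empty() {
        return 0;
    }
    // SAFETY: the slice is non-empty (checked above), so index 0 is in bounds.
    unsafe { *values.get_unchecked(0) }
}

fn main() {
    assert_eq!(first_or_zero(&[7, 8, 9]), 7);
    assert_eq!(first_or_zero(&[]), 0);
    println!("ok");
}
```

If this function ever misbehaves, the `unsafe` block and the check guarding it are the first things to audit.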
In this post, I’d like to discuss another check that can help ensure the correctness of programs:
none other than the `unchecked` keyword. Do note that Rust’s goal is to ensure memory safety: leaks
are fine, and so is incorrect logic – because only a flavor of Skynet could ensure that your program is
logically correct by determining your intentions ;)
Motivation
Let’s say we’re in a fictitious world where financial institutions have more crabs 🦀 in their offices than
cups of coffee ☕. One such bank has a library that its developers use to program systems that interact
with the bank, and this library has an extreme level of access to the bank’s transactions. The person who
created the library has an `Overdraft` type, declared as follows:
struct Overdraft {
    available: u64,
    limit: u64,
}
Apart from all the other associated functions, we’ll look at the `withdraw` and `withdraw_unchecked`
functions:
impl Overdraft {
    /// Check the available balance and limit, only withdrawing when they are fine
    pub fn withdraw(&mut self, howmuch: u64) -> Result<(), ()> {
        // the amount must fit within both the available balance and the limit
        if self.available >= howmuch && self.limit >= howmuch {
            // some very long code block of internal bank
            // stuff that we won't bother with
            self.withdraw_unchecked(howmuch);
            Ok(())
        } else {
            Err(())
        }
    }
    /// Use this when the withdrawal is an emergency and is authorized by
    /// bank stakeholders
    pub fn withdraw_unchecked(&mut self, howmuch: u64) {
        // some very long code block of internal bank
        // stuff that we won't bother with
        self.available -= howmuch;
    }
}
The `withdraw` function is completely fine: it checks the limit and the available balance before the
withdrawal, while the `withdraw_unchecked` function skips those checks and simply withdraws the
money. In reality, the `withdraw` function would have far more going on than a simple conditional:
think of calculating a credit score, asking some intermediary organization or provider, et cetera.
This means that the `withdraw` function is actually pretty expensive to call. That’s why the
developers of the library provided the `withdraw_unchecked` function, which lets the bank’s
developers skip those expensive checks and allow the withdrawal immediately.
What can possibly go wrong? Well, in a rush, one of the bank’s developers calls `withdraw_unchecked`
where `withdraw` should have been called; in that case, the account holder might be allowed to
borrow far more than the overdraft allows! This is a logic error (a terrible one, and the bank can
definitely go mad at the dev who wrote the program). For whatever reason (a broken test suite,
deadlines, et al.), this made it to the production channel... and the bank lost a lot of money.
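The failure mode can be reproduced in a few lines. This is only a sketch: the account figures are invented, the "very long" internal logic is left out, and `limit` is assumed to be the most that may be withdrawn in one go.

```rust
struct Overdraft {
    available: u64,
    limit: u64,
}

impl Overdraft {
    /// Checked path: refuses anything above the balance or the limit.
    pub fn withdraw(&mut self, howmuch: u64) -> Result<(), ()> {
        if howmuch <= self.available && howmuch <= self.limit {
            self.withdraw_unchecked(howmuch);
            Ok(())
        } else {
            Err(())
        }
    }

    /// Unchecked path: subtracts without looking at the limit.
    pub fn withdraw_unchecked(&mut self, howmuch: u64) {
        self.available -= howmuch;
    }
}

fn main() {
    // Made-up figures for illustration.
    let mut account = Overdraft { available: 1_000, limit: 100 };

    // The checked call rejects a withdrawal above the limit...
    assert!(account.withdraw(500).is_err());
    assert_eq!(account.available, 1_000);

    // ...but the unchecked call lets the same amount through.
    account.withdraw_unchecked(500);
    assert_eq!(account.available, 500);
    println!("ok");
}
```

Nothing at the call site distinguishes the two calls except the function name, which is exactly the problem.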
The bank’s CEO called the CTO and told them to make sure the responsible devs never let this happen
again. The CTO passed this on to the devs who maintain the “dangerous library” that is used to make
low-level changes to the bank’s transactions, and they wanted to do something about it. They thought
they could improve the docs and print in HUGE BLOCK LETTERS all over the workspace that the
`*_unchecked` set of functions is only to be called responsibly. But they soon realized that people
might still end up calling the library’s privileged functions, and that this had to be prevented
right at the library level. So, how could they do it?
With Rust today, the solution these devs could have adopted would be to mark the function `unsafe`.
To use the function, users of the library would then have to wrap the call in an `unsafe` block,
which would definitely remind them that they are doing something risky. However, that is not
the intended use of the `unsafe` keyword. The Rust std documentation describes `unsafe` as:
Code or interfaces whose memory safety cannot be verified by the type system.
The `unsafe` keyword has two uses: to declare the existence of contracts the compiler can’t check (`unsafe fn` and `unsafe trait`), and to declare that a programmer has checked that these contracts have been upheld (`unsafe {}` and `unsafe impl`, but also `unsafe fn` – see below). They are not mutually exclusive, as can be seen in `unsafe fn`. [1]
As you can see, the clear intention of the `unsafe` keyword is to demarcate regions where memory safety
cannot be guaranteed by the compiler. In the scenario above, however, simply subtracting a value cannot
cause memory unsafety; at worst, it causes an arithmetic underflow (a panic in debug builds and
wrap-around in release builds, by default), which does not introduce any memory unsafety [2].
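To make the workaround concrete, here is a rough sketch of what marking the function `unsafe` would look like. The struct is trimmed down to a single field, and the subtraction uses `wrapping_sub` so that the wrap-around release builds perform by default is visible in any build mode; both choices are assumptions made for illustration, not part of the library described above.

```rust
struct Overdraft {
    available: u64,
}

impl Overdraft {
    /// # Safety
    ///
    /// There is no memory-safety contract here at all; `unsafe` is
    /// (mis)used only to make callers stop and think.
    pub unsafe fn withdraw_unchecked(&mut self, howmuch: u64) {
        // Plain integer arithmetic: nothing in this body needs `unsafe`.
        self.available = self.available.wrapping_sub(howmuch);
    }
}

fn main() {
    let mut account = Overdraft { available: 10 };

    // The caller is forced to write an `unsafe` block...
    unsafe { account.withdraw_unchecked(11) };

    // ...yet the worst outcome is a wrapped balance: wrong, but memory is intact.
    assert_eq!(account.available, u64::MAX);
    println!("ok");
}
```

The `# Safety` section has nothing to say about memory, which is a hint that `unsafe` is the wrong tool for this job.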
So, what would be a language-level way to ensure correctness? That’s where I’d like to propose an
`unchecked` keyword.
Hedgy: And, the worst example of the year 2022 award goes to Sayan
Sayan: Whatever! You get the point, right?
Hedgy: Yeah, you're trying to reduce logic errors.
Sayan: Right!
But the `unchecked` keyword does more than attempt to reduce logic errors; it also reduces the
ambiguity [3] surrounding `unsafe` functions. Today, when an external user of the library finds an
`unsafe` function, is it an invariant that they have to uphold to ensure memory safety, or one that
they have to uphold to ensure logical correctness? The only way to find out is to rely on the
documentation that the crate provides or, as a last resort, to check the implementation. With
`unchecked`, it is immediately clear that the function call won’t invalidate any memory safety
contracts, but might cause correctness errors that do not lead to memory unsafety.
Unchecked functions
An `unchecked` function is one that doesn’t cause any memory unsafety, but can cause logical
inaccuracies. It is declared just like an `unsafe` function:
unchecked fn withdraw_unchecked() {
    // ... something silly ...
}
where you have `unchecked` in the function signature. To call an `unchecked` function, I propose two
solutions:
- Use `unchecked` code blocks: `unchecked { silly_a(); silly_b(); }`
- Use `unchecked` before the function call: `unchecked silly_a(); unchecked silly_b();`
unchecked and unsafe overlap
Another important note: for a set A of `unchecked` function calls/definitions and a set B of `unsafe`
function calls/definitions, A ∩ B = ∅; that is, there is no overlap between `unchecked` and `unsafe`
functions. What does this mean?
Let’s say that you have the following functions:
unsafe fn bad_a() {}
unchecked fn silly_a() {}
If you decide to run the below, it will cause an error:
unsafe {
    bad_a(); // this won't error
    silly_a(); // but this will
}
The converse is also true:
unchecked {
    silly_a(); // this won't error
    bad_a(); // this will error
}
The correct way to call them would be:
unsafe {
    bad_a();
}
unchecked {
    silly_a();
}
Now you’re going to say – the developer needs to write software responsibly. Well, I’m going to quote Esteban here:
There are no bad programmers, only insufficiently advanced compilers – Esteban Küber [4]
Being someone in the Rust community, I don’t think you’re going to disagree :D
Alternatives
A possible alternative for someone implementing and using an unchecked function (not possible with
a library) is to use a macro. The macro, say one called `unchecked`, would be used to declare
unchecked functions like below:
unchecked! {
    pub fn silly() -> &'static str {
        "You silly"
    }
}
And be called like:
let silly = unchecked_call!(silly);
The macro simply prepends the `unchecked_` prefix to the function name, so that calling `silly`
directly won’t work and only calling `unchecked_silly` will. This is a very limited workaround,
because rustdoc will output the name `unchecked_silly` in the generated documentation anyway,
defeating the purpose of having the macro in the first place. This was also suggested by a user [5].
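For the curious, here is one way such a pair of macros could be approximated with plain `macro_rules!`. Note that this is a variation, not the prefix approach described above: as far as I know, stable `macro_rules!` cannot glue identifiers together without a helper crate, so this sketch hides the function inside a module instead. The module name `unchecked_fns` is an assumption made for this example.

```rust
// Hypothetical stand-in for `unchecked!`: rather than renaming the function,
// it moves the declared items into a module called `unchecked_fns`.
// Limitation: it can only be invoked once per enclosing module.
macro_rules! unchecked {
    ($($item:item)*) => {
        pub mod unchecked_fns {
            #[allow(unused_imports)]
            use super::*;
            $($item)*
        }
    };
}

// Hypothetical stand-in for `unchecked_call!`: the only convenient way
// to reach a function declared through `unchecked!`.
macro_rules! unchecked_call {
    ($name:ident) => {
        unchecked_fns::$name()
    };
    ($name:ident ( $($arg:expr),* )) => {
        unchecked_fns::$name($($arg),*)
    };
}

unchecked! {
    pub fn silly() -> &'static str {
        "You silly"
    }
}

fn main() {
    // A bare `silly()` would not compile here: the name is not in scope.
    let silly = unchecked_call!(silly);
    assert_eq!(silly, "You silly");
    println!("{silly}");
}
```

It shares the weakness of the prefix approach: anyone can still write `unchecked_fns::silly()` by hand, and the generated documentation shows the module as-is.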
How we should look at unsafe and unchecked
Finally, I’d like to exactly define what each keyword means:
- `unsafe`: Any function marked as `unsafe` informs the caller that the function will cause memory unsafety if the mentioned invariants (preferably in the `## Safety` part in rustdoc) are not upheld
- `unchecked`: Any function marked as `unchecked` informs the caller that the function does not cause memory unsafety, but if the mentioned invariants (preferably in a `## Correctness` part in rustdoc) are not upheld, then it may cause logic errors
I hope this clears up the ambiguity between `unsafe` and `unchecked`. Now one might argue, “but logic
errors can happen anywhere.” Sure, they can. But here, by using the `unchecked` keyword, you are
asserting that you’re the one upholding the invariants required for correctness when calling this method.
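In rustdoc terms, the two kinds of contract might be documented as in the sketch below. The `Account` type and both methods are invented for this example. The `# Safety` heading (written with a single `#`) is the form the standard library docs use, while the `# Correctness` heading is just the convention this post proposes, not an established one. Since `unchecked` is not a real keyword, the second method is an ordinary safe function here.

```rust
pub struct Account {
    balance: u64,
    history: Vec<u64>,
}

impl Account {
    /// Returns the `index`-th recorded withdrawal without a bounds check.
    ///
    /// # Safety
    ///
    /// `index` must be less than the number of recorded withdrawals;
    /// otherwise this is undefined behaviour.
    pub unsafe fn past_withdrawal_unchecked(&self, index: usize) -> u64 {
        // SAFETY: the caller guarantees that `index` is in bounds.
        unsafe { *self.history.get_unchecked(index) }
    }

    /// Withdraws `amount` without consulting any limits.
    ///
    /// # Correctness
    ///
    /// `amount` must not exceed the balance; otherwise the balance saturates
    /// at zero and the books no longer add up. Memory safety is never at risk.
    pub fn withdraw_unchecked(&mut self, amount: u64) {
        self.balance = self.balance.saturating_sub(amount);
        self.history.push(amount);
    }
}

fn main() {
    let mut account = Account { balance: 100, history: Vec::new() };
    account.withdraw_unchecked(30);
    assert_eq!(account.balance, 70);

    // SAFETY: exactly one withdrawal has been recorded, so index 0 is valid.
    let first = unsafe { account.past_withdrawal_unchecked(0) };
    assert_eq!(first, 30);
    println!("ok");
}
```

Breaking the first contract is undefined behaviour; breaking the second only produces a wrong balance.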
Further discussion
This post is intended to start a discussion, and there’s a lot more that can be added here. For
example, what about `unchecked trait`s and `unchecked impl`s? Similarly, what about
`unchecked unsafe fn`s, that is, an `unsafe` function that is also `unchecked`? These are some
unresolved questions that I can think of. Also, is this worth adding to the language, given that it
might only add to the language’s complexity?
As some users have pointed out [6][7], a clippy lint or something similar for `unchecked` functions,
describing the exact kind of correctness invariant that the library user has to ensure, might also be
a good idea.
I’d love to know what you think. If you’re on a social site where you saw this post, consider dropping a reply to the post or shoot me a DM on Twitter if you’d like to attack me personally ;)
https://rust-lang.github.io/rfcs/0560-integer-overflow.html
https://www.reddit.com/r/rust/comments/t3di49/comment/hyrrgls
https://www.reddit.com/r/rust/comments/t3di49/comment/hyrye8z
https://www.reddit.com/r/rust/comments/t3di49/comment/hyroh6e
https://www.reddit.com/r/rust/comments/t3di49/comment/hyrrfk7