Part 1. Calling C functions (5 points)
We saw last time (and will see a little more this time) how Linux uses the
procfs virtual file system (which is usually mounted at /proc) to expose
information to user processes.
This is not the only mechanism that operating systems use. Every modern OS works by enforcing a strict separation between user processes (the ones we run) and the operating system kernel. This is essential to prevent, among other things, buggy programs from crashing the whole computer. Unfortunately, it comes at a cost. If a user process wants information from the OS, it cannot simply call a function in the kernel. Instead, it makes a system call.
System calls are a mechanism for requesting the OS do something on your behalf. Some examples of system calls in different application domains:
- Open a file
- Read data from a file
- Write data to a file
- Get the current time of day
- Run a new program (more about this in a later lab)
- Terminate a program (we’re going to do this!)
- Generate random numbers in a secure fashion
- Open a network connection to a server (more about this in a later lab)
- Send or receive data over the network to/from the server
- Exit the current process
Every program you’ve written has used at least one system call.
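To make that list a little more concrete, here is a minimal sketch that uses only the Rust standard library. The helper name round_trip and the file name syscall_demo.txt are made up for illustration, and the system call names in the comments are what these operations typically turn into on Linux; the exact calls can vary with the OS and library version.

```rust
use std::fs::{self, File};
use std::io::{Read, Write};
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical helper: write `message` to `path`, read it back, delete the file.
fn round_trip(path: &Path, message: &str) -> std::io::Result<String> {
    let mut file = File::create(path)?; // typically openat
    file.write_all(message.as_bytes())?; // typically write
    drop(file); // typically close

    let mut contents = String::new();
    File::open(path)?.read_to_string(&mut contents)?; // typically openat + read
    fs::remove_file(path)?; // typically unlink
    Ok(contents)
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("syscall_demo.txt");
    let contents = round_trip(&path, "hello from user space\n")?;
    print!("{contents}"); // typically write, on standard output

    // Typically clock_gettime (often answered without entering the kernel).
    let now = SystemTime::now();
    println!("time since the Unix epoch: {:?}", now.duration_since(SystemTime::UNIX_EPOCH));

    // Returning from main ends the process, which is itself a system call.
    Ok(())
}
```

On Linux you can run a program like this under strace to see which system calls it actually makes.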
Most of the time, programmers do not issue system calls themselves. The reason for this is that the system call mechanism is not portable, meaning it’s different on every OS. Nevertheless, we can write code like
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the file; the handle is unused here, hence the leading underscore.
    let _file = File::open("example.txt")?;
    Ok(())
}
and it can be compiled for any (modern) OS and it’ll open the file.
The way this works is that every OS exposes a set of functions in a library
which will make the relevant system calls. For example, Apple calls their
library libSystem. Even if the underlying system calls themselves change, as
part of an update for example, the OS vendor (e.g., Microsoft or Apple)
updates this library to make the updated system calls on the program’s behalf.
Now we hit a bit of a snag. There are a million programming languages. Each
language can have wildly different ideas about what a function is. One way to
deal with this would be to have a library per programming language. E.g., we
could have a libSystem-Python, libSystem-Java, libSystem-Rust, etc. This
quickly becomes unworkable.
Instead, the C programming language is the de facto programming language for
these libraries. And indeed, the most common name for the library containing
functions that make system calls is libc. On an x86-64 Linux system, you’re
likely to find the GNU version of libc living at
/lib/x86_64-linux-gnu/libc.so.6.
This is really unusual, but you can actually run this libc as if it were a
program and it’ll print out some information about the library. Almost no
library does this.
$ /lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1.6) stable release version 2.27.
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 7.5.0.
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs>.
Okay, so if C is the de facto language, how can we make system calls in Rust if the Rust standard library doesn’t already have the functionality we need? Well, we call C functions! Nearly every programming language has some mechanism, often called a Foreign Function Interface (FFI), that lets you call C functions. Rust is no exception.
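As a rough sketch of what FFI looks like in Rust, the program below declares the C function getpid() by hand and calls it, then compares the answer with std::process::id(). The wrapper name c_getpid is made up for illustration, and the declaration assumes Linux, where pid_t is a 32-bit signed integer. In the lab itself you would use the declarations that the libc crate already provides instead of writing your own. (The unsafe keyword is explained further down.)

```rust
use std::os::raw::c_int;

// Hand-written declaration of a function that lives in the system's C library.
// On toolchains older than Rust 1.82, drop the `unsafe` in front of `extern`.
unsafe extern "C" {
    // C prototype: pid_t getpid(void);
    fn getpid() -> c_int;
}

/// Hypothetical safe wrapper around the C function.
fn c_getpid() -> u32 {
    // SAFETY: getpid() takes no arguments, touches none of our memory,
    // and is documented as always successful.
    let pid = unsafe { getpid() };
    pid as u32
}

fn main() {
    println!("getpid() via FFI:   {}", c_getpid());
    println!("std::process::id(): {}", std::process::id());
}
```

Both lines should print the same number, since the standard library asks the OS the same question on our behalf.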
In Rust, the functions in libc are exposed in the libc crate. Here’s the
documentation for x86-64 Linux.
As part of your implementation of ps, you printed out the user id (UID) of
the user that is running the process. You’re going to write some code to turn
the UID into a String containing the user’s username.
Your Task
You’re going to implement the most bare-bones version of the standard whoami
command-line utility that prints out the user name of the user who runs the
process. E.g., here’s the standard whoami on my laptop.
$ whoami
steve
Pretty straightforward. Unfortunately, there’s no Rust standard library function to get this information. We’re going to have to write it ourselves!
First, add the libc crate as a dependency by running
$ cargo add libc
in your assignment repository.
Next, create a file src/bin/whoami.rs. By putting the file in bin,
$ cargo build will build two different applications and $ cargo run now needs
to know which binary to run. Write a simple main function in whoami.rs
that prints out a message and exits. You can (build and) run this via
$ cargo run --bin whoami
Hello world!
If you have two or more binaries and you run $ cargo run without specifying
which to run, you’ll get an error.
Getting the user name for the user who is running the current process requires two steps:
- Ask the OS for the UID of the user
- Ask the OS for the username associated with the UID.
The C function we need to call for the first step is geteuid().
If you click that link, you’ll find no real documentation. That’s a shame, but
not surprising as the libc crate is generated programmatically and it doesn’t
include documentation. Fortunately, there are manual pages for C functions.
Here’s the man page for geteuid(2). Give it a read. (Note that it is in
section 2 of the manual. Section 1 is where user programs like cat are
documented. Section 2 is where system calls are documented. Section 3 is where
C standard library functions are documented. Run $ man man for a list of the
different sections.)
There are two functions documented, getuid() which returns the real UID (we
don’t want this), and geteuid() which returns the effective UID (this is the
one we want). These are usually the same. They differ when a set-user-ID
program runs: the sudo executable, for example, is owned by the root user
(which has UID 0), so while sudo itself is running, geteuid() returns 0 and
getuid() returns the UID of the user who launched it. (By default, sudo then
sets both IDs to the target user before running the command you asked for.)
The standard whoami reports the effective user, so that is the one we want:
$ whoami
steve
$ sudo whoami
root
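If you want to see the two IDs side by side, here is a minimal sketch. It declares getuid() and geteuid() by hand rather than going through the libc crate, and it assumes Linux, where uid_t is an unsigned 32-bit integer. The wrapper names real_uid and effective_uid are made up for illustration.

```rust
// Hand-written declarations; the libc crate provides equivalent ones.
// On toolchains older than Rust 1.82, drop the `unsafe` in front of `extern`.
unsafe extern "C" {
    // C prototypes: uid_t getuid(void); uid_t geteuid(void);
    fn getuid() -> u32;
    fn geteuid() -> u32;
}

/// Hypothetical wrapper: the real UID of this process.
fn real_uid() -> u32 {
    // SAFETY: getuid() takes no arguments and is documented as always successful.
    unsafe { getuid() }
}

/// Hypothetical wrapper: the effective UID of this process.
fn effective_uid() -> u32 {
    // SAFETY: geteuid() takes no arguments and is documented as always successful.
    unsafe { geteuid() }
}

fn main() {
    let (real, effective) = (real_uid(), effective_uid());
    println!("real UID:      {real}");
    println!("effective UID: {effective}");
    if real != effective {
        println!("The two differ, so this process is running with someone else's privileges.");
    }
}
```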
Every function in libc is marked unsafe. This indicates that the function
might do something that violates Rust’s safety guarantees and it’s up to the
programmer to ensure that it doesn’t. We can only call an unsafe function
within an unsafe function or within an unsafe {} block.
Because of this, it is considered best practice to encapsulate anything unsafe behind a safe interface. It’s up to you to perform error handling.
The geteuid() function is unusual in that it cannot fail. There is no return
value that indicates an error condition (I know this because I read the man
page, “These functions are always successful…”). As a result, we can
trivially make a safe wrapper around geteuid by calling libc::geteuid()
inside an unsafe {} block and returning the result.
extern crate libc;

type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

fn geteuid() -> u32 {
    unsafe { libc::geteuid() }
}

fn main() {
    let uid = geteuid();
    println!("The current effective UID is {uid}");
}
There are a handful of libc functions like getpid(), getppid(), and
getpgrp() which just return a numeric identifier and can be handled by just
wrapping the call in an unsafe {} block. Most system calls are sadly not
this easy to deal with. We’ll need to carefully handle memory and any errors
that result.
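To give a feel for what "carefully handle memory and any errors" means, here is a sketch of a safe wrapper around a different C function, gethostname(3), which fills in a caller-provided buffer and returns -1 on failure. This is not part of the assignment. The wrapper name hostname and the 256-byte buffer size are choices made for this example, and the function is declared by hand here; the libc crate should expose an equivalent declaration.

```rust
use std::io;
use std::os::raw::{c_char, c_int};

// Hand-written declaration of the C function.
// On toolchains older than Rust 1.82, drop the `unsafe` in front of `extern`.
unsafe extern "C" {
    // C prototype: int gethostname(char *name, size_t len);
    fn gethostname(name: *mut c_char, len: usize) -> c_int;
}

/// Hypothetical safe wrapper: returns the host name or the OS error.
fn hostname() -> io::Result<String> {
    // We own the buffer; C only borrows it for the duration of the call.
    let mut buf = vec![0u8; 256];

    // SAFETY: the pointer is valid for writes of buf.len() bytes, and we pass
    // that exact length so the C function cannot write past the end.
    let rc = unsafe { gethostname(buf.as_mut_ptr() as *mut c_char, buf.len()) };
    if rc != 0 {
        // A nonzero return means failure; errno says why.
        return Err(io::Error::last_os_error());
    }

    // C strings end at the first NUL byte. If there is none, keep the whole buffer.
    let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    buf.truncate(end);

    // The bytes are not guaranteed to be UTF-8, so report that as an error too.
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() {
    match hostname() {
        Ok(name) => println!("hostname: {name}"),
        Err(err) => eprintln!("gethostname failed: {err}"),
    }
}
```

The shape is the part worth remembering: allocate the memory in Rust, keep the unsafe block as small as possible, check the return value, and convert the raw bytes into a Rust type before anything leaves the wrapper.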
Speaking of other system calls, you now know how to get the effective UID of the user running the process. Neat!
Turning that into a user name is more complex and can fail.
Create a new file src/user.rs.