Part 1. Calling C functions - CSCI 241 Systems Programming Labs

Part 1. Calling C functions (5 points)

We saw last time (and will see a little more this time) how Linux uses the procfs virtual file system (which is usually mounted at /proc) to expose information to user processes.

This is not the only mechanism that operating systems use. Every modern OS works by enforcing a strict separation between user processes (the ones we run) and the operating system kernel. This is essential to prevent, among other things, buggy programs from crashing the whole computer. Unfortunately, it comes at a cost. If a user process wants information from the OS, it cannot simply call a function in the kernel. Instead, it makes a system call.

System calls are a mechanism for requesting that the OS do something on your behalf. Some examples of system calls in different application domains:

Open a file
Read data from a file
Write data to a file
Get the current time of day
Run a new program (more about this in a later lab)
Terminate a program (we’re going to do this!)
Generate random numbers in a secure fashion
Open a network connection to a server (more about this in a later lab)
Send or receive data over the network to/from the server
Exit the current process

Every program you’ve written has used at least one system call.

The system call mechanism is not “portable” (meaning it’s different for every OS), so most of the time programmers don’t issue system calls themselves. Nevertheless, we can write code like

use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the file for reading; `?` returns early if opening fails.
    let _file = File::open("example.txt")?;
    Ok(())
}

and it can be compiled for any (modern) OS and it’ll open the file.

The way this works is that every OS exposes a set of functions in a library which will make the relevant system calls. For example, Apple calls their library libSystem. Even if the underlying system calls themselves change, as part of an update for example, the OS vendor (e.g., Microsoft or Apple) updates this library to make the updated system calls on the program’s behalf.

However, there is one issue. There are a million different programming languages. Each language can have wildly different ideas about what a function is. Making a different library with a different set of functions for every language would be unmanageable. Instead, the C programming language is the de facto programming language for these libraries. And indeed, the most common name for the library containing functions that make system calls is libc. On an x86-64 Linux system, you’re likely to find the GNU version of libc living at /lib/x86_64-linux-gnu/libc.so.6.

This is really unusual, but you can actually run this libc as if it were a program and it’ll print out some information about the library.

$ /lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1.6) stable release version 2.27.
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 7.5.0.
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs>.

Okay, so if C is the de facto language, how can we make system calls in Rust if the Rust standard library doesn’t already have the functionality we need? Well, we call C functions! Nearly every programming language has some mechanism, often called a Foreign Function Interface (FFI), that lets you call C functions. Rust is no exception.
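To make the idea of an FFI a little more concrete, here is a minimal sketch that declares the C library's strlen() by hand and calls it from Rust. The wrapper name c_strlen is made up for this example, and the `unsafe extern` spelling assumes Rust 1.82 or newer (older toolchains write plain `extern "C"`).

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Declare the C function we want to call. The Rust compiler trusts this
// signature, so it has to match the C prototype:
//     size_t strlen(const char *s);
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper (hypothetical name). A `&CStr` always points at a valid
/// NUL-terminated string, which is exactly what strlen() requires.
fn c_strlen(s: &CStr) -> usize {
    // Calling a foreign function requires an unsafe block.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CString::new("hello").expect("no interior NUL bytes");
    println!("strlen says {}", c_strlen(&s));
}
```

Writing declarations like this by hand is error-prone, since nothing checks them against the real C prototype.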

In Rust, the functions in libc are exposed in the libc crate. Here’s the documentation for x86-64 Linux.

As part of your implementation of ps, you printed out the user ID (UID) of the user that is running the process. You’re going to write some code to turn the UID into a String containing the user’s username.

Your Task

You’re going to implement the most bare-bones version of the standard whoami command-line utility that prints out the user name of the user who runs the process. E.g., here’s the standard whoami on my laptop.

$ whoami
steve

Pretty straightforward. Unfortunately, there’s no Rust standard library function to get this information. We’re going to have to write it ourselves!

First, in your assignment repository, add the libc crate as a dependency by running

$ cargo add libc

Next, create a file src/bin/whoami.rs. By putting the file in bin, $ cargo build will build two different applications and $ cargo run now needs to know which binary to run. Write a simple main function in whoami.rs that prints out a message and exits. You can (build and) run this via

$ cargo run --bin whoami
Hello world!

If you have two or more binaries and you run $ cargo run without specifying which to run, you’ll get an error.

Getting the username for the user who is running the current process requires two steps:

Ask the OS for the UID of the user
Ask the OS for the username associated with the UID.

The C function we need to call for 1 is geteuid(). If you click that link, you’ll find no real documentation. That’s a shame, but not surprising as the libc crate is generated programmatically and it doesn’t include documentation. Fortunately, there are manual pages for C functions. Here’s the man page for geteuid(2). Give it a read. (Note that it is in section 2 of the manual. Section 1 is where user programs like cat are documented. Section 2 is where system calls are documented. Section 3 is where C standard library functions are documented. You can run $ man man for a list of the different sections.)

There are two functions documented: getuid(), which returns the real UID (we don’t want this), and geteuid(), which returns the effective UID (this is the one we want). These are usually the same. The most common way they can differ is in a set-user-ID program such as sudo itself: the sudo binary is owned by the root user (which has UID 0) and has the setuid bit set, so while sudo is running, geteuid() returns 0 and getuid() returns the UID of the user who launched it. (By default, sudo then sets both IDs to the target user before running your command.) The standard whoami reports the effective user:

$ whoami
steve

$ sudo whoami
root
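If you'd like to see both IDs side by side, the sketch below prints them. To stay free of dependencies it declares the two C functions by hand, assuming uid_t is a 32-bit unsigned integer as it is on Linux; with the libc crate you would call libc::getuid() and libc::geteuid() instead. The helper name uids is made up for this example, and the `unsafe extern` spelling assumes Rust 1.82 or newer.

```rust
// Hand-written declarations. Assumption: uid_t is u32 on this platform.
unsafe extern "C" {
    fn getuid() -> u32;
    fn geteuid() -> u32;
}

/// Returns (real UID, effective UID) of the calling process.
/// Hypothetical helper; both C functions always succeed.
fn uids() -> (u32, u32) {
    // Calling a foreign function requires an unsafe block.
    unsafe { (getuid(), geteuid()) }
}

fn main() {
    let (real, effective) = uids();
    println!("real UID: {real}, effective UID: {effective}");
    if real != effective {
        println!("this process is running with a different effective user");
    }
}
```

In an ordinary shell session the two numbers are expected to match.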

Important

Every function in libc is marked unsafe. This indicates that the function might do something that violates Rust’s safety guarantees and it’s up to the programmer to ensure that it doesn’t. We can only call an unsafe function within an unsafe function or within an unsafe {} block.

Because of this, it is considered best practice to encapsulate anything unsafe behind a safe interface. It’s up to you to perform error handling.

The geteuid() function is unusual in that it cannot fail. There is no return value that indicates an error condition (I know this because I read the man page, “These functions are always successful…”). As a result, we can trivially make a safe wrapper around geteuid() by calling libc::geteuid() inside an unsafe {} block and returning the result.

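extern crate libc; // optional since the 2018 edition

// Convenience alias for functions that can fail; geteuid() does not need it.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

/// Safe wrapper: geteuid() takes no arguments and always succeeds.
fn geteuid() -> u32 {
    unsafe { libc::geteuid() }
}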

fn main() {
    let uid = geteuid();
    println!("The current effective UID is {uid}");
}

There are a handful of libc functions like getpid(), getppid(), and getpgrp() which just return a numeric identifier and can be handled by just wrapping the call in an unsafe {} block. Most system calls are sadly not this easy to deal with. We’ll need to carefully handle memory and any errors that result.
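As a sketch of that pattern, here are wrappers for getpid() and getppid(). The names pid and parent_pid are made up for this example. To keep it dependency-free, the C functions are declared by hand, assuming pid_t is a signed 32-bit integer as it is on Linux; with the libc crate you would call libc::getpid() and libc::getppid() instead. The `unsafe extern` spelling assumes Rust 1.82 or newer.

```rust
// Hand-written declarations. Assumption: pid_t is i32 on this platform.
unsafe extern "C" {
    fn getpid() -> i32;
    fn getppid() -> i32;
}

/// Process ID of the calling process. getpid() cannot fail.
fn pid() -> i32 {
    unsafe { getpid() }
}

/// Process ID of the parent process. getppid() cannot fail either.
fn parent_pid() -> i32 {
    unsafe { getppid() }
}

fn main() {
    println!("pid {} (parent {})", pid(), parent_pid());
    // The standard library also exposes the current PID, which gives us
    // an easy cross-check of the wrapper.
    assert_eq!(pid() as u32, std::process::id());
}
```

Because neither function takes pointers or reports errors, the safe wrappers need no extra checks.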

With that wrapper, you now know how to get the effective UID of the user running the process. Neat!

Turning that into a user name is more complex and can fail.