Part 3. Handling signals (30 points)
When the user types Ctrl-C, the kernel sends a signal to every process in the
foreground process group of the controlling terminal (review this info box for
details on process groups and controlling terminals). By default, this signal,
named SIGINT (the interrupt signal), causes the process that receives it to
terminate.
Other signals can also be sent in response to keys being pressed. SIGINT is by
far the most common signal sent by a key press, but Ctrl-\ sends SIGQUIT, the
quit signal. (Go ahead, try running a long-running program like $ sleep 10 and
kill it by typing Ctrl-C or Ctrl-\.) Note that this is different from Ctrl-D,
which closes stdin but doesn’t terminate processes.
A process can choose to handle a signal by registering a signal handler function which will be called in the event of a signal. Or, a process can choose to ignore a signal if it doesn’t want the default action—often termination—to occur.
Shells need to deal with SIGINT and SIGQUIT to avoid being terminated.
When Bash receives a SIGINT, it stops reading its current line of input,
prints a newline and a prompt, and waits for the next line of input. When Bash
receives a SIGQUIT, it ignores it.
A process can choose to handle or ignore a signal using the sigaction(2)
system call (by calling the libc wrapper function as usual). The first
argument is the signal number. The second argument is a pointer to a sigaction
structure (yes, the struct and the function have the same name) that controls
the action that should be taken when that signal is received. The third
argument is a pointer to a sigaction struct that will be filled out by
sigaction() with the previous action.
These are the Rust definitions for the libc::sigaction() function and the libc::sigaction struct.
To ignore a particular signal, you create a libc::sigaction struct with its
sa_sigaction field set to libc::SIG_IGN and pass it to libc::sigaction().
When this example program is run, SIGINT is ignored and then the process
goes to sleep for 10 seconds. During this time, pressing Ctrl-C has no effect.
use std::io;

fn main() -> io::Result<()> {
    unsafe {
        let action = libc::sigaction {
            sa_sigaction: libc::SIG_IGN,
            ..std::mem::zeroed()
        };
        if libc::sigaction(libc::SIGINT, &action, std::ptr::null_mut()) < 0 {
            return Err(io::Error::last_os_error());
        }
    }
    std::thread::sleep(std::time::Duration::from_secs(10));
    Ok(())
}
One thing to notice is that action was constructed using .. to fill the rest
of the fields other than sa_sigaction. See the Book for details about the ..
syntax. This is because different OSes use different structures with different
members. We only need to use the sa_sigaction field, which appears in all of
the different sigaction structs. Essentially, doing it this way makes the code
slightly more portable. For example, it works on both macOS and Linux.
The call to std::mem::zeroed() creates a libc::sigaction struct in which all
of the fields have been set to 0. In general, doing this is unsafe, which is
why it is in the unsafe block. It’s safe in this particular case because all
of the fields may be safely set to 0.
To install a signal handler, you set the sa_sigaction field to the address of
a function.
When a signal is received for which a signal handler has been registered, the kernel will run the handler. But the question is when does it run it? The answer is it’s a little unpredictable. It will be run the next time the kernel returns control to the process. This can happen as the result of returning from a system call or simply because the kernel has scheduled the process as the next process to run.
Since signal handler functions can be run at any time, the actions a signal
handler can take are extremely limited. Only a very small number of functions
are safe to call from a signal handler. In particular, a signal handler is not
allowed to allocate memory or perform I/O using things like println!().
Instead, about all a signal handler should do is set a flag, which is checked
outside the signal handler to see if the signal occurred.
Setting the flag and checking whether it has been set must happen atomically,
meaning without being interrupted. You will be unsurprised to learn that Rust
has great support for atomic variables. For the flag, you want to use an
AtomicBool. It holds the value true or false, but you assign a value using the
.store() method and load the value using the .load() method. The .swap()
method atomically loads the old value and assigns a new one.
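To make the three methods concrete, here is a minimal sketch that exercises them outside of any signal handler. The flag name FLAG and the helpers raise() and take() are hypothetical names chosen for this illustration; they are not part of the assignment.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical flag, used for this illustration only.
static FLAG: AtomicBool = AtomicBool::new(false);

/// Stores true in the flag, which is all a signal handler would do.
fn raise() {
    FLAG.store(true, Ordering::Relaxed);
}

/// Atomically reads the old value and stores false in its place.
fn take() -> bool {
    FLAG.swap(false, Ordering::Relaxed)
}

fn main() {
    raise();
    // load() only observes the flag; it stays set.
    assert!(FLAG.load(Ordering::Relaxed));
    // swap() returns the old value (true) and clears the flag in one step.
    assert!(take());
    // A second call sees the cleared flag.
    assert!(!take());
}
```

Because .swap() reads and resets in a single step, a signal that arrives between the two calls to take() cannot be lost: it would simply be reported by the second call.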
This example shows how to register a signal handler for SIGINT.
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};

static INTERRUPTED: AtomicBool = AtomicBool::new(false);

extern "C" fn handler(_sig: libc::c_int) {
    INTERRUPTED.store(true, Ordering::Relaxed);
}

fn main() -> io::Result<()> {
    unsafe {
        let action = libc::sigaction {
            sa_sigaction: handler as libc::sighandler_t,
            ..std::mem::zeroed()
        };
        if libc::sigaction(libc::SIGINT, &action, std::ptr::null_mut()) < 0 {
            return Err(io::Error::last_os_error());
        }
    }
    std::thread::sleep(std::time::Duration::from_secs(10));
    if INTERRUPTED.load(Ordering::Relaxed) {
        println!("Received a SIGINT");
    }
    Ok(())
}
Some key points to notice: The handler function is declared as extern "C"
because the handler function is expected to be a C function. INTERRUPTED is
our first example of a global variable in Rust. The handler function is
incredibly simple: All it does is set a flag.
If you run the code in that example and press Ctrl-C during the sleep, you’ll
notice something disconcerting: The process sleeps for the full 10 seconds and
then afterward it prints out Received a SIGINT. The whole point of the
interrupt signal is to interrupt a process, so what is going on here?
Some system calls are allowed to be interrupted by signals. When this happens,
the C wrapper around the underlying system call sets the thread-local variable
errno to EINTR and returns -1. If you examine the source code for the sleep()
function, you’ll see it calls the underlying libc::nanosleep() function in a
loop if it is awakened by a signal. (Note the if on line 242 checking if the
return value of libc::nanosleep() is -1 and then asserting that the value of
os::errno() is libc::EINTR.)
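The following is a simplified model of that loop, not the standard library's actual code. The closure nap is a hypothetical stand-in for nanosleep: it returns Ok(()) when the requested time fully elapsed, or Err(remaining) when a signal cut the sleep short. The names sleep_fully and demo are likewise made up for this sketch.

```rust
use std::time::Duration;

/// Simplified model of a sleep that resumes after every interruption.
/// Returns how many times `nap` had to be called.
fn sleep_fully<F>(total: Duration, mut nap: F) -> u32
where
    F: FnMut(Duration) -> Result<(), Duration>,
{
    let mut remaining = total;
    let mut calls = 0;
    loop {
        calls += 1;
        match nap(remaining) {
            Ok(()) => return calls,
            // Interrupted: go back to sleep for whatever time is left.
            Err(left) => remaining = left,
        }
    }
}

/// Simulates a signal arriving 3 seconds into a 10 second sleep.
fn demo() -> (u32, Vec<Duration>) {
    let mut requested = Vec::new();
    let calls = sleep_fully(Duration::from_secs(10), |d| {
        requested.push(d);
        if requested.len() == 1 {
            // First call: pretend a signal arrived after 3 seconds.
            Err(d - Duration::from_secs(3))
        } else {
            Ok(())
        }
    });
    (calls, requested)
}

fn main() {
    let (calls, requested) = demo();
    println!("{calls} calls, requested {requested:?}");
}
```

In this model the caller of sleep_fully never learns that an interruption happened, which matches what you observe with Ctrl-C: the handler runs, but the sleep carries on for the remaining time.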
If we replace the call to sleep() with a call to libc::nanosleep(), we can see
the difference.
In this example, the call to nanosleep() is explicit and, since it is not
called in a loop, interrupting via SIGINT works as expected.
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};

static INTERRUPTED: AtomicBool = AtomicBool::new(false);

extern "C" fn handler(_sig: libc::c_int) {
    INTERRUPTED.store(true, Ordering::Relaxed);
}

fn main() -> io::Result<()> {
    unsafe {
        let action = libc::sigaction {
            sa_sigaction: handler as libc::sighandler_t,
            ..std::mem::zeroed()
        };
        if libc::sigaction(libc::SIGINT, &action, std::ptr::null_mut()) < 0 {
            return Err(io::Error::last_os_error());
        }
    }
    let ts = libc::timespec {
        tv_sec: 10,
        tv_nsec: 0,
    };
    if unsafe { libc::nanosleep(&ts, std::ptr::null_mut()) } < 0 {
        let err = io::Error::last_os_error();
        if err.kind() != io::ErrorKind::Interrupted {
            return Err(err);
        }
    }
    if INTERRUPTED.load(Ordering::Relaxed) {
        println!("Received a SIGINT");
    }
    Ok(())
}
Note how an io::Error is constructed from errno by calling
io::Error::last_os_error() and then returned if its kind isn’t
io::ErrorKind::Interrupted.
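The mapping from errno to an error kind can be checked without sleeping or sending a signal. This sketch assumes EINTR has the value 4, which is its value on Linux and macOS; real code would use libc::EINTR instead of a hard-coded number. The helper name ignore_interrupt is hypothetical.

```rust
use std::io;

// Assumption: EINTR is 4, its value on Linux and macOS. Real code would
// use libc::EINTR instead of hard-coding the number.
const EINTR: i32 = 4;

/// Treats "interrupted by a signal" as success and passes other errors on.
fn ignore_interrupt(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(err) if err.kind() == io::ErrorKind::Interrupted => Ok(()),
        other => other,
    }
}

fn main() {
    // Build the error from the errno value that a failed call would leave behind.
    let interrupted = io::Error::from_raw_os_error(EINTR);
    assert_eq!(interrupted.kind(), io::ErrorKind::Interrupted);
    assert!(ignore_interrupt(Err(interrupted)).is_ok());

    // Any other kind of error is still reported.
    let other = io::Error::from(io::ErrorKind::NotFound);
    assert!(ignore_interrupt(Err(other)).is_err());
}
```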
Many Rust standard library functions, such as the read_line() method, do the
same thing: if a signal interrupts a system call, the standard library
function makes the call again. Usually, that’s the behavior you want. For a
shell, it isn’t. And when it isn’t, you have to implement the functionality
you want yourself, which sometimes means calling the underlying libc
functions.
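The retry behavior can be pictured with a small helper. This is only a sketch of the pattern, not the standard library's code; retry_on_interrupt and demo are hypothetical names, and the interrupted operation is simulated with a closure rather than a real system call.

```rust
use std::io;

/// Calls `op` again whenever it fails with ErrorKind::Interrupted.
/// Hypothetical helper for illustration; not a standard library API.
fn retry_on_interrupt<T>(mut op: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    loop {
        match op() {
            // Interrupted by a signal: silently try again.
            Err(err) if err.kind() == io::ErrorKind::Interrupted => continue,
            // Success or any other error ends the loop.
            result => return result,
        }
    }
}

/// Simulates an operation that is interrupted twice before it succeeds.
fn demo() -> (io::Result<u32>, u32) {
    let mut attempts = 0;
    let result = retry_on_interrupt(|| {
        attempts += 1;
        if attempts <= 2 {
            Err(io::Error::from(io::ErrorKind::Interrupted))
        } else {
            Ok(attempts)
        }
    });
    (result, attempts)
}

fn main() {
    let (result, attempts) = demo();
    println!("result {result:?} after {attempts} attempts");
}
```

The caller only ever sees the final result, so code written this way cannot react to the signal at the moment it arrives. That is exactly what a shell needs to do, which is why it has to skip the retry.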
Your task
Osh should work slightly differently than Bash. In particular, if Osh receives
either a SIGINT or SIGQUIT signal, it should ignore the current line of input
that is being typed, print a newline, and then prompt for the next line of
input.
Run cargo add libc to add the libc crate. Then, create a new module named
signals. Inside it, write a function install_signal_handlers() -> io::Result<()>
that sets the signal handler for both signals to call a function named
handler. Note that you can register the same handler for multiple signals.
Make sure any errors from libc::sigaction() are returned.
Write the handler function
extern "C" fn handler(_sig: libc::c_int) {
    todo!()
}
that stores true in an AtomicBool following the examples above.
Write a was_interrupted() -> bool function that atomically loads the value of
the AtomicBool and stores false. Use the .swap() method for this purpose.
Return the loaded value. In this way, was_interrupted() returns true if a
SIGINT or a SIGQUIT was received since the last time was_interrupted() was
called.
Add a call to install_signal_handlers() to the beginning of main().
As mentioned above, the .read_line() method explicitly ignores interrupts from
receiving a signal. Since Osh needs a way to read a line of input from stdin
but also return immediately when a SIGINT or SIGQUIT is received, you cannot
use the standard library methods.
Fortunately, it’s easy to look at the Rust source code and modify it to suit
your needs. So I did. Here’s a function you can use which reads from stdin and
returns an io::Result<String>.
/// This is similar to std::io::stdin().read_line() except that
/// being interrupted by a signal returns an Err(err).
/// This implementation comes from modifying
/// https://doc.rust-lang.org/src/std/io/mod.rs.html#1940
fn read_line() -> std::io::Result<String> {
    use std::io::BufRead;
    let mut stdin = std::io::stdin().lock();
    let mut buf = Vec::new();
    loop {
        let (done, used) = {
            let available = stdin.fill_buf()?;
            match available.iter().position(|&b| b == b'\n') {
                Some(i) => {
                    buf.extend_from_slice(&available[..=i]);
                    (true, i + 1)
                }
                None => {
                    buf.extend_from_slice(available);
                    (false, available.len())
                }
            }
        };
        stdin.consume(used);
        if done || used == 0 {
            let s = String::from_utf8(buf)
                .map_err(|err| std::io::Error::new(std::io::ErrorKind::InvalidData, err))?;
            return Ok(s);
        }
    }
}
The key point is that if stdin.fill_buf() returns an error because it was
interrupted by a signal, then read_line() will return that error. You can
check if the returned error was due to being interrupted by a signal by using
err.kind() == std::io::ErrorKind::Interrupted.
You will need to refactor your run function. Move your code that prints a
prompt and reads from stdin into a loop that calls signals::was_interrupted()
and, if a signal has been received, prints a newline. Then print the prompt
and call that read_line() function.
If read_line() returns an error and the error’s kind is Interrupted, then
continue the loop. If it returns some other error, return it. Otherwise, break
out of the loop and process the line as before.
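The control flow described above might be sketched as follows. This is one possible shape under stated assumptions, not the required solution: the function name prompt_for_line is hypothetical, and the reader and the interrupt check are passed in as closures so that the loop can be exercised in demo() without a terminal or real signals. In Osh you would call read_line() and signals::was_interrupted() directly.

```rust
use std::cell::Cell;
use std::io::{self, Write};

/// Prompts until a complete line is read.
fn prompt_for_line<R, I, W>(
    mut read_line: R,
    mut was_interrupted: I,
    out: &mut W,
) -> io::Result<String>
where
    R: FnMut() -> io::Result<String>,
    I: FnMut() -> bool,
    W: Write,
{
    loop {
        // A signal arrived since the last check: start on a fresh line.
        if was_interrupted() {
            writeln!(out)?;
        }
        write!(out, "$ ")?;
        out.flush()?;
        match read_line() {
            Ok(line) => return Ok(line),
            // Interrupted by a signal: drop the partial line and prompt again.
            Err(err) if err.kind() == io::ErrorKind::Interrupted => continue,
            Err(err) => return Err(err),
        }
    }
}

/// Simulates one interrupted read followed by a successful one and returns
/// the line that was read together with everything that was printed.
fn demo() -> io::Result<(String, String)> {
    let signaled = Cell::new(false);
    // Results are popped from the end, so the interrupted read happens first.
    let mut reads = vec![
        Ok(String::from("echo hi\n")),
        Err(io::Error::from(io::ErrorKind::Interrupted)),
    ];
    let mut out: Vec<u8> = Vec::new();
    let line = prompt_for_line(
        || {
            let next = reads.pop().expect("no more simulated reads");
            if next.is_err() {
                // Stand-in for the signal handler setting its flag.
                signaled.set(true);
            }
            next
        },
        || signaled.replace(false),
        &mut out,
    )?;
    Ok((line, String::from_utf8_lossy(&out).into_owned()))
}

fn main() -> io::Result<()> {
    let (line, shown) = demo()?;
    println!("read {line:?} after printing {shown:?}");
    Ok(())
}
```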
Here’s some sample output.
$ cargo run --quiet
Welcome to the Oberlin Shell!
$ this line is interrupted^C
$ this one is quit^\
$ echo this one is not
this one is not
$ sleep 10
^C
$
If your shell can now handle redirections and signals, you’re done! Congratulations, this was a lot of work.