Part 2. Head - CSCI 241 Systems Programming Labs

Part 2. Head (10 points)

In the next two parts of the lab, you’re going to implement the head command line utility in Rust. This will be divided into two tasks: command line argument parsing (this part) and the implementation of head itself (Part 3).

Before you begin, read the man page for head. While you’re reading it, pay attention to the command line arguments and to when head reads its input from files versus from stdin.

There are many versions of head. You’ll be implementing the FreeBSD version, which is the version described in the man page above. The GNU project has another version, which is used in most Linux distributions. In general, the GNU versions of command-line utilities tend to have more options. If you run $ man head on the lab machines, you’ll get the manual for the GNU version of head.

We’re going to use the Command Line Argument Parser library, clap, to handle command-line argument parsing for us. As we’ll see, Rust makes supporting command-line arguments, including a nicely formatted help message, very easy.

Example

Let’s look at an example.

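// The `extern crate clap` line is not needed in your own project.
// It is only here to make this example runnable on the web.
extern crate clap;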
use std::path::PathBuf;

use clap::Parser;

#[derive(Parser, Debug)]
#[command(author, version, about = "Example command line parsing", long_about = None)]
struct Config {
    /// Use colored output
    #[arg(short, long)]
    color: bool,

    /// Use at most COUNT values
    #[arg(short = 'C', long)]
    count: Option<i32>,

    /// Specify optional log file
    #[arg(short = 'f', long, conflicts_with = "count")]
    log_file: Option<PathBuf>,

    /// Input files
    #[arg(default_value = "-")]
    files: Vec<PathBuf>,
}

fn main() {
    let config = Config::parse();
    println!("{config:#?}");
}

There’s a lot going on here! At the top of the code, there are two use statements which inform the compiler about PathBuf from the standard library and Parser from clap.

Next we have a struct named Config. This structure defines the supported command line arguments of this example program. The main function calls Config::parse(), which parses all of the command line arguments and, crucially, returns an instance of Config whose members are filled in from those arguments. We can access the members as config.color, config.count, config.log_file, and config.files.
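As a rough sketch of what using those members might look like, the example below builds a Config by hand (instead of calling Config::parse(), so that it does not depend on clap) and reads each field. The describe helper, the sample values, and the reading of a missing count as "unlimited" are hypothetical and only for illustration.

```rust
use std::path::PathBuf;

// Same fields as the example's Config, minus the clap attributes.
// Illustration only: in the real program, Config::parse() fills these in.
#[derive(Debug)]
struct Config {
    color: bool,
    count: Option<i32>,
    log_file: Option<PathBuf>,
    files: Vec<PathBuf>,
}

/// Hypothetical helper that summarizes a Config in one line.
fn describe(config: &Config) -> String {
    // A bool flag can be used directly.
    let color = if config.color { "on" } else { "off" };
    // An Option is None when the option was not given.
    let count = match config.count {
        Some(n) => format!("at most {n}"),
        None => String::from("unlimited"),
    };
    let log = match &config.log_file {
        Some(path) => path.display().to_string(),
        None => String::from("none"),
    };
    // A Vec holds zero or more values.
    let n_files = config.files.len();
    format!("color={color}, count={count}, log={log}, files={n_files}")
}

fn main() {
    let config = Config {
        color: true,
        count: Some(10),
        log_file: None,
        files: vec![PathBuf::from("input1.txt"), PathBuf::from("input2.txt")],
    };
    println!("{}", describe(&config));
}
```

Matching on the Option fields forces the code to decide what to do when an option was not passed, which is the main reason to prefer Option over a sentinel value such as -1.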

The println!("{config:#?}") line will pretty-print the debug representation of config. The format specifier {config:?} (with the colon followed by a question mark) means to print the debug representation and the # in {config:#?} means to pretty-print it with new lines.
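To see the difference between the two format specifiers in isolation, here is a small sketch using a made-up struct (Demo and its fields are hypothetical and have nothing to do with clap).

```rust
// Hypothetical struct used only to show the two debug formats.
#[derive(Debug)]
#[allow(dead_code)] // the fields are only read through Debug here
struct Demo {
    color: bool,
    count: i32,
}

fn main() {
    let demo = Demo { color: true, count: 3 };
    // Compact debug representation, all on one line.
    println!("{demo:?}");
    // Pretty-printed debug representation, one field per line.
    println!("{demo:#?}");
}
```

The first println! writes the whole struct on a single line; the second puts each field on its own indented line, which is easier to read for larger structs like Config.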

All good command line utilities have a --help option or similar. Before we look at this code in detail, here’s the output of running this program with the --help option which just prints out the help message and then exits the program.

$ cargo run -- --help
Example command line parsing

Usage: ex [OPTIONS] [FILES]...

Arguments:
  [FILES]...  Input files [default: -]

Options:
  -c, --color                Use colored output
  -C, --count <COUNT>        Use at most COUNT values
  -f, --log-file <LOG_FILE>  Specify optional log file
  -h, --help                 Print help
  -V, --version              Print version

This code defines three options, each of which has a short and a long form. Two of the options (--count and --log-file) take arguments, and those same two conflict with one another, meaning at most one of them can be specified at a time. The program also takes a list of file paths, each represented by a PathBuf in Rust.

Because the final line of code prints out the Config instance that was returned from Config::parse(), we can run the program a few times with different options to see the results. Note that in what follows I’m passing the --quiet option to cargo run because I don’t want cargo to print out any output like “Created binary …”. As usual, the arguments before the -- go to cargo run and the arguments after the -- go to the program itself.

$ cargo run --quiet --
Config {
    color: false,
    count: None,
    log_file: None,
    files: [
        "-",
    ],
}

$ cargo run --quiet -- --color -f log.txt input1.txt input2.txt
Config {
    color: true,
    count: None,
    log_file: Some(
        "log.txt",
    ),
    files: [
        "input1.txt",
        "input2.txt",
    ],
}

$ cargo run --quiet -- --version
ex 0.1.0

$ cargo run --quiet -- -f log.txt -C 10
error: the argument '--log-file <LOG_FILE>' cannot be used with '--count <COUNT>'

Usage: ex --log-file <LOG_FILE> [FILES]...

For more information, try '--help'.

The definition of Config starts by deriving the clap::Parser and Debug traits. The Parser trait is what provides the parse() function which reads the command-line arguments and returns an instance of Config.

The next line

#[command(author, version, about = "Example command line parsing", long_about = None)]

is new. It attaches metadata to the Config struct that the clap library uses to parse the arguments and to produce usage and error messages. The author and version entries tell clap to take those values from Cargo.toml, and about = "..." lets us specify a short description to appear in the --help message.

Each of the arguments consists of three parts.

    /// Use colored output
    #[arg(short, long)]
    color: bool,

The comment with three slashes is a doc comment. Normally, such a comment is used to write documentation. (Running cargo doc produces HTML documentation of your code. The doc comments are where the documentation comes from.) Clap is repurposing that doc comment to construct a --help message.

The next line, #[arg(short, long)], attaches more metadata for Clap. It says this argument should have a short option -c and a long option --color; both names are derived from the name of the variable color itself. We can also be explicit about the option names: count uses short = 'C' because -c is already taken by color, and log_file uses short = 'f'. There are a variety of other things we can do, including making options conflict. Because log_file is marked conflicts_with = "count", passing both -f and -C produces an error from Clap.

The third line is the variable and its type, of course. Because the type of color is bool, this acts as an on/off flag. For other types, using an Option<...> indicates that the argument is optional. If you want to allow 0 or more, you use a Vec rather than an Option as I did for files.

That’s a lot of information to take in! Fortunately, I’ve got a simple recipe to follow for parsing command line arguments:

Tip

Start by defining a new struct that derives Debug and Parser.

#[derive(Parser, Debug)]
#[command(author, version, about = "Short description", long_about = None)]
struct Config {
}

Now, for each option, modify one of the examples below and remember, you can change the short or long options by using short = 'c' or long = "blah", for example.

If you want a boolean flag, use a bool

  /// Boolean flag, defaults to false
  #[arg(short, long)]
  flag: bool,

If you want an option that takes an argument, use an Option

  /// Option that takes a String argument
  #[arg(short, long)]
  name: Option<String>,

If you want a path to a file, use an Option<PathBuf>

  /// Option that takes a file path
  #[arg(short, long)]
  path: Option<PathBuf>,

If you want a list of input files, don’t use short or long but do use Vec<PathBuf>; if you want to specify a default value, use default_value

  /// Input files but if none are specified, use -
  #[arg(default_value = "-")]
  files: Vec<PathBuf>,

If you want to specify that two options conflict, mark one of them (it doesn’t matter which) as conflicts_with the other

  /// An option that takes a path
  #[arg(short, long)]
  log: Option<PathBuf>,

  /// A flag that conflicts with option `--log`
  #[arg(short, long, conflicts_with = "log")]
  output: bool,

Finally, add a main function that prints out the debug representation.

// You don't need the `extern crate clap` line. That's just to make this runnable on the web.
extern crate clap;
use std::path::PathBuf;

use clap::Parser;

#[derive(Parser, Debug)]
#[command(author, version, about = "Short description", long_about = None)]
struct Config {

    /// Boolean flag, defaults to false
    #[arg(short, long)]
    flag: bool,

    /// Option that takes a String argument
    #[arg(short, long)]
    name: Option<String>,

    /// Option that takes a file path
    #[arg(short, long)]
    path: Option<PathBuf>,

    /// Input files but if none are specified, use -
    #[arg(default_value = "-")]
    files: Vec<PathBuf>,

    /// An option that takes a path
    #[arg(short, long)]
    log: Option<PathBuf>,

    /// A flag that conflicts with option `--log`
    #[arg(short, long, conflicts_with = "log")]
    output: bool,
}

fn main() {
    let config = Config::parse();
    println!("{config:#?}");
}

The Clap tutorial has a lot more information.

Your task

Before you start, you need to create a new project inside your assignment repository and add the clap library.

$ cargo new head
$ cd head
$ cargo add clap --features derive

Your task is to implement parsing the head command line arguments and printing out a --help message. Your head will support the following options.

-c/--bytes;
-n/--lines;
-h/--help; and
-V/--version.

The --help and --version options are handled by clap automatically. Use clap’s conflicts_with mechanism for marking --bytes and --lines as conflicting. See the tip above.

Additionally, head takes 0 or more files as input. Configure clap such that if no files are passed, then it defaults to -. See the tip above for how to set default values.

Note

Many command line utilities use - in place of a file path to denote reading from stdin instead of the file or writing to stdout instead of writing to the file. Our head will support that convention.
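As a minimal sketch of that convention, the code below decides whether a path argument means stdin or a real file. The Input enum and the classify function are hypothetical names introduced for illustration; they are not part of clap or the standard library, and this is only one possible way to structure the check.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical type describing where input should come from.
#[derive(Debug, PartialEq)]
enum Input {
    Stdin,
    File(PathBuf),
}

/// Hypothetical helper: treat a lone "-" as stdin and anything else as a file path.
fn classify(path: &Path) -> Input {
    if path == Path::new("-") {
        Input::Stdin
    } else {
        Input::File(path.to_path_buf())
    }
}

fn main() {
    for arg in ["-", "notes.txt", "./-"] {
        println!("{arg} -> {:?}", classify(Path::new(arg)));
    }
}
```

Note that ./- is classified as a file: writing the path with a leading ./ is the usual way to refer to a file that is literally named -.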

Once you have this working, you should be able to run head --help and see this.

$ cargo run --quiet -- --help
Display the first few lines of a file

Usage: head [OPTIONS] [FILES]...

Arguments:
  [FILES]...  Input files or - for stdin [default: -]

Options:
  -n, --lines <LINES>  Print LINES lines of each of the specified files
  -c, --bytes <BYTES>  Print BYTES bytes of each of the specified files
  -h, --help           Print help
  -V, --version        Print version

The version number printed by --version can be set by editing the version field in Cargo.toml. Go ahead and set the version number to 1.2.3 so that you get this output.

$ cargo run --quiet -- --version
head 1.2.3
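
For reference, the version change described above is a one-line edit in the [package] section of Cargo.toml. The fragment below is only a sketch: the other fields that cargo new generated and the [dependencies] entry that cargo add created for clap are omitted here and should be left as they are.

```toml
[package]
name = "head"
version = "1.2.3"   # reported by --version
```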