Part 2. Parallelizing jfrac (20 points)
In this part, you’re going to modify a version of the `jfrac` code from lab 2 that produces animated gifs of the Julia set fractal. Specifically, you’re going to turn it into a multi-threaded program where each thread will generate some of the frames of the animated gif.

The provided code in the `jfrac` directory can be run with several new options. Here’s the output from `--help`.
```
Usage: jfrac [OPTIONS] <OUTPUT>

Arguments:
  <OUTPUT>  Output file path

Options:
  -s, --size <SIZE>                  Sets the width and height of the image [default: 800]
  -c, --constant <CONSTANT>          Sets the constant `c` in the equation `f(z) = z^2 + c` [default: "-0.8 + 0.156i"]
  -i, --initial-zoom <INITIAL_ZOOM>  Sets the initial zoom [default: 1.0]
  -f, --final-zoom <FINAL_ZOOM>      Sets the final zoom [default: 20.0]
  -n, --frames <FRAMES>              Sets the number of frames in the gif [default: 100]
  -t, --threads <THREADS>            Sets the number of threads to use [default: 1]
  -h, --help                         Print help
```
If you run the code with the default options, it will take a fairly long time to run, nearly 5 minutes for me. We can time it using the `time` shell command.
```
$ time cargo run -- output.gif
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/jfrac output.gif`
cargo run -- output.gif  293.09s user 0.33s system 99% cpu 4:55.70 total
```
We can speed it up substantially by compiling our code using compiler optimizations. To do that, pass the `--release` argument to `cargo build` and `cargo run`. After `cargo build --release`, running the code still takes about 25 seconds for me.
```
$ time cargo run --quiet --release -- output.gif
cargo run --quiet --release -- output.gif  25.51s user 0.08s system 99% cpu 25.827 total
```
You should run this yourself and look at the produced `output.gif` to get a sense of what the program is doing.
We can do better by making use of threads.
While testing your code, you may wish to change the number of frames of animation using the `--frames` command-line argument.
Your task
Your task is to finish the program in `main.rs` by handling the case where the number of threads the user selects via the `--threads` option is greater than 1. (There is currently a `todo!()` marking where your code needs to go.)
Let’s look at the main loop in the program.
```rust
for frame_num in 0..config.frames {
    let zoom = config.initial_zoom
        * zoom_factor.powf(frame_num as f32 / (config.frames - 1) as f32);
    let frame = make_frame(&config, zoom);
    encoder.encode_frame(frame)?;
}
```
The two key pieces here are creating a frame of animation by calling `make_frame(&config, zoom)`, which creates the fractal at the specified level of zoom, and encoding that frame using the gif encoder.
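As an aside, the zoom schedule in this loop is a geometric interpolation. Assuming `zoom_factor` is defined as `final_zoom / initial_zoom` (an assumption; check `main.rs` for its actual definition), the computation can be factored into a small helper:

```rust
// Hypothetical helper reproducing the loop's zoom computation, assuming
// zoom_factor = final_zoom / initial_zoom. Frame 0 gets initial_zoom, the
// last frame gets final_zoom, and the ratio between consecutive frames is
// constant.
fn zoom_for_frame(initial_zoom: f32, final_zoom: f32, frames: u32, frame_num: u32) -> f32 {
    let zoom_factor = final_zoom / initial_zoom;
    initial_zoom * zoom_factor.powf(frame_num as f32 / (frames - 1) as f32)
}
```

With the defaults (zoom 1.0 to 20.0 over 100 frames), frame 0 has zoom 1.0 and frame 99 has zoom 20.0.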
You will need to write similar code for the case where the number of threads is greater than 1.
When parallelizing code in general, you’ll need to identify which portions of the task can be handled concurrently and which must be handled serially. In this case, the split is easy: each frame of animation can be created independently of the other frames, but encoding the gif must happen serially since we need the frames to appear in order.
This suggests the following steps.
- In a loop, create `config.threads` threads, each of which is responsible for creating a `Vec<Frame>` containing (approximately) `config.frames / config.threads` frames of animation.
- In a second loop, wait for the threads to finish by using `join()` (see below) and encode the frames returned by each thread in order.
Both steps will need to occur inside the body of the closure that we pass to `thread::scope()`.
```rust
thread::scope(|s| -> Result<()> {
    // Step 1.
    // Step 2.
    Ok(())
})?;
```
Notice that this code contains an explicit return type on the closure. The Rust compiler can often deduce return types, but not always. Since step 2 involves calling `encoder.encode_frame()`, which can return a `Result`, we can propagate errors by using `?` if we mark the closure as returning a `Result`.
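Here is a self-contained sketch of this pattern. It uses a boxed-error `Result` alias as a stand-in for whatever alias `main.rs` actually defines (an assumption), and a fallible `parse()` call in place of `encode_frame()`:

```rust
use std::thread;

// Stand-in for the Result alias used in main.rs (assumed for illustration).
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

fn run() -> Result<()> {
    thread::scope(|s| -> Result<()> {
        s.spawn(|| { /* per-thread work would go here */ });
        // The explicit `-> Result<()>` on the closure lets us use `?` here.
        let _n: u32 = "42".parse()?;
        Ok(())
    })?;
    Ok(())
}
```

Without the explicit return type, the compiler would infer the closure's type from `Ok(())` alone and the `?` operator would have nothing to convert the error into.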
For step 1, you’ll need to figure out what range of frames to create for each thread. One approach is to have a loop like this.
```rust
for thread_num in 0..config.threads {
    let start_frame = thread_num * config.frames / config.threads;
    let end_frame = (thread_num + 1) * config.frames / config.threads;
    s.spawn(move || {
        todo!("Create frames in the range start_frame..end_frame");
    });
}
```
This ensures all frames will be generated and each thread will have approximately the same number of frames to generate. In fact, the number of frames per thread will differ by at most 1 using this approach. For example, generating 100 frames using 8 threads with this approach gives the following frame ranges per thread.
```
Thread 0: 0..12
Thread 1: 12..25
Thread 2: 25..37
Thread 3: 37..50
Thread 4: 50..62
Thread 5: 62..75
Thread 6: 75..87
Thread 7: 87..100
```
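A quick way to convince yourself the arithmetic works is to factor the range computation into a function and check it (this helper is for illustration only, not part of the provided code):

```rust
// Returns the half-open range of frames thread `thread_num` should generate.
// Because thread t's end equals thread t+1's start, the ranges tile
// 0..frames exactly with no gaps or overlaps.
fn frame_range(thread_num: usize, threads: usize, frames: usize) -> (usize, usize) {
    let start_frame = thread_num * frames / threads;
    let end_frame = (thread_num + 1) * frames / threads;
    (start_frame, end_frame)
}
```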
Note the use of the `move` keyword when spawning threads. This causes values from outside the closure that are referenced inside the closure to be moved (or copied) into the closure. E.g., `start_frame` and `end_frame` will be copied into the closure. Without the `move`, the compiler will give an error about lifetimes. (See the note at the end of this page when you get an error about using the moved value of `config`.)
For step 2, you’ll need to wait for each thread to finish and collect the frames contained in the `Vec<Frame>` you created in step 1.
To support waiting for threads to finish and getting their return values, the `s.spawn()` method returns a thread handle. You should insert that handle into a vector of handles for use in step 2.
```rust
thread::scope(|s| -> Result<()> {
    let mut handles = Vec::with_capacity(config.threads);
    // Step 1.
    for thread_num in 0..config.threads {
        // ...
        let handle = s.spawn(move || { /* ... */ });
        handles.push(handle);
    }
    // Step 2.
    for handle in handles {
        // ...
    }
    Ok(())
})?;
```
In the loop for step 2, you can use `let frames = handle.join().unwrap()` to wait for the thread to finish and get back the vector of frames. You should then encode each of the frames contained in the returned vector.
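The spawn-then-join-in-order pattern can be exercised independently of the fractal code. This sketch replaces frame generation with a trivial payload (the number of frames each thread would handle) just to show the shape of the two loops; in `jfrac`, each thread would instead return a `Vec<Frame>` and step 2 would encode them:

```rust
use std::thread;

// Spawns `threads` scoped threads, then joins them in the order they were
// spawned, so results come back in frame order. Here each thread just
// returns the size of its frame range; the sum must equal `frames`.
fn total_frames(frames: usize, threads: usize) -> usize {
    thread::scope(|s| {
        let mut handles = Vec::with_capacity(threads);
        // Step 1: spawn one thread per frame range.
        for thread_num in 0..threads {
            let start_frame = thread_num * frames / threads;
            let end_frame = (thread_num + 1) * frames / threads;
            handles.push(s.spawn(move || end_frame - start_frame));
        }
        // Step 2: join in spawn order and combine the results serially.
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```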
You can test your code by running the program and supplying a value for the threads option: `cargo run --release -- --threads {number of threads} output.gif`. Note that you should use the optimized release build rather than the unoptimized debug build.
There’s one final little detail you’ll need to handle: the `move` keyword you needed to use when creating the threads tried to move the `config` value into the thread. You actually only want a shared reference to `config` since each thread needs to use it.
You can force Rust to use a reference by shadowing the `config` variable with a reference to `config`. You’ll want something like this.
```rust
thread::scope(|s| -> Result<()> {
    let mut handles = Vec::with_capacity(config.threads);
    let config = &config;
    // Step 1.
    // Step 2.
    Ok(())
})?;
```
With the `let config = &config`, the `move || { }` closure will make a copy of the reference for use in each thread rather than trying to move the underlying `Config` structure itself into the thread.