Part 2. Parallelizing jfrac (20 points)
In this part, you’re going to modify a version of the `jfrac` code from lab 2 that produces animated gifs of the Julia set fractal. Specifically, you’re going to turn it into a multi-threaded program where each thread will generate some of the frames of the animated gif.

The provided code in the `jfrac` directory can be run with several new options. Here’s the output from `--help`.
```
Usage: jfrac [OPTIONS] <OUTPUT>

Arguments:
  <OUTPUT>  Output file path

Options:
  -s, --size <SIZE>                  Sets the width and height of the image [default: 800]
  -c, --constant <CONSTANT>          Sets the constant `c` in the equation `f(z) = z^2 + c` [default: "-0.8 + 0.156i"]
  -i, --initial-zoom <INITIAL_ZOOM>  Sets the initial zoom [default: 1.0]
  -f, --final-zoom <FINAL_ZOOM>      Sets the final zoom [default: 20.0]
  -n, --frames <FRAMES>              Sets the number of frames in the gif [default: 100]
  -t, --threads <THREADS>            Sets the number of threads to use [default: 1]
  -h, --help                         Print help
```
If you run the code with the default options, it will take a fairly long time to run, nearly 5 minutes for me. We can time it using the `time` shell command.
```
$ time cargo run -- output.gif
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/jfrac output.gif`
cargo run -- output.gif  293.09s user 0.33s system 99% cpu 4:55.70 total
```
We can speed it up substantially by compiling our code using compiler optimizations. To do that, pass the `--release` argument to `cargo build` and `cargo run`. After `cargo build --release`, running the code still takes about 25 seconds for me.
```
$ time cargo run --quiet --release -- output.gif
cargo run --quiet --release -- output.gif  25.51s user 0.08s system 99% cpu 25.827 total
```
You should run this yourself and look at the produced `output.gif` to get a sense of what the program is doing.
We can do better by making use of threads.
While testing your code, you may wish to change the number of frames of animation using the `--frames` command-line argument.
Your task
Your task is to finish the program in `main.rs` by handling the case where the number of threads the user selects via the `--threads` option is greater than 1. (There is currently a `todo!()` marking where your code needs to go.)
Let’s look at the main loop in the program.
```rust
for frame_num in 0..config.frames {
    let zoom = config.initial_zoom
        * zoom_factor.powf(frame_num as f32 / (config.frames - 1) as f32);
    let frame = make_frame(&config, zoom);
    encoder.encode_frame(frame)?;
}
```
The two key pieces here are creating a frame of animation by calling `make_frame(&config, zoom)`, which creates the fractal at the specified level of zoom, and encoding that frame using the gif encoder.
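As an aside, the zoom schedule in this loop is a geometric interpolation. Assuming `zoom_factor` is defined as `final_zoom / initial_zoom` (an assumption; check `main.rs` for its actual definition), the computation can be factored into a small helper:

```rust
// Hypothetical helper reproducing the loop's zoom computation, assuming
// zoom_factor = final_zoom / initial_zoom. Frame 0 gets initial_zoom, the
// last frame gets final_zoom, and the ratio between consecutive frames is
// constant.
fn zoom_for_frame(initial_zoom: f32, final_zoom: f32, frames: u32, frame_num: u32) -> f32 {
    let zoom_factor = final_zoom / initial_zoom;
    initial_zoom * zoom_factor.powf(frame_num as f32 / (frames - 1) as f32)
}
```

With the defaults (zoom 1.0 to 20.0 over 100 frames), frame 0 has zoom 1.0 and frame 99 has zoom 20.0.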
You will need to write similar code for the case where the number of threads is greater than 1.
When parallelizing code in general, you’ll need to identify which portions of the task can be handled concurrently and which must be handled serially. In this case, the split is easy: each frame of animation can be created independently of the other frames, but encoding the gif must happen serially since we need the frames to appear in order.
This suggests the following steps.
- In a loop, create `config.threads` threads, each of which is responsible for creating a `Vec<Frame>` containing (approximately) `config.frames / config.threads` frames of animation.
- In a second loop, wait for the threads to finish by using `join()` (see below) and encode the frames returned by each thread in order.
Both steps will need to occur inside the body of the closure that we pass to `thread::scope()`.
```rust
thread::scope(|s| -> Result<()> {
    // Step 1.
    // Step 2.
    Ok(())
})?;
```
Notice that this code contains an explicit return type on the closure. The Rust compiler can often deduce return types, but not always. Since step 2 involves calling `encoder.encode_frame()`, which can return a `Result`, we can propagate errors by using `?` if we mark the closure as returning a `Result`.
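Here is a self-contained sketch of this pattern. It uses a boxed-error `Result` alias as a stand-in for whatever alias `main.rs` actually defines (an assumption), and a fallible `parse()` call in place of `encode_frame()`:

```rust
use std::thread;

// Stand-in for the Result alias used in main.rs (assumed for illustration).
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

fn run() -> Result<()> {
    thread::scope(|s| -> Result<()> {
        s.spawn(|| { /* per-thread work would go here */ });
        // The explicit `-> Result<()>` on the closure lets us use `?` here.
        let _n: u32 = "42".parse()?;
        Ok(())
    })?;
    Ok(())
}
```

Without the explicit return type, the compiler would infer the closure's type from `Ok(())` alone and the `?` operator would have nothing to convert the error into.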
For step 1, you’ll need to figure out what range of frames to create for each thread. One approach is to have a loop like this.
```rust
for thread_num in 0..config.threads {
    let start_frame = thread_num * config.frames / config.threads;
    let end_frame = (thread_num + 1) * config.frames / config.threads;
    s.spawn(move || {
        todo!("Create frames in the range start_frame..end_frame");
    });
}
```
This ensures all frames will be generated and each thread will have approximately the same number of frames to generate. In fact, the number of frames per thread will differ by at most 1 using this approach. For example, generating 100 frames using 8 threads with this approach gives the following frame ranges per thread.
```
Thread 0: 0..12
Thread 1: 12..25
Thread 2: 25..37
Thread 3: 37..50
Thread 4: 50..62
Thread 5: 62..75
Thread 6: 75..87
Thread 7: 87..100
```
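A quick way to convince yourself the arithmetic works is to factor the range computation into a function and check it (this helper is for illustration only, not part of the provided code):

```rust
// Returns the half-open range of frames thread `thread_num` should generate.
// Because thread t's end equals thread t+1's start, the ranges tile
// 0..frames exactly with no gaps or overlaps.
fn frame_range(thread_num: usize, threads: usize, frames: usize) -> (usize, usize) {
    let start_frame = thread_num * frames / threads;
    let end_frame = (thread_num + 1) * frames / threads;
    (start_frame, end_frame)
}
```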
Note the use of the `move` keyword when spawning threads. This causes values from outside the closure that are referenced inside the closure to be moved (or copied) into the closure. E.g., `start_frame` and `end_frame` will be copied into the closure. Without the `move`, the compiler will give an error about lifetimes. (See the note at the end of this page when you get an error about using the moved value of `config`.)
For step 2, you’ll need to wait for each thread to finish and collect the frames contained in the `Vec<Frame>` you created in step 1.
To support waiting for threads to finish and getting their return values, the `s.spawn()` method returns a thread handle. You should insert that handle into a vector of handles for use in step 2.
```rust
thread::scope(|s| -> Result<()> {
    let mut handles = Vec::with_capacity(config.threads);
    // Step 1.
    for thread_num in 0..config.threads {
        // ...
        let handle = s.spawn(move || { /* ... */ });
        handles.push(handle);
    }
    // Step 2.
    for handle in handles {
        // ...
    }
    Ok(())
})?;
```
In the loop for step 2, you can use `let frames = handle.join().unwrap()` to wait for the thread to finish and get back the vector of frames. You should then encode each of the frames contained in the returned vector.
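The spawn-then-join-in-order pattern can be exercised independently of the fractal code. This sketch replaces frame generation with a trivial payload (the number of frames each thread would handle) just to show the shape of the two loops; in `jfrac`, each thread would instead return a `Vec<Frame>` and step 2 would encode them:

```rust
use std::thread;

// Spawns `threads` scoped threads, then joins them in the order they were
// spawned, so results come back in frame order. Here each thread just
// returns the size of its frame range; the sum must equal `frames`.
fn total_frames(frames: usize, threads: usize) -> usize {
    thread::scope(|s| {
        let mut handles = Vec::with_capacity(threads);
        // Step 1: spawn one thread per frame range.
        for thread_num in 0..threads {
            let start_frame = thread_num * frames / threads;
            let end_frame = (thread_num + 1) * frames / threads;
            handles.push(s.spawn(move || end_frame - start_frame));
        }
        // Step 2: join in spawn order and combine the results serially.
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```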
You can test your code by running the program and supplying a value for the threads option: `cargo run --release -- --threads {number of threads} output.gif`. Note that you should use the optimized release build rather than the unoptimized debug build.
There’s one final little detail you’ll need to handle: the `move` keyword you needed to use when creating the threads tried to move the `config` value into the thread. You actually only want a shared reference to `config` since each thread needs to use it.
You can force Rust to use a reference by shadowing the `config` variable with a reference to `config`. You’ll want something like this.
```rust
thread::scope(|s| -> Result<()> {
    let mut handles = Vec::with_capacity(config.threads);
    let config = &config;
    // Step 1.
    // Step 2.
    Ok(())
})?;
```
With the `let config = &config`, the `move || { }` closure will make a copy of the reference for use in each thread rather than trying to move the underlying `Config` structure itself into the thread.