Part 4. Top file types (20 points)
In this part, you’re going to write a complex pipeline to answer a simple question: What are the top 8 file types in the Linux 6.4 source code? We’re going to use a file’s extension to determine what type of file it is, and we’re going to ignore any files that don’t have extensions.
Your task
Write a script topfiletypes to answer this question. Your script should consist of a single complex pipeline. Put the output of your script in your README.
The structure of the pipeline will look like this:
while IFS= read -r file; do ...; done < <(find linux-6.4 -name '*.*') | ... | ... | ... | ...
Notice how the output from the while loop is passed as standard input to the next command in the pipeline.
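To see that hand-off on something smaller than the kernel tree, here is a minimal sketch. It assumes Bash (process substitution is not available in plain sh). The functions list_paths and tag_and_sort and the three paths are made up for illustration; list_paths stands in for find.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for find: prints three made-up paths, one per line.
list_paths() {
    printf '%s\n' 'src/a.c' 'src/b.h' 'docs/c.c'
}

# The loop reads its input from the process substitution <(list_paths).
# Everything the loop echoes becomes standard input for sort.
tag_and_sort() {
    while IFS= read -r file; do
        echo "seen: $file"
    done < <(list_paths) | sort
}

tag_and_sort
```

The lines come out in sorted order rather than in the order list_paths printed them, which shows that sort received the loop's output and not the original list.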
Your output should look like this.
$ ./topfiletypes
32463 c
23745 h
3488 yaml
...
- The while loop given here is slightly different from the one you used before (no -d '' or -print0 options). Most filenames don’t actually have newlines in them, so we can use this technique to read one line at a time from the output of the find command rather than reading up to a 0 byte.
- Bash supports a lot of convenient variable expansions. You want to read the section of that page that describes ${parameter##word}. Here are some examples:
    $ x=foo.bar.baz
    $ echo "${x##*.}"
    baz
    $ y=linux-6.4/mm/mempool.c
    $ echo "${y##*.}"
    c
  You can use this inside your while loop to print the extension of the file.
- The sort command can be used to sort lines of input. For example, given the file example
    foo
    bar
    foo
    foo
    cat
    cat
    foo
  if we run $ sort example, we get
    bar
    cat
    cat
    foo
    foo
    foo
    foo
- The uniq command can be used to compress identical, consecutive lines of input into a single line of output. sort and uniq are frequently used together as sort | uniq to read lines of input, sort them, and then only output the unique lines. Running $ uniq example produces
    foo
    bar
    foo
    cat
    foo
  Running $ sort example | uniq gives
    bar
    cat
    foo
- The uniq command can take an argument to print out the count of each line. Here’s $ sort example | uniq -c.
    1 bar
    2 cat
    4 foo
- sort can sort in reverse as well as performing a numeric sort (i.e., sort numbers in the usual way so that 9 comes before 10).
- The head command can be used to print the first several lines of a file. Look up the options to print the first 8 rather than the default 10.
- In the loop, print out the extension of the path in the file variable. Use a combination of sort, uniq, and head in the pipeline to produce the final result. You’ll want to use sort multiple times.
- You can write this all as a single line. Don’t do that. Use \ at the end of each line to continue on the next line.
    while IFS= read -r file; do
        echo ...
    done < <(find linux-6.4 -name '*.*') \
      | ... \
      | ... \
      | ... \
      | ...
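The hint about numeric sorting is easiest to see on a tiny input. The sketch below is only an illustration and not part of the required script. The helper sort_demo and the three numbers are made up, and LC_ALL=C is set only so that the text comparison behaves the same on every machine.

```shell
#!/usr/bin/env bash

# Hypothetical helper: feeds the same three numbers to sort,
# passing along whatever options it is given.
sort_demo() {
    printf '%s\n' 9 10 2 | LC_ALL=C sort "$@"
}

sort_demo        # compared as text:    10, 2, 9
sort_demo -n     # compared as numbers: 2, 9, 10
sort_demo -rn    # numeric, reversed:   10, 9, 2
```

Without -n, the line 10 sorts before 2 because the character 1 comes before the character 2. This matters when the lines being sorted begin with counts.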