cat is a standard Unix shell command that “conCATenates” files.
It’s popular for midwit Linux enthusiasts to complain about “Useless Use of Cat” for shell commands that don’t actually need it. The classic example is:
cat myfile.txt | grep foo
The use of cat here is “useless”, because this command can be equivalently written using grep directly:
grep foo myfile.txt
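A quick way to check that equivalence is sketched below. The scratch directory and the file contents are made up for illustration; only the two commands themselves come from the example above.

```shell
# Create a throwaway file so the comparison is self-contained (contents are illustrative).
tmpdir=$(mktemp -d)
printf 'foo one\nbar two\nfoo three\n' > "$tmpdir/myfile.txt"

# Version with cat at the head of the pipeline.
with_cat=$(cat "$tmpdir/myfile.txt" | grep foo)

# Version that hands the filename to grep directly.
without_cat=$(grep foo "$tmpdir/myfile.txt")

# Both versions select the same lines.
if [ "$with_cat" = "$without_cat" ]; then
    echo "same output"
fi

rm -r "$tmpdir"
```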
This is a very midwit thing to complain about. It’s midwit, because they’re entirely correct: the command can be written without cat. But they’re wrong about the main point, which is that the first version is better.
The actual purpose of cat
Per the erroneous “Useless Use of Cat Award” site, linked above:
Briefly, here’s the collected wisdom on using cat:
The purpose of cat is to concatenate (or “catenate”) files. If it’s only one file, concatenating it with nothing at all is a waste of time, and costs you a process.
You’ll often see these people give an illustration like the following:
“If you think adding cat for no reason makes it better, then look, I can make it even better for you:
cat myfile.txt | cat | cat | cat | grep foo
See, I added even more useless cats!”
This is wrong, because the purpose of cat is NOT to concatenate files. The purpose of cat is to READ files.
Concatenating files is not a useful operation. You can just print the files in a loop. Here is cat implemented as a Bash function:
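cat() {
    # Print each named file to stdout, one after another.
    for file in "$@"; do
        # The || test keeps a final line that lacks a trailing newline.
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$file"
    done
}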
Maybe cat was originally meant for concatenating; but its evolved purpose, the purpose programmers have found useful through collective, long-term experimentation, is to move files from the filesystem into stdout. If it happens to do this with several files at a time, hey, great.
Once a file has been moved to stdout, there are two things you can do with it:
- you can leave it there, and let it get printed in the terminal; or
- you can pipe it to another program, as stdin.
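Those two options can be sketched as follows. The scratch file and its contents are assumptions made up for the example.

```shell
# Create a scratch file so the example is self-contained (contents are illustrative).
tmpdir=$(mktemp -d)
printf 'foo one\nbar two\nfoo three\n' > "$tmpdir/myfile.txt"

# Option 1: leave the file on stdout, so it is printed in the terminal.
cat "$tmpdir/myfile.txt"

# Option 2: pipe stdout into another program, which receives it as stdin.
# grep -c counts the matching lines instead of printing them.
matches=$(cat "$tmpdir/myfile.txt" | grep -c foo)
echo "lines containing foo: $matches"

rm -r "$tmpdir"
```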
“Unix philosophy” and Pipes
Rule 1 of the “Unix philosophy” is: “Programs should do one thing, and do it well.” But actual Unix shell commands tend to degenerate into a pattern of, “Do 10 to 15 different-but-vaguely-related things, and let users pick one using an arcane web of command line flags.”
Almost every shell program supports reading input from either a filesystem file OR stdin. This is a clear violation of the Unix philosophy. The function of grep is to filter an input stream. Reading the filesystem is not really in scope! (There’s a caveat for stuff like grep -r, where it’s nice to have it include the filename along with the matching line.)
Pipes in Bash reflect the classic “good” use of the Unix philosophy. Everyone intuitively understands that the natural, most elegant thing for a CLI program to do is to read from stdin, and write to stdout. These streams are so special, they are literally numbered (respectively) file descriptors 0 and 1 in Unix. A pipe, then, simply links the stdout of one program to the stdin of the next, creating beauty and harmony in your shell.
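As a minimal sketch of that linking, here is a pipeline in which no stage touches the filesystem. The fruit names are arbitrary sample input.

```shell
# Each stage reads its stdin (descriptor 0) and writes its stdout (descriptor 1);
# the pipe connects one stage's stdout to the next stage's stdin.
result=$(printf 'pear\napple\nplum\nfig\n' | grep p | sort | head -n 2)

# grep keeps the lines containing "p", sort orders them, head keeps the first two.
echo "$result"
```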
But, the filesystem continues to exist, which breaks this lovely symmetry down a bit. cat bridges the gap between filesystem and stdin/stdout, allowing the rest of the Bashverse to exist in a zen-like flow state.
So-called “useless” cat is actually quite useful
If you just want to grep a single file, cat | grep is more consistent with those programs’ respective “singular” purposes than grep <pattern> <filename>. When I try to do the latter, I get the order of “pattern” and “filename” wrong about 40% of the time. The fact that there are now 2 arguments is an indication that grep has been given too many responsibilities here.
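The argument-order mistake can be reproduced with the sketch below. The scratch directory and file contents are assumptions for the example, and the exact error status may vary between grep implementations, so the code only relies on it being nonzero.

```shell
# Work in a scratch directory that contains only myfile.txt (contents are illustrative).
tmpdir=$(mktemp -d)
printf 'foo\nbar\n' > "$tmpdir/myfile.txt"

# Correct order: pattern first, then filename.
(cd "$tmpdir" && grep foo myfile.txt)

# Swapped order: grep treats "myfile.txt" as the pattern and tries to open a file
# named "foo", which does not exist here, so it exits with an error status.
swapped_status=0
(cd "$tmpdir" && grep myfile.txt foo 2>/dev/null) || swapped_status=$?

# With cat at the head, grep takes a single argument, so there is nothing to swap.
piped=$(cat "$tmpdir/myfile.txt" | grep foo)
echo "swapped exit status: $swapped_status, piped match: $piped"

rm -r "$tmpdir"
```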
This is closely related to the common defense that cat | grep (or cat | cmd1 | cmd2 | ...etc) is “easier to read”. Everyone implicitly knows that cat is for reading files, and reading files shouldn’t be the responsibility of other programs.
If you ever find yourself writing command-line tools, you also quickly discover that it’s easier to write cmd1 and cmd2 if you know that the input always comes in on stdin. Otherwise, it’s annoying and fiddly to guess whether you’re reading from a file or from stdin based on the order of arguments provided, and you often have to run in a slightly different mode depending on which one it is.
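A rough illustration of the difference, using two hypothetical shell functions (shout and shout_dual are names invented for this sketch, not real tools):

```shell
# Hypothetical stdin-only filter: uppercases whatever arrives on stdin.
shout() {
    tr '[:lower:]' '[:upper:]'
}

# Hypothetical dual-mode variant: it has to inspect its arguments
# to decide whether input comes from a file or from stdin.
shout_dual() {
    if [ "$#" -gt 0 ]; then
        tr '[:lower:]' '[:upper:]' < "$1"
    else
        tr '[:lower:]' '[:upper:]'
    fi
}

# The stdin-only version composes with anything that writes to stdout.
out=$(printf 'hello\n' | shout)
echo "$out"
```

The stdin-only version is a single line of logic; the dual-mode version needs a branch before it does any real work, and that branch grows with every extra flag or argument.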
Conclusion
Use cat myfile.txt | ... as the head of a Bash pipeline. It’s good.