Here’s a puzzler for all you shell-heads (you know who you are). Normal souls, please move along — nothing interesting to see here.
OK. You’re sitting in the parent of “dirname.” Inside dirname and its children are files that you know contain the string “string.” You want a text file listing the names of all those files. You run:
grep -r "string" dirname > dirname/output.txt
One of two things happens:
1) A few seconds later you have the file listing you need.
2) The command runs forever and output.txt grows indefinitely, until you run out of disk space.
As I discovered the hard way, which of these two occurs depends on which version of grep is installed on your system. With 2.5.x, you get outcome #1. With any earlier version, you get outcome #2. On closer inspection, it’s easy to see what’s happening: grep is greedy, and is scanning the output file even as its own results are being written to it. Every line it writes contains “string,” so every line it reads back is a fresh match to report. Reading itself and simultaneously reporting into itself. Devilish. Fortunately I spotted my error before I filled the drive. And sending output to any location outside of dirname avoids the problem, of course.
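For the record, here is a minimal sketch of the safe version, with the output kept outside the tree being searched. The temporary directory and file names are made up for the demo, and the -l flag (print only the names of matching files) is my addition, since names are what we were after in the first place.

```shell
#!/bin/sh
# Build a throwaway tree so the example is self-contained.
# All paths here are hypothetical demo names.
work=$(mktemp -d)
mkdir -p "$work/dirname/sub"
printf 'a string here\n'  > "$work/dirname/a.txt"
printf 'another string\n' > "$work/dirname/sub/b.txt"
printf 'no match\n'       > "$work/dirname/c.txt"

# Redirect to a file OUTSIDE dirname, so grep never reads its own output.
# -l prints only the names of files that contain a match.
(cd "$work" && grep -rl "string" dirname > output.txt)

# Show the listing (sorted, since traversal order is not guaranteed).
sort "$work/output.txt"
```

Remove the scratch directory with rm -rf "$work" when you are done poking at it.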
But here’s the puzzler: How was this fixed in grep 2.5? grep is not doing the output redirection — the shell is. grep only knows to pass results to stdout. Beyond that is a black box. So how is grep 2.5 able to avoid the problem of infinite recursion? How was it made aware of what the shell is doing? Cue Twilight Zone intro music.
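One plausible mechanism, offered as a guess rather than a confirmed account of what grep 2.5 does: a program doesn’t need to know anything about the shell to find out where its stdout points. It can look up the device and inode behind its own stdout and skip any input file that has the same pair. The sketch below imitates that idea in shell. search_tree is a hypothetical helper, not part of grep, and the script assumes Linux, where /dev/stdout resolves to whatever file descriptor 1 currently points at.

```shell
#!/bin/sh
# Sketch only: a recursive search that skips any input file which is
# also its own standard output. Assumes Linux (/dev/stdout follows fd 1).
work=$(mktemp -d)
mkdir -p "$work/dirname"
printf 'a string here\n' > "$work/dirname/a.txt"

# search_tree is a hypothetical helper invented for this demo.
search_tree() {
    find "$1" -type f | while IFS= read -r f; do
        # -ef is true when both paths name the same device and inode.
        if [ "$f" -ef /dev/stdout ]; then
            echo "skipping $f: input file is also the output" >&2
            continue
        fi
        # -H prefixes each matching line with its file name.
        grep -H "string" "$f"
    done
}

# Same shape as the troublesome command: output lands inside dirname.
(cd "$work" && search_tree dirname > dirname/output.txt)
cat "$work/dirname/output.txt"
```

The point of the design is that the check happens entirely inside the searching process: the shell opens the file and hands over a descriptor, and the identity of that descriptor is something the process can inspect for itself.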
Freaky deaky, super geeky.
