This Week's Useless Use of Cat Award goes to...
In article <3E5211D2138E78A4.B962F8C82AB31411.C9CD600523197EF3@lp.airnews.net>,
 <shmartonak@ticnet.com> wrote:
>I'm more than willing to change my programming style/preference, (and my
>car and my preference of ice cream) but it would really help if someone
>could tell me why the use of 'cat' is bad.

Of course, it's only "useless" uses of `cat' that are bad; there are
some occasions where you really need `cat', and then it's the best thing
to use.

It's bad because it makes the system do more work than is necessary. We
were comparing these two examples.

    cat foo |wc -l
    wc -l <foo

To execute the first, the shell must create a pipe, fork twice,
manipulate four file descriptors, and exec two programs. To execute the
second, the shell must fork once, manipulate one file descriptor, and
exec one program.

On my system, the difference adds up to over 1 second in just 100
iterations.

    $ time sh -c 'for i in 0 1 2 3 4 5 6 7 8 9; do
      for j in 0 1 2 3 4 5 6 7 8 9; do
        cat /dev/null |wc -l
      done
    done'
    ...
    1.07user 0.79system 0:02.96elapsed 62%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (21840major+6459minor)pagefaults 0swaps

    $ time sh -c 'for i in 0 1 2 3 4 5 6 7 8 9; do
      for j in 0 1 2 3 4 5 6 7 8 9; do
        wc -l </dev/null
      done
    done'
    ...
    0.54user 0.45system 0:01.93elapsed 51%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (11494major+3470minor)pagefaults 0swaps

Furthermore, the version with the useless `cat' does much more work in
actually counting the file.
    $ time sh -c 'cat vmware/nt4/nt4.dsk |wc -l'
    3285345
    6.14user 15.34system 1:56.09elapsed 18%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (162845major+152minor)pagefaults 64swaps

    $ time sh -c 'wc -l <vmware/nt4/nt4.dsk'
    3285345
    5.99user 11.15system 1:56.05elapsed 14%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (165559major+69minor)pagefaults 0swaps

If you're tempted to dismiss these as pointless micro-optimizations,
remember that even on today's systems it's easy to write a shell script
which runs so slowly as to be useless. Removing a useless use of `cat'
from an inner loop might just help.

For one-shot commands, the difference is not worth worrying about. But
even then, the `cat'-less version has the compelling advantage that it
requires four fewer characters to type! :-)

Tim.
--
Tim Goodwin   | "If you don't know what closures are, you probably don't
Leicester, UK | want to know what closures are." -- Larry Wall
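
The comparison in the post can be re-run with a small harness along these
lines. This is only a sketch: the function names count_with_cat,
count_with_redirect, and repeat are hypothetical helpers that do not appear
in the original article, and the numbers on a current machine will not match
the figures quoted in the post.

```sh
#!/bin/sh
# Hypothetical helpers (not from the original post) for comparing
# the pipeline form with the redirection form.

# Pipeline form: the shell sets up a pipe and runs two programs.
count_with_cat() {
    cat "$1" | wc -l
}

# Redirection form: the shell opens the file itself and runs one program.
count_with_redirect() {
    wc -l < "$1"
}

# Run a command N times, discarding its output, so the loop can be timed.
# Usage: repeat N command [args...]
# Note: n and i are ordinary (global) shell variables in plain sh.
repeat() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
}
```

In a shell where `time' is a keyword (bash, for example), running
`time repeat 100 count_with_cat /dev/null' and then
`time repeat 100 count_with_redirect /dev/null' should give a rough
picture of the per-iteration overhead, assuming nothing else is loading
the machine.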
Original headers:
From: tjg@star.le.ac.uk (Tim Goodwin)
Newsgroups: comp.unix.shell
Subject: Re: This Week's Useless Use of Cat Award goes to... (was Re: grep - show n lines after find)
Date: 7 Feb 2000 15:51:51 -0000
Message-ID: <87mpnd$j88$1@ltpcg.star.le.ac.uk>
References: <86skss$66t@romeo.logica.co.uk> <874820D8E7972ED0.3159BF20EF1CBFBF.644B22C55A63193C@lp.airnews.net> <FpJoFn.DzB@world.std.com> <3E5211D2138E78A4.B962F8C82AB31411.C9CD600523197EF3@lp.airnews.net>