tobold.org

correct • elegant • free

△ comp.unix.shell △

◅ filesize -> variable

using mknod ▻

This Week's Useless Use of Cat Award goes to...

In article <3E5211D2138E78A4.B962F8C82AB31411.C9CD600523197EF3@lp.airnews.net>,
 <shmartonak@ticnet.com> wrote:
>I'm more than willing to change my programming style/preference, (and my
>car and my preference of ice cream) but it would really help if someone
>could tell me why the use of 'cat' is bad.

Of course, it's only "useless" uses of `cat' that are bad; there are
some occasions where you really need `cat', and then it's the best thing
to use.

It's bad because it makes the system do more work than is necessary.  We
were comparing these two examples.

    cat foo |wc -l

    wc -l <foo

To execute the first, the shell must create a pipe, fork twice,
manipulate four file descriptors, and exec two programs.

To execute the second, the shell must fork once, manipulate one file
descriptor, and exec one program.
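As a minimal sketch of the comparison, the script below runs both forms on a small scratch file and shows that they give the same count; only the plumbing differs. The scratch directory, the file contents, and the variable names are made up for illustration, and `mktemp -d` is assumed to be available (it is common but not required by POSIX).

```shell
# Hypothetical scratch file, created only for this illustration.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' >"$dir/foo"

# cat copies the file into a pipe, and wc reads the pipe.
with_cat=$(cat "$dir/foo" | wc -l)

# The shell opens the file directly as wc's standard input.
with_redirect=$(wc -l <"$dir/foo")

# $((...)) strips the leading blanks that some wc implementations print.
echo "cat: $((with_cat))  redirect: $((with_redirect))"

rm -rf "$dir"
```

The two numbers printed should always agree, since `wc` sees exactly the same bytes either way.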

On my system, the difference adds up to over 1 second of elapsed time
in just 100 iterations.

    $ time sh -c 'for i in 0 1 2 3 4 5 6 7 8 9; do for j in 0 1 2 3 4 5 6 7 8 9; do
      cat /dev/null |wc -l
      done done'
    ...
    1.07user 0.79system 0:02.96elapsed 62%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (21840major+6459minor)pagefaults 0swaps

    $ time sh -c 'for i in 0 1 2 3 4 5 6 7 8 9; do for j in 0 1 2 3 4 5 6 7 8 9; do
      wc -l </dev/null
      done done'
    ...
    0.54user 0.45system 0:01.93elapsed 51%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (11494major+3470minor)pagefaults 0swaps

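Furthermore, the version with the useless `cat' does more work in
actually counting the file: every byte is copied through the pipe
before `wc' sees it.  On a large file the elapsed time is much the same,
but the system time rises from about 11 to about 15 seconds.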

    $ time sh -c 'cat vmware/nt4/nt4.dsk |wc -l'
    3285345
    6.14user 15.34system 1:56.09elapsed 18%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (162845major+152minor)pagefaults 64swaps

    $ time sh -c 'wc -l <vmware/nt4/nt4.dsk'
    3285345
    5.99user 11.15system 1:56.05elapsed 14%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (165559major+69minor)pagefaults 0swaps

If you're tempted to dismiss these as pointless micro-optimizations,
remember that even on today's systems it's easy to write a shell script
which runs so slowly as to be useless.  Removing a useless use of `cat'
from an inner loop might just help.
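To make the inner-loop case concrete, here is a rough sketch of a loop that totals the line counts of several files using redirection rather than `cat`. The scratch files and the variable names are invented for the example, and `mktemp -d` is again assumed to exist; a real script would loop over its own files.

```shell
# Hypothetical scratch files, created only for this illustration.
dir=$(mktemp -d)
printf 'x\n' >"$dir/a.txt"
printf 'x\ny\n' >"$dir/b.txt"

total=0
for f in "$dir"/*.txt; do
    # Redirect instead of writing: n=$(cat "$f" | wc -l)
    n=$(wc -l <"$f")
    total=$((total + n))
done
echo "total lines: $total"

rm -rf "$dir"
```

Each pass through the loop saves one pipe, one fork, and one exec compared with the `cat` form, which is where the saving in a long-running loop would come from.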

For one-shot commands, the difference is not worth worrying about.  But
even then, the `cat'-less version has the compelling advantage that it
requires four fewer characters to type! :-)

Tim.
--
Tim Goodwin   | "If you don't know what closures are, you probably don't
Leicester, UK | want to know what closures are." -- Larry Wall

Original headers:

From: tjg@star.le.ac.uk (Tim Goodwin)
Newsgroups: comp.unix.shell
Subject: Re: This Week's Useless Use of Cat Award goes to... (was Re: grep
  - show n lines after find)
Date: 7 Feb 2000 15:51:51 -0000
Message-ID: <87mpnd$j88$1@ltpcg.star.le.ac.uk>
References: <86skss$66t@romeo.logica.co.uk>
  <874820D8E7972ED0.3159BF20EF1CBFBF.644B22C55A63193C@lp.airnews.net>
  <FpJoFn.DzB@world.std.com>
  <3E5211D2138E78A4.B962F8C82AB31411.C9CD600523197EF3@lp.airnews.net>
