tobold.org

correct • elegant • free

△ How to do IO in Haskell △

◅ The Handle type

IO encumbrance can be cumbersome ▻

Error Handling

In many programming languages, IO gets smothered under the extra code required to check for and handle possible errors. (Even the classic Hello, world! program in C is arguably incomplete, as there is no error checking.) Another minor miracle of the IO monad in Haskell is that it provides basic error handling totally for free.

So far, we have seen that IO actions return an "IO encumbered" value (which may be (), or something more exciting). But any IO action can also fail. An IO failure is "out of band": you do not have to examine the return value to see if the operation failed. Indeed, there won't be a return value!

Suppose we have a do expression containing a number of IO actions, and one of them fails. The rest of the do expression will not be evaluated; instead, the entire do expression (which is itself an IO action) immediately fails. The failure propagates upwards to main, and beyond: when main fails, the run-time system prints an error message, and the program terminates.

We'll use a new example to explore error handling. Here's a program, interest0.hs, which - for each file given on the command line - uses a simple test to identify whether or not it is interesting.

import Data.List (isInfixOf)
import System.Environment (getArgs)

main = do
  args <- getArgs
  mapM_ identify args

identify :: String -> IO ()
identify f = do
  putStr (f ++ ": ")
  g <- isInteresting f
  putStrLn (if g then "interesting" else "boring")

isInteresting :: String -> IO Bool
isInteresting f = do
  x <- readFile f
  return ("Haskell" `isInfixOf` x)

We can provoke an IO failure from this program simply by asking it about a file which does not exist:

$ runhugs interest0.hs index.txt helloworld.hs NONESUCH age.hs
index.txt: interesting
helloworld.hs: boring
NONESUCH:
Program error: NONESUCH: IO.openFile: does not exist (file does not exist)

As expected, the IO failure causes the program to terminate with an error message immediately. Note that the failure is generated in the function isInteresting, but it causes an early exit from the do expression in identify (the putStrLn statement is not executed), and also from the mapM_ in main (subsequent files are not examined).

For many programs, this default handling of IO failures is ideal. At the very least, it's a good default. But sometimes we need to take control and recover gracefully from failure. In the case of this example, we would like to report non-existent (or otherwise unreadable) files, and then proceed to test the remaining files.

The function catch takes two arguments, so let's say we've invoked it as catch action handler. The first argument, action, is an IO action to be performed. If the IO action succeeds, catch simply returns its value. But if the IO action fails, then instead of the usual failure propagation, the function handler is invoked, and the IO action it returns is performed in place of the failed one. The handler function receives a single argument: we'll see what this is in a moment. The action it returns must produce a value of the same type as action. Armed with this knowledge, we can start improving the example. Here's interest1.hs.

import Control.Exception (IOException, catch)
import Data.List (isInfixOf)
import System.Environment (getArgs)

main = do
  args <- getArgs
  mapM_ identify args

identify :: String -> IO ()
identify f = do
  putStr (f ++ ": ")
  g <- catch (isInteresting f) handler
  putStrLn (if g then "interesting" else "boring")
  where
    handler :: IOException -> IO Bool
    handler e = do
      putStr "(unreadable) "
      return False

isInteresting :: String -> IO Bool
isInteresting f = do
  x <- readFile f
  return ("Haskell" `isInfixOf` x)

And this is what it looks like in action:

$ runhugs interest1.hs index.txt helloworld.hs NONESUCH age.hs
index.txt: interesting
helloworld.hs: boring
NONESUCH: (unreadable) boring
age.hs: interesting

Not perfection, but a step in the right direction! At least the program now continues and examines all the files. There are a couple of obvious improvements to be made, though. First, it would be nice to get more information about why the file is unreadable. Secondly, it's a bit presumptuous to say that every unreadable file is boring, but we've painted ourselves into a corner by using Bool types: there simply is no room (in the type!) for anything other than True or False. Remember that the handler must have the same return type as the action.
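The rule that the handler and the action share a return type can be seen in isolation in a small sketch. The names withDefault and useDefault are hypothetical helpers invented for this illustration, not part of the interest programs, and the file NONESUCH is assumed not to exist.

```haskell
import Control.Exception (IOException, catch)

-- Hypothetical helper: perform an action, but yield a default value
-- if the action fails with an IO error.
withDefault :: a -> IO a -> IO a
withDefault def action = catch action (useDefault def)

-- The handler must produce the same type (IO a) as the action it replaces.
useDefault :: a -> IOException -> IO a
useDefault def _ = return def

main :: IO ()
main = do
  -- Assumes NONESUCH does not exist, so the default 0 is used.
  n <- withDefault 0 (fmap length (readFile "NONESUCH"))
  print n
```

Whatever type the action returns, the default has to inhabit that same type, which is exactly why a Bool action leaves no room for a third answer.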

For more information about the failure, we need to examine the argument to our handler function. This is of type IOError, which is a synonym for the IOException type named in the handler's signature. There are various things we can do with a value of type IOError, which we'll come to soon. For now, we will simply show (or print) it, which should produce a reasonable error message for human consumption.

To avoid the conclusion that all unreadable files are boring, we need to use a type with more than two values. We could define a new type, but there is an obvious candidate that already exists in Haskell: the type Maybe Bool, which of course has three possible values (Just True, Just False, and Nothing). This brings us to interest2.hs.

import Control.Exception (IOException, catch)
import Data.List (isInfixOf)
import System.Environment (getArgs)

main = do
  args <- getArgs
  mapM_ identify args

identify :: String -> IO ()
identify f = do
  g <- catch (isInteresting f) handler
  case g of
    Just h -> putStrLn (f ++ ": " ++ if h then "interesting" else "boring")
    Nothing -> return ()
  where
    handler :: IOException -> IO (Maybe Bool)
    handler e = do
      print e
      return Nothing

isInteresting :: String -> IO (Maybe Bool)
isInteresting f = do
  x <- readFile f
  return (Just ("Haskell" `isInfixOf` x))

And in action:

$ runhugs interest2.hs index.txt helloworld.hs NONESUCH age.hs
index.txt: interesting
helloworld.hs: boring
NONESUCH: IO.openFile: does not exist (file does not exist)
age.hs: interesting

This is the effect we were after, but the code seems a little hard to follow. An alternative implementation uses the Either type constructor, which is defined in the standard prelude. The Either type constructor is very similar to the Maybe type constructor, but it also has room for a reason why there is a missing value: ideal for error handling. Conventionally, an operation which can either return a result or fail returns the result as a Right value of the Either type (like Just), or the reason for failure as a Left value (like Nothing, but with extra information). Yes, this is a rather weak pun on "right" as opposed to both "left" and "wrong".
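As a rough illustration of this convention away from IO, here is a pure sketch. The function safeDiv and its error message are invented for this example; only Either, Left and Right come from the prelude.

```haskell
-- Hypothetical example: division that reports why it failed.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"   -- the reason for failure
safeDiv x y = Right (x `div` y)         -- the successful result

-- Pattern matching on Left and Right recovers either the reason or the result.
describe :: Either String Int -> String
describe (Right q)  = "result: " ++ show q
describe (Left why) = "failed: " ++ why

main :: IO ()
main = do
  putStrLn (describe (safeDiv 10 2))   -- prints "result: 5"
  putStrLn (describe (safeDiv 1 0))    -- prints "failed: division by zero"
```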

The Left and Right sides of an Either type are independent: they can, and usually will, be of different types. So here's interest3.hs, using the type Either IOError Bool and - in my opinion - looking a little cleaner than our previous version.

import Control.Exception (catch)
import Data.List (isPrefixOf)
import System.Environment (getArgs)
import System.IO (hPrint, stderr)

main = do
  args <- getArgs
  mapM_ identify args

identify :: String -> IO ()
identify f = do
  g <- isInteresting f
  case g of
    Right h -> putStrLn (f ++ ": " ++ if h then "interesting" else "boring")
    Left e  -> hPrint stderr e

-- The signature fixes the error type, so catch knows which failures to handle.
isInteresting :: String -> IO (Either IOError Bool)
isInteresting f = catch action handler
    where
      action = do
        x <- readFile f
        return (Right (x `contains` "Haskell"))
      handler e = return (Left e)

contains :: String -> String -> Bool
[] `contains` _ = False
(x:xs) `contains` y = y `isPrefixOf` (x:xs) || xs `contains` y

As a bonus, in this version the errors are now written to standard error (with hPrint stderr), as is conventional. Otherwise, this version behaves identically to the previous one.
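The pattern in isInteresting, where a failure is wrapped in Left and a success in Right, is common enough that Control.Exception provides a function, try, which does the wrapping itself. The sketch below is an alternative formulation rather than the version shown above: it assumes GHC's Control.Exception, and it uses isInfixOf from the earlier versions in place of contains.

```haskell
import Control.Exception (IOException, try)
import Data.List (isInfixOf)
import System.Environment (getArgs)
import System.IO (hPrint, stderr)

main :: IO ()
main = do
  args <- getArgs
  mapM_ identify args

identify :: FilePath -> IO ()
identify f = do
  g <- isInteresting f
  case g of
    Right h -> putStrLn (f ++ ": " ++ if h then "interesting" else "boring")
    Left e  -> hPrint stderr e

-- try performs the action and returns Right on success, Left on failure.
-- The type signature tells try to intercept IO errors only.
isInteresting :: FilePath -> IO (Either IOException Bool)
isInteresting f = try (fmap ("Haskell" `isInfixOf`) (readFile f))
```

With try there is no separate handler to write, at the cost of always having to unpack an Either at the call site.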

We are getting cannier in our handling of errors, but it is possible to be more subtle yet. As an example problem, consider the rc shell, which reads an initialization file $HOME/.rcrc on startup. We do not consider it a problem if this file doesn't exist, but we should warn the user if the file exists, but cannot be read (most likely because it has faulty permissions). How can we do this in Haskell? (I should point out that rc is written in C, not Haskell!)

We need a way to distinguish different possible errors; in this case, we need to handle a "file does not exist" error differently from any other error. There is a whole slew of functions in System.IO.Error that examine error values: the one we want is isDoesNotExistError :: IOError -> Bool. This function takes a value of type IOError, in other words the argument to our catch handler function, and returns True if the error represents "file does not exist", otherwise False.

So here's rcrc.hs, which looks for the user's .rcrc file (using System.Environment.getEnv to discover their home directory). If the file can be read, it is copied to standard output. If an error occurs, the error message is written to standard error in the usual way, unless the error was "file does not exist", in which case it is silently ignored.

import Control.Exception (IOException, catch)
import Control.Monad (unless)
import System.Environment (getEnv)
import System.IO (hPrint, stderr)
import System.IO.Error (isDoesNotExistError)

main = do
  home <- getEnv "HOME"
  let f = home ++ "/.rcrc"
  catch (runrcrc f) norcrc

runrcrc :: FilePath -> IO ()
runrcrc f = do
  x <- readFile f
  putStr x

norcrc :: IOException -> IO ()
norcrc e = unless (isDoesNotExistError e) (hPrint stderr e)

Here it is in action:

$ echo Hello, world! > $home/.rcrc # readable file...
$ runghc rcrc.hs                   # ...is copied to stdout
Hello, world!

$ chmod 0 $home/.rcrc              # unreadable file...
$ runghc rcrc.hs                   # ...provokes an error
/home/libra/.rcrc: openFile: permission denied (Permission denied)

$ rm $home/.rcrc                   # non-existent file...
$ runghc rcrc.hs                   # ...is silently ignored
$

Here is the complete list of functions for testing error values.

These ungainly-named functions are defined in System.IO.Error for interrogating error values in a catch handler function. They all have type IOError -> Bool, and return True iff the error value represents an error of the appropriate type.

isAlreadyExistsError :: IOError -> Bool
The operation failed because one of its arguments already exists. For example, if you createDirectory "/tmp/", you will get an "already exists" error (at least on any sane Unix box!).
isDoesNotExistError :: IOError -> Bool
The operation failed because one of its arguments does not exist. For example, if you createDirectory "/tmp/foo/bar", you will get a "does not exist" error (unless you happen to have a directory called /tmp/foo!).
isAlreadyInUseError :: IOError -> Bool
The operation failed because one of its arguments is a single-use resource, which is already being used. This is a slightly hard one to provoke, but do { writeFile "/tmp/foo" "foo"; x <- readFile "/tmp/foo"; writeFile "/tmp/foo" "qux" } does the job. The reason for this is that readFile reads the file lazily, so since we haven't used its result (the value of x), the file is still in a "semi-closed" state. Thus the second writeFile fails with this error.
isFullError :: IOError -> Bool
The operation failed because the device is full.
isEOFError :: IOError -> Bool
The operation failed because the end of file has been reached.
isIllegalOperation :: IOError -> Bool
A catch-all error: the operation was not possible.
isPermissionError :: IOError -> Bool
The operation failed because the user does not have sufficient operating system privilege to perform that operation.
isUserError :: IOError -> Bool
A programmer-defined error value has been raised using fail.
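
To show how these predicates might be combined, here is a minimal sketch that turns an error value into a short description. The function classify and its strings are invented for this illustration. The error values are built by hand with mkIOError and the error-type constants from System.IO.Error, so the example needs no real IO failure.

```haskell
import System.IO.Error
  ( isAlreadyExistsError, isDoesNotExistError, isAlreadyInUseError
  , isFullError, isEOFError, isIllegalOperation
  , isPermissionError, isUserError
  , mkIOError, doesNotExistErrorType, permissionErrorType )

-- Hypothetical helper: name the kind of failure an error value represents.
classify :: IOError -> String
classify e
  | isAlreadyExistsError e = "already exists"
  | isDoesNotExistError e  = "does not exist"
  | isAlreadyInUseError e  = "already in use"
  | isFullError e          = "device full"
  | isEOFError e           = "end of file"
  | isIllegalOperation e   = "illegal operation"
  | isPermissionError e    = "permission denied"
  | isUserError e          = "user error"
  | otherwise              = "some other error"

main :: IO ()
main = do
  -- prints "user error"
  putStrLn (classify (userError "oops"))
  -- prints "does not exist"
  putStrLn (classify (mkIOError doesNotExistErrorType "demo" Nothing (Just "NONESUCH")))
  -- prints "permission denied"
  putStrLn (classify (mkIOError permissionErrorType "demo" Nothing Nothing))
```

A function like classify could serve as the body of a catch handler, since its argument has the same type as the value the handler receives.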
