correct • elegant • free

△ Why Haskell Is My Favourite Programming Language △

◅ Types

Lazy ▻


or, Referential Transparency

Consider this simple statement:

x = x + 1;

I intended that as C, but it could be almost any language, perhaps with minor variations (even Scheme has the (set! x (+1 x)) equivalent). We all know what the statement means. But, then, what does x mean? It means one thing before that statement is executed, and a different thing afterwards.

Well, no, you're thinking, really x is not a value, but a reference to a location that holds a value. (Algol 68 had this about right, where the declaration of x in this case would be ref int x;.) And, y'know, this is a way of thinking about programming that I'm as used to as you are. It works. But it may not be the best possible way to write programs.

One thing that x = x + 1; clearly isn't, is the language of mathematics. (Some old-fashioned programming languages would use := here to indicate that something a bit fishy is going on, but Dennis Ritchie felt, rightly enough, that that's just pointless extra typing.)

At college, one of the modules was Denotational Semantics. It is an attempt to work out what computer programs actually mean. I enjoyed it not at all. Like formal program construction, it seemed to require an enormous amount of effort to say almost nothing about tiny toy programs. And a huge amount of the pain in semantics comes from the necessity of dealing with statements like x = x + 1;.

Haskell, as you've probably anticipated, takes a different approach: equals can be substituted for equals. In any given scope of a binding, a name has precisely one meaning. Haskell is, in this respect, a bit like a restricted subset of C where you cannot say int x;, only int x = <something>, and you are not allowed to assign a different value to x afterwards. (In this bizarre world you're not allowed to assign to function arguments either.) Haskell is also rather like mathematics.

As a trivial example, in the last chapter we had:

asciiList str = map ord str

The first thing a mathematician, or a Haskell programmer, would do with that equation is to knock the str off both sides. And indeed, if a seasoned Haskell programmer bothered to define asciiList at all, they would undoubtedly write:

asciiList = map ord

Referential transparency, or "equals can be substituted for equals", has far-reaching consequences. For a start, it means thinking in a quite different way about programming (not unlike the way you have to think in SQL, or XSLT, or maths): it's not just a cutsey trick to write asciiList = map ord. There is a cost to thinking this way, but also enormous benefits.

Referential transparency means that the formal meaning of your programs is much clearer and simpler. You might not care too much about formal semantics (I don't) but this also means that the compiler has a much clearer idea what your program means. Instead of being a list of instructions which will only work if followed in order, it becomes a description of a computation, which the compiler is able to optimize in ways which knock cc -O2 into a cocked hat (again, not unlike a SQL query optimizer). We'll see some examples soon.

There is a huge class of nasty bugs that disappear with referential transparency. It enables understandable parallelization, and I strongly suspect that parallel computing would already be more important than it is if only we had languages that made it more understandable (and I'm certain that parallel computing will become ever more important in the future)

Incidentally, you can say in Haskell:

let x = x + 1 in x

but that actually represents an infinite loop. In the language of formal semantics, the value of x is ⊥ (which should look a bit like _|_), also known as "bottom". What ⊥ really means is that x doesn't have a value: the computation either fails to terminate (as in this case), or results in a run-time error. (As an example of the latter, head [] ⇒ ⊥, which you should read as "head of the empty list reduces to bottom", this time with a run-time error).

Finally, the most important thing about referential transparency is that it enables lazy evaluation, but that's the subject of the next chapter.

△ Why Haskell Is My Favourite Programming Language △

◅ Types

Lazy ▻