Better living through immutability or What have functional programmers ever done for us?

For more than thirty years, alongside conventional language development, there’s been a small but thriving culture of functional languages.  These are not like the languages we all know and love.  Their most obvious property is that they contain no variables at all; instead, they describe truths about the problem to be solved[1].

As anyone with even a passing familiarity with functional programming knows (and I accept that passing familiarity usually comes only after three to five years), things in a functional language don’t do anything (the way procedures execute or variables change in conventional languages); instead they are something, and what they are never changes.  A term may describe a function which defines a list of primes, or it may describe what it means to be an expression, or what-have-you.  But whatever it is, it describes it once and for all time – the function is the list of primes, it’s not a sequence of operations to compute the list.

Proponents of functional languages claim that by talking exclusively in terms of values, you get shorter, simpler, and more reliable programs.  If things are things, rather than do things, they reason, then if they ever work, they’ll always work.  If nothing ever changes, where can a bug hide?

Most programmers don’t do functional.  Most programmers are perfectly happy to continue using conventional, commonsense, imperative languages, and to them, functional programming is an arcane mystery.  Why bother learning Haskell when C and its descendants can do everything you need?

But that’s missing the point, because we all borrow ideas from functional languages all the time:

  • makefiles contain statements (from the general to the specific) about how to compile and link; the system works out the actual build sequence;
  • yacc, bison, and lemon[2] describe grammars, and completely hide the underlying parsing process;
  • SQL queries are translated on-the-fly into code which efficiently interrogates a database; and
  • XSLT (like Haskell) pattern-matches fragments of XML to assemble a whole new document.

No variables or explicit evaluation in sight in any of these.

But those things are not what this essay is about.  In this essay, I want to take the functional programmers at their word, and ask: how would it be if (within a conventional, imperative, object paradigm) we were to create objects which never changed; objects which are something, but don’t do anything;  objects with value, but no state?  What would that be like?

Actually, it would be very familiar.  Consider the following:

int five = 2+3;

We know (because we’re all experienced, sophisticated, grown-up programmers here) that this actually means something like:

five = new int;
new int(2).operatorAdd(new int(3)).operatorCopy (&five);
// cleanup / garbage-collect.

A few observations, if I may:

  • When we make one of these ints (by saying 2, or equivalently: new int (2)), we’re making something completely sealed.  We’re not claiming that the 2 is in any sense its initial state.  A 2 is what it is, and what it always will be.
  • The operatorAdd method, though it’s defined by the int class and exported from the int(2) object, does not act on the object at all.  It’s not changing the 2 into a 5.  It’s creating a whole new int, 5, which is independent of either the 2 or the 3 which gave rise to it.
  • In general, arithmetic is the easiest bit of any program to get right.  The basic semantics haven’t changed since the days of Fortran.  We never, ever have to ask about the current state of a number or who its owner is.  Numbers just are.

These ints, these classes whose states never change, are called immutable, and they’re very close in spirit to what goes on in functional languages.
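
To make the observations above concrete, here is a minimal Python sketch of such a sealed integer.  The Int class and its operator_add method are hypothetical stand-ins for the int and operatorAdd of the pseudocode; they are not how any real language implements integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """A hypothetical boxed integer; frozen, so it cannot be altered once built."""
    value: int

    def operator_add(self, other: "Int") -> "Int":
        # Reads both operands and builds a third object; neither operand is touched.
        return Int(self.value + other.value)


two, three = Int(2), Int(3)
five = two.operator_add(three)

print(two, three, five)  # Int(value=2) Int(value=3) Int(value=5)
```

Trying to assign to two.value raises an error, which is the point: the 2 stays a 2.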

But we tend not to write our own code this way.  We tend to use objects to collect state.  Something like this:

conventionalFileReader f;
f.setFileName (filename);
f.openFile ();
while (!f.eof) {
   string s;
   s = f.readLine ();
   // do something with s;
}
f.close();

That looks like pretty conventional programming, such as you’d see in a million helpers in a million programs.  But there’s a lot wrong with this.

Firstly, it’s not very encapsulatey.  In order to use an object like this, you need to know a lot about what a file is, and what states the file abstraction can be in.  If you were to replace this with some other kind of data source (say, streaming data rather than stored data, or hierarchical data rather than sequential data), then all this code would have to change significantly.

A second, more serious problem is that it undermines the type system.  Consider a function such as:

function copyFileTo (conventionalFileReader r, conventionalFileWriter w) {…}

This function knows very little about what r actually is.  Sure, it’s a conventionalFileReader, but what state is it in?  How much initialisation has been done to it: named, opened, what?  (For that matter, what would happen if you opened the conventionalFileReader before setting a name?)  What is copyFileTo supposed to do if the file is partially read?  Or if it’s already at eof?  Or if it’s already been closed?
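
The questions above can be made visible with a small Python sketch.  ConventionalFileReader and describe_state are hypothetical names invented for this illustration; the point is only that the type stays the same while the set of legal operations keeps changing.

```python
class ConventionalFileReader:
    """Mutable reader: each method is only valid in some of its states."""

    def __init__(self):
        self._name = None
        self._handle = None

    def set_file_name(self, name):
        self._name = name

    def open_file(self):
        if self._name is None:
            raise RuntimeError("open_file() called before set_file_name()")
        self._handle = open(self._name, encoding="utf-8")

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None


def describe_state(reader):
    """What a function like copyFileTo would have to find out before using `reader`."""
    if reader._name is None:
        return "unnamed"
    if reader._handle is None:
        return "named, not open"
    return "open, position unknown"


r = ConventionalFileReader()
print(describe_state(r))   # unnamed
r.set_file_name("example.txt")
print(describe_state(r))   # named, not open
```

Both calls receive a ConventionalFileReader, and the type says nothing about which of the three answers applies.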

OK, clearly there’s lots wrong with this approach, so let’s see if a functional approach can do any better.  Remember, the point about a functional approach is that things don’t do anything, instead they are things.  So our immutableFileReader won’t have any verbs at all.  Instead, it will be precisely the contents of the file that it indicates.  Like this:

f = new immutableFileReader (filename);  //Must define it completely at construction time.
foreach (string s in f.lines) //do something with s;
print f; // Informative: in its most basic form, it’s just the file contents

I’m sure you’ll agree, this is a much cleaner expression of what we want than the original, and there’s much less to go wrong (precisely because we’ve encapsulated all the state that – correctly – is private).  It’s also a much safer thing to pass around, because the state is properly defined – in fact, there is no state; it just is.  Changing this to work with streaming data would be trivial.

It shouldn’t be hard to persuade you that this is easier for the user of the class.  But isn’t this simplicity paid for by increased complexity inside?[3]  Not really:

class immutableFileReader {
    construct (string filename) { _fileHandle = openfile (filename); }
    // keep the file locked for the lifetime of the object.
    destroy () { close (_fileHandle); }
    iterator <string> lines {
        filehandle f = _fileHandle; // use the open file, create new filepointer and buffer
        while (!eof(f)) yield f.readLine();
    }
}
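
For readers who would like something they can run, here is one possible rendering of the same idea in Python.  It is a sketch under assumptions, not a translation: instead of holding one handle open (and locked) for the lifetime of the object, it checks the file at construction and re-opens it by name for each iteration, so it does not protect against another process changing the file.  The demonstration file and its contents are invented.

```python
import os
import tempfile


class ImmutableFileReader:
    """Stands for the contents of one text file; complete once constructed."""

    def __init__(self, filename, encoding="utf-8"):
        # Open once up front so a missing or unreadable file fails here,
        # not at first use.
        with open(filename, encoding=encoding):
            pass
        self._filename = filename
        self._encoding = encoding

    @property
    def lines(self):
        # Each access builds a fresh generator with its own file position,
        # so two loops over the same reader never disturb each other.
        with open(self._filename, encoding=self._encoding) as handle:
            for line in handle:
                yield line.rstrip("\n")

    def __str__(self):
        # In its most basic form, the reader just is the file contents.
        with open(self._filename, encoding=self._encoding) as handle:
            return handle.read()


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write("alpha\nbeta\n")

    f = ImmutableFileReader(path)
    for s in f.lines:
        print(s)
    print(list(f.lines) == list(f.lines))  # True: reading does not use the reader up
```

There is no open, no eof, and no close for the caller to get wrong; the only way to hold one of these is to hold a working one.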

So far, so good.  But what if we want to add attributes to these objects?  It’s not always a good idea to shove everything into enormous parameter lists in the constructor.  What’s more, it may be necessary to add attributes progressively.  How do immutable objects cope with these situations?

The answer is composition.  It works much like the addition over integers that we just saw.  Like this:

f = new immutableFileReader (filename).lineTerm ("\r\n").charSet (utf8);
foreach (string s in f.lines) //do something with s;

These look like conventional setter functions, but there’s a difference: instead of changing attributes inside the object, they create whole new objects, like this:

class immutableFileReader {
    immutableFileReader lineTerm (string t) {
        r = new (this); // copy constructor;
        r._lineTerm = t;
        return r;
    }
}
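
In Python, the same copy-and-change pattern can be sketched with a frozen dataclass and dataclasses.replace, which builds a new instance with selected fields changed.  The field names terminator and encoding, and the file name example.txt, are assumptions made for this illustration; the sketch models only the attributes and never opens a file.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImmutableFileReader:
    filename: str
    terminator: str = "\n"
    encoding: str = "ascii"

    def line_term(self, t: str) -> "ImmutableFileReader":
        # Copy this object, changing one field; `self` is left as it was.
        return replace(self, terminator=t)

    def char_set(self, name: str) -> "ImmutableFileReader":
        return replace(self, encoding=name)


base = ImmutableFileReader("example.txt")
f = base.line_term("\r\n").char_set("utf-8")

print(base.encoding, f.encoding)  # ascii utf-8
```

The chained call builds an intermediate reader that nobody keeps, and base is exactly what it was before.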

Thus the example above will generate three immutableFileReaders, only one of which is retained (and the other two get swallowed by the garbage collector)[4].

To summarise, then: immutable objects have a number of characteristics:

  • Everything you need to know to initialise the object needs to be in the constructor.  If the constructor returns at all (i.e. if it doesn’t throw an exception), it returns a fully-formed object with its engines started and warmed up, pre-flight checks all complete, and cleared for take-off.
  • There are no setters (neither property setters nor setter functions) because there’s no state to set.
  • To a first approximation: if you need to adjust the object in some way, you do so by creating a new object with the changed properties[5].  five = 2+3 doesn’t change the 2; it makes a new 5.
  • Pointer copy, shallow copy, deep copy: they’re all the same.  Objects represent value, not state.

And, just to be clear about this:

  • Objects are not entirely stateless.  They are allowed to have internal state.  An object can change itself, create side-effects, and initialise internal objects and variables.  What is forbidden is to externalise those internal states. From the outside, it should be eternal and unchanging, whatever is going on inside.
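
A common case of permitted internal state is a cache.  The Python sketch below uses a hypothetical PrimesBelow class (the name and the trial-division method are just for illustration): it computes its primes the first time they are asked for and remembers them, yet from the outside it always looks like the same fixed value.

```python
from math import isqrt


class PrimesBelow:
    """The primes below `limit`: constant from outside, lazily cached inside."""

    def __init__(self, limit):
        self._limit = limit
        self._cache = None  # internal state, never exposed directly

    @property
    def values(self):
        if self._cache is None:
            # Trial division; a tuple is returned so callers cannot alter the cache.
            self._cache = tuple(
                n for n in range(2, self._limit)
                if all(n % d for d in range(2, isqrt(n) + 1))
            )
        return self._cache


p = PrimesBelow(20)
print(p.values)  # (2, 3, 5, 7, 11, 13, 17, 19)
```

The first access changes the object internally, but no caller can observe a difference between the first access and the hundredth.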

The functional programmers are right: when nothing changes, programs are shorter and more reliable.  Immutable objects deliver that approach – at least in part – right into the heart of modern, conventional object programming.  So valuable is it that Domain-Driven Design[6] recommends that value objects (as distinct from entities, which have identity and changing state) be treated as immutable[7].

So, apart from giving us shorter code, simpler utilisation, clearer protocols, and drastically more reliable programs; apart from all that: what has functional programming ever done for us?


[1] Actually, I know that there’s an even more obvious property: that everything is defined in terms of functions.  The way the community describes it is that functions are (perhaps the only) first-class objects.  But what that really means is a topic for another essay.

[2] http://www.hwaci.com/sw/lemon/lemon.html

[3] Even if that were the tradeoff, it would probably be a good one.  It has to be better to write the protocol once (inside the class) than to repeat it at every use.  Furthermore, if we don’t describe the complete protocol inside the class, where else can we describe it?  Making every client responsible for complying with a protocol that isn’t explicated anywhere seems like a recipe for disaster!

[4] Why not use conventional setters, and avoid all that unnecessary construction and destruction? Because we care about value semantics.  Once we’ve given a value to f, we want it to stick.  In the same way that a 2 shouldn’t spontaneously mutate into a 5, f shouldn’t change its attributes.  It should remain what it always was.

But that’s not to say that practical optimisations can’t be made.  It would be perfectly acceptable to modify the object directly when it was known for certain that the intermediate value was to be discarded – that is, we could turn a 2 into a 5 if we knew the 2 was being obviated in the process.  In languages like C# and Java, where all objects have pointer semantics, that’s very hard to arrange (and good garbage collection makes it unnecessary in any case).  But languages like C++ permit explicit assignment operators and smart pointers, and we can keep track of every assignment and de-assignment.  We know when an object is created simply to be disposed (or more accurately, we know when the value of an object is preserved) and we can use that knowledge to do in-place modifications which don’t upset the value semantics.  But that’s a different essay.

[5] This is needed to preserve the value semantics.

[6] Eric Evans, Domain-Driven Design, Addison-Wesley, 2004.

[7] Ibid., pp. 97-103.


