Exceptions Considered Harmful or: throw is just goto in drag

The two pillars of modern software design are objects and exceptions. Objects allow us to partition very large systems into smaller, comprehensible chunks, and exceptions keep them running smoothly by protecting them against malfunctions.

Funny thing is: though modern software is huge in comparison to the old-fashioned variety, it doesn’t seem any more reliable. Everybody ‘knows’ that exceptions make programs better, yet I’ve never seen any real evidence, nor any reason even to suspect, that it’s true. In point of fact, I don’t believe it, and I rather suspect that in most cases exceptions make matters worse.

To understand why, we need to look at the origin of exceptions, and what modern exceptions are made of.

At heart, an exception is a mechanism whereby a program can break out of its normal flow and do something else. In principle, the reason we’re breaking out in this way is that something ‘unplanned’ has happened, and we need to do something out-of-the-ordinary to cope with it.

Since the earliest days of computers, a mechanism to do precisely this has been built into practically every processor. Interrupts handle unpredictable events, like keypresses, timers, and network traffic. The interrupt fires, the processor calls a special subroutine, which does whatever it does and then returns to the point of interruption, with the interrupted process being none the wiser.

But the same mechanism is used to clear fault conditions. For example, there’s an interrupt that fires when the processor addresses non-existent memory[i] – it’s used to handle memory paging, and it works so smoothly that not even the programs that live in the non-existent memory know they’re being swapped in and out. Another example: processors intended for embedded use all have something called a ‘watchdog’[ii] – a timer that needs to be constantly reset, otherwise it times out and triggers an interrupt. The principle here is that, if the software fails in some way and the timer isn’t reset, the watchdog interrupt handler can step in, fix what’s wrong, possibly rewind to some ‘last known good’ configuration, and then restart the program. It’s intended to work as an automatic version of a programmer, such as you, halting the code in a debugger and then patching things up before restarting it. In both cases (actually, as with all interrupts), the interrupt handler is called as a subroutine of the main program: it decides whether to return to the point where it was called, or somewhere further down the stack.
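To pin the shape down, here’s a minimal sketch of the watchdog idiom in embedded-style C++. Everything in it – the register address, the magic value, the recovery routines – is invented for illustration; every real chip has its own:

#include <cstdint>

// Hypothetical memory-mapped watchdog 'kick' register; real addresses vary by chip.
static volatile uint32_t* const WDT_KICK = reinterpret_cast<volatile uint32_t*>(0x40001000);

void doOneUnitOfWork();      // the application proper (assumed to exist elsewhere)
void restoreLastKnownGood(); // rewind to a safe configuration (assumed)
void restartMainLoop();      // start the program over (assumed)

inline void kickWatchdog() { *WDT_KICK = 0xA5A5A5A5; } // reset the timer

void mainLoop() {
   for (;;) {
      doOneUnitOfWork();
      kickWatchdog();   // if we hang inside doOneUnitOfWork(), this never runs...
   }
}

// ...and the watchdog interrupt fires instead. Like any interrupt handler, it is
// in effect called as a subroutine of whatever was running when the timer expired.
extern "C" void watchdogISR() {
   restoreLastKnownGood();
   restartMainLoop();
}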

When structured programming came along, there was not much tolerance for interrupts – structured programming was all about predictability and theoretical correctness. The standard way of signalling to a program that something wasn’t working right was to return error codes. The C idiom was typical of the breed:

int error;
if (error = doSomeWork()) { /* error recovery */ }
if (error = readMyFileFormat()) { /* error recovery */ }

and everybody agrees that this is hideously ugly, because it mixes error handling in with the working code, and you can’t see the logic for the trees[iii]. But C provided another mechanism. It’s not very well known, but you tend to see it used a lot in real-time programs (where coping with the unpredictable is the point of the exercise).

It consisted of a pair of calls, setjmp() and longjmp()[iv]. The idea was that setjmp() recorded the program counter and stack pointer, and longjmp() later restored them. The effect was that setjmp() did nothing except populate the jump buffer. But when longjmp() was called, it didn’t return at all – instead, the program woke up again back at the setjmp() (which, this time, returned the error code passed to longjmp()).

#include <setjmp.h>

jmp_buf v;

void readMyFileFormat () {
   …
   if (allGonePearShaped) longjmp(v, reason);
   …
}

void wrapper () {
   // set-up
   int reason = setjmp(v); // returns 0 on the first pass; the longjmp code thereafter
   if (reason) {
      // something went badly wrong in readMyFileFormat() - fix it up
   }
   readMyFileFormat ();
}

Now, I’m sure you can see that this technique requires a lot of discipline if it’s not to make matters worse (imagine the mess that would ensue if somebody called longjmp(v) after wrapper() had returned!), so there’s a safer (but less often used) version where the jump vector is passed as a parameter into the procedure that it’s protecting, thus:

void readMyFileFormat (jmp_buf v) { // jmp_buf is an array type, so it passes by reference
   …
   if (allGonePearShaped) longjmp(v, reason);
   …
}

void wrapper () {
   jmp_buf v;
   // set-up
   int reason = setjmp(v);
   if (reason) {
      // something went badly wrong in readMyFileFormat() - fix it up
   }
   readMyFileFormat (v);
}

There is something that I want you to notice here: by its nature, by design, this mechanism, when it signals a failure, not only signals the nature of the failure but returns control upstream of the failure point, rolling back execution so that readMyFileFormat() can have another shot. The error-handling code describes what needs to be done to return the system to a condition in which readMyFileFormat() has a clear run (if possible). There’s a choice here, just as there is with the watchdog interrupt: try again or bail out.
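Here’s that choice made concrete – a minimal, self-contained sketch (the fault flag and the three-attempt limit are invented for illustration) in which wrapper() patches things up and retries, and only bails out when it runs out of patience:

#include <csetjmp>
#include <cstdio>

static jmp_buf recovery;
static bool allGonePearShaped;     // stands in for real fault detection

void readMyFileFormat() {
   if (allGonePearShaped) std::longjmp(recovery, 1); // signal the failure upstream
   std::puts("file read cleanly");
}

void wrapper() {
   volatile int attempts = 0;      // volatile: keeps its value across the longjmp
   if (setjmp(recovery)) {         // we land back here after every longjmp()...
      if (++attempts >= 3) {       // ...and choose: try again, or bail out
         std::puts("giving up");
         return;
      }
      allGonePearShaped = false;   // 'fix what's wrong' (illustrative)
   }
   readMyFileFormat();             // a clear run, rolled back upstream of the fault
}

int main() {
   allGonePearShaped = true;       // simulate a fault on the first attempt
   wrapper();
}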

Incidentally, this mechanism is even more elegant (and safe) in old versions of Pascal. Look at this:

procedure wrapper;
label v;

   procedure readMyFileFormat;
   begin
      if allGonePearShaped then goto v;
   end;

begin
   {set-up}
v:
   if cleanUpRequired then begin
      {clean-up code}
   end;
   readMyFileFormat;
end;

Again, Pascal gives you the choice to resume upstream of the cleanup, or downstream.

Now, I can hear what you’re saying: “Arrgh, a goto! You barbarian!”, or “You’ve violated the encapsulation, you heathen!”, or “You can’t just discard the stack like that, you imbecile! How do the objects get destroyed?” But don’t complain too loudly. The only information that needs to pass from wrapper() to readMyFileFormat() is the resume vector, and the only information that needs to pass back again is the nature of the error. Within the terms of these two programming languages (neither of which has objects, so neither of which has destructors to call), it’s actually a very clean mechanism.

The first time I saw a modern-style exception, it was called “posit”,[v] and was actually an assertion coupled to a goto. Inside a posit block (posit means “let it be supposed”), there was a program block, followed by one or more denial blocks, each of which was branched to by a deny statement. It looked like this:

posit canReadFile
   if not readMyFileFormat () deny canReadFile:didntDoIt
   …
   { goto end of posit is implied }
denial didntDoIt:
   { clean up code here }
end posit canReadFile

Posit blocks could be nested. Furthermore, posit labels could be passed as parameters, so inter-procedure jumps were possible. It was very similar to the second longjmp() mechanism we saw earlier:

posit canReadFile
   readMyFileFormat (canReadFile:didntDoIt);
denial didntDoIt:
   { clean up code here }
end posit canReadFile

Unlike the longjmp() mechanism, posit achieved a key objective: it took error-handling out of the main program flow. (Trouble was, of course, that by separating the recovery code from the main flow, it made the flow clearer but obscured the error handling!) But it set one lasting precedent: the recovery vector was downstream of the fault. We’ll explore the consequences of that in a moment.

Superficially, VB’s error handling mechanism[vi] looks much like this: on error goto label transfers control to a special error-handling section of the subroutine. But actually, used properly (and, I’ll admit, frequently it wasn’t), it’s much more like an interrupt: after an on error goto, resume returns control to the start of the statement that caused the error, and resume next returns it to the following statement. It’s only if the subroutine ends without a resume that the main process is ended prematurely.

Eiffel, the language that introduced the world to contracts, has an exception mechanism[vii] broadly similar to VB’s. It says: there is only one way for a procedure to operate correctly, and that’s by starting at the top and running to the end without any malfunction. If a malfunction happens, then the exception handler for that scope (it’s called rescue) is called. And that handler can do only one of two things: it can do whatever patch-ups it wants and then restart the procedure from the beginning (retry), or else it falls off the bottom, which triggers an exception in the containing scope. That’s it: the procedure either works, or it doesn’t. It never destroys scope until the scope is unrecoverable. It’s not quite the ‘asynchronous procedure call’ that an interrupt tries to be, and it doesn’t have the fine-grained control that VB does, but it does at least give the procedure a second chance.
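Eiffel’s shape is easy to mimic, at routine granularity, in C++. In this hedged sketch (the fault flag, the fix-up routine, and the three-attempt limit are all invented), the routine either runs cleanly from top to bottom, or its ‘rescue’ code patches things up and retries from the top; if the rescue gives up, the failure propagates to the containing scope:

#include <cstdio>
#include <stdexcept>

static bool broken = true;                // simulated fault state (invented)

void patchThingsUp() { broken = false; }  // stands in for the rescue clause's fix-up
void doTheWork() {
   if (broken) throw std::runtime_error("all gone pear-shaped");
   std::puts("ran cleanly, top to bottom");
}

void eiffelStyleRoutine() {
   for (int attempt = 0; ; ++attempt) {
      try {
         doTheWork();                     // the only correct outcome: a clean run
         return;
      } catch (const std::exception&) {   // the 'rescue' clause
         if (attempt >= 2) throw;         // rescue falls off the bottom: fail upwards
         patchThingsUp();                 // otherwise patch up and 'retry' from the top
      }
   }
}

int main() { eiffelStyleRoutine(); }

Note, though, what the emulation can’t hide: by the time the catch block runs, everything doTheWork() had on its stack is already gone.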

Java’s exception mechanism is a generalisation of the posit: every exception that is thrown (they’re exceptions now, not interrupt vectors or posits) is fielded within the same procedure. What’s new is that if you don’t provide a handler (or if you re-throw), the exception is automagically passed to the immediately containing scope. But you can only do so if you’ve declared, in the function’s prototype, using a throws clause, that the exception is passed out. This means that Java can use the standard procedure return mechanism to unwind the stack. (Incidentally, this is the reason why you need a throws declaration – the set of exceptions a procedure throws is actually a part of the procedure’s calling protocol – exactly as if the exception handler’s label were a passed-in parameter! Sound familiar?)
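To see why it should sound familiar, make the analogy literal. In this sketch (invented names throughout; it illustrates the analogy, not any real API), the handler is passed in as an explicit parameter, exactly as the safer longjmp() version passed in its jump vector, and the set of errors the procedure can raise becomes a visible part of its signature:

#include <cstdio>
#include <functional>

// The handler is an explicit parameter: a 'throws clause' made visible.
void readMyFileFormat(const std::function<void(const char*)>& onError) {
   bool allGonePearShaped = true;         // simulated fault
   if (allGonePearShaped) {
      onError("file went pear-shaped");   // 'throw' to the handler we were given
      return;
   }
}

int main() {
   readMyFileFormat([](const char* reason) {
      std::printf("recovering: %s\n", reason); // the 'catch', supplied up front
   });
}

Notice, in passing, that because the handler here is called as an ordinary subroutine, the erroring procedure’s own stack frame is still alive while it runs – hold that thought.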

C++ and C# don’t do it quite like that. Their exception mechanism involves calling a debugger-like supervisor (just like the old watchdog interrupt handler) which scans backwards through the stack, looking for a frame that’s prepared to catch the exception. As it goes, it deconstructs the stack, frame by frame (and C++ has the additional burden that it’s got to call the destructors of all the automatic variables as it goes).
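You can watch that supervisor at work with a couple of tracer objects – a minimal, self-contained sketch whose output is precisely the frame-by-frame demolition just described:

#include <cstdio>
#include <stdexcept>

// Prints when the unwinding supervisor demolishes the frame holding it.
struct Tracer {
   const char* name;
   explicit Tracer(const char* n) : name(n) {}
   ~Tracer() { std::printf("destroying %s during unwind\n", name); }
};

void inner()  { Tracer t("inner's local");  throw std::runtime_error("boom"); }
void middle() { Tracer t("middle's local"); inner(); }

int main() {
   try {
      middle();
   } catch (const std::exception& e) {
      // by the time we get here, both frames and their locals are long gone
      std::printf("caught: %s\n", e.what());
   }
}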

So far, the modern exception mechanisms look largely like posits, except for C++/C#’s ability to scan through the stack. But actually, they’re significantly inferior to both posits and longjmp()s, as the following example shows:

Imagine: you write a procedure readMyFileFormat(), and it calls file.open(), and it calls… and then, months after you’ve released the code, without any warning at all, a SocketTimeoutException bubbles past you. Where did that come from? (Actually, it came from an update to the remote file access code, after your program was finished, but you don’t know that yet.) What on earth are you supposed to do about it? Basically, this wasn’t your fault, you can’t fix it, and this signal shouldn’t be anywhere near your code.
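In C++ terms, the trap looks something like this (the exception types and functions are invented for illustration): readMyFileFormat() guards against everything it knew about when it was written, and the newcomer sails straight past it:

#include <cstdio>
#include <stdexcept>

// Invented exception types, for illustration only.
struct file_error     : std::runtime_error { using std::runtime_error::runtime_error; };
struct socket_timeout : std::runtime_error { using std::runtime_error::runtime_error; };

// Months after release, an updated remote-file layer starts throwing this...
void openRemoteFile() { throw socket_timeout("remote peer went quiet"); }

void readMyFileFormat() {
   try {
      openRemoteFile();
   } catch (const file_error&) {  // ...but we only guarded against the faults
      std::puts("recovering");    // we knew about when we wrote this
   }
}

int main() {
   try { readMyFileFormat(); }
   catch (const socket_timeout& e) {
      // the 'impossible' exception bubbles past readMyFileFormat() untouched
      std::printf("leaked all the way out: %s\n", e.what());
   }
}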

Incidentally, Bob Martin claims[viii] that the debate between checked exceptions (Java style) and unchecked exceptions (C# style) is over, and unchecked is better precisely because it doesn’t violate encapsulation, whereas laboriously declaring every thrown exception does. Bob: it’s not over. This one case demonstrates that unguarded exceptions drive a coach and horses through encapsulation. At least guarded exceptions can be statically checked.

Back to exceptions. What’s wrong with this picture? Two things:

  • The exception handler is always downstream of the fault – the catch block follows the try block, every time. That means, to a first approximation, you have to live with the fault that caused the exception – unlike the watchdog interrupt or the longjmp() mechanism (or even VB’s crude on error goto), you don’t get another bite of the cherry. All the exception handler knows is what kind of fault has happened, not where or why. It’s not possible to fix something up (say, ask the user to re-connect the network cable) and then resume, because there’s no resume frame: we’ve already started unwinding the stack before we’ve started to fix what’s wrong. (And that’s true even where the catch is in the same procedure as the throw, because catch comes after try, and we can’t even tell whether the exception came from the same procedure or something that it called.)
  • Secondly, there may not be an exception handler that can do any good at all! In order to call longjmp(), somebody, somewhere, must have setjmp()ed the jump vector. That is, whoever has to field the longjmp() must give permission for somebody else to throw it. Similarly, in order to deny a posit, somebody must have posited the posit first. That’s not true with exceptions: the thing that throws the exception doesn’t need to care (in fact, can’t even know) whether anything else is going to catch it (even Java has unguarded exceptions). It just throws the exception, blindly, and whether it’s caught or not is not the thrower’s responsibility. It’s like leaping off a building without checking whether there’s a safety net. Sometimes it may be appropriate, if meaningful help is not forthcoming, to attempt to carry on in some ‘limp-home’ mode. But once this type of exception is thrown, that’s it: you don’t get a second chance.

Taking both these into consideration, it seems to me that we’ve got less encapsulation, less resilience, and less reliability than we had in the days of interrupts and longjmp(). Today’s throw is little more than goto hopeForTheBest. Even VB did it better!

The problem with all these exception models – every one – is that they have all fundamentally misunderstood the job they’re doing. The purpose of exceptions is to keep a program running to the best of its ability in the face of unexpected emergencies. For an exception model to have any validity, it should detect a malfunction at the earliest possible point, and correct it so that the program can keep going as planned – just as Eiffel asserts. None of these mechanisms does that.

To see why not, let’s revisit the file-opening example. The remote file system object lost its network connection. What’s it supposed to do about that? Answer: it can’t possibly say, because it has no idea why or by whom it was called. So, it has to refer to its calling context. In every one of these models, in the course of referring, it forgets its own identity; C#/Java exceptions start by unwinding the stack, and even though VB/Eiffel exceptions get another shot, they have to be self-reliant when they take it. Only two mechanisms are immune from this: longjmp(), because it rolls back time to before the malfunction happened, and the simple, crude interrupt, because interrupts are always called like procedures.

Eiffel and longjmp() do have one thing right: there should be two ways out of an exception handler – take another shot, or bail out. It would be perfectly easy to add this to our modern exception mechanism: call the exception handler, as a subroutine, from the point where the exception happens. To return to the exceptioning point (à la VB’s resume), simply resume. Only if the exception handler runs off the bottom should the stack be unwound. Like this:

try {
   readMyFileFormat();
} catch (SocketTimeoutException e) { // called as a subroutine of RemoteFile::readlock()
   if (messageBoxYesNo ("Can you plug your network cable back in?"))
      resume:retry; // tell the throwing process to retry
   alert ("Then I can't open the file");
   resume:repair; // optional: tell the throwing process to stop trying, and return a null object.
} // if no resume in the body, stack unwinds at this point.

Under this regime, the default exception handler would do nothing except resume:null, and the exceptioning routine would have to sort itself out as best it can.

The advantages of this are: First, an erroring procedure can ask a parent context what it should do about its error, and then act autonomously on that advice. If there’s no meaningful answer it can revert, on its own account, into some trivial (invariant-preserving) state. Second, an error state can be recovered right then and there – a procedure which was otherwise doomed to delinquency can now be rescued and can carry on like any other good citizen. Finally, because in all but the most severe conditions even an erroring procedure will exit through the normal procedure return mechanism, there’s no need to worry about exception safety: most exceptions no longer cause any abnormal exits. In short, even in the face of abnormalities and malfunctions, the program continues to execute just as it was designed to.

Though it was created with the best of intentions, the modern exception has lost its way. If you’re relying on exceptions to get your program out of difficulty, you’re probably going to be disappointed, just the same as if you routinely goto your way out of trouble. Though the ancestors of exceptions – interrupts and longjmp()s – worked just fine in their own terms, exceptions have drifted away from the problem they’re supposed to be solving – it’s evolution in reverse.  What you really want is early detection of errors, and graceful degradation, so that you can keep your program running the way it was designed to.  But what exceptions now do, what they’re best for, is emergency tear-downs, which is very rarely what you want. In consequence, filling your code with exception handlers won’t make it any more resilient; it’s more likely to embrittle it.

In my experience, code that’s full of catch blocks is today no more reliable than code that isn’t, and frequently is less so because (a) the exception handlers can turn minor malfunctions into disasters (how tough is it to write consistently ‘exception-safe’ code?), and (b) burying a program in catch blocks is often used as an alternative to proper specification and testing – to coin a phrase, exceptions are the first resort of the cowboy[ix]. If what you’re looking for is reliable, resilient, failsafe code, there are actually better ways to case-harden a program than exceptions. Until they put a return into handlers, exceptions will continue to be, by any objective standard, unfit for purpose.


Footnotes

[v] Richard Pickard, “Programming Structures of the Fourth Kind”, .EXE 5(2), July 1990.

[vii] http://docs.eiffel.com/book/method/et-design-contract-tm-assertions-and-exceptions#Exception_handling. See also Bertrand Meyer, Eiffel: The Language, Prentice Hall, 1992.

[viii] Robert Martin et al., Clean Code, Prentice Hall, 2009, pp. 106–107.

[ix] “In Dr. Johnson’s famous dictionary patriotism is defined as the last resort of a scoundrel. With all due respect to an enlightened but inferior lexicographer, I beg to submit that it is the first.” —Ambrose Bierce, The Devil’s Dictionary, entry for patriotism, in The Collected Writings of Ambrose Bierce, p. 323 (1946, reprinted 1973).


7 responses to “Exceptions Considered Harmful or: throw is just goto in drag”

  1. You’ve painted quite a horror story, but if exceptions are properly used, they increase, not decrease, reliability.

    Following a set of rules that fit on one slide results in code that is easier to read and write with no performance trade-off and 100% reliable in the face of exceptions. Check it out:

    http://exceptionsafecode.com

    • Indeed: anything used properly will increase, not decrease, reliability (yes, even checking return values, or longjmp()). But, seriously, in the real world (and especially in C++, which seems to be your focus), how often have you seen it done wrong? C++ doesn’t even provide the basic tools you’d need to check exception-safety (such as naturally smart pointers, guarded exceptions, and so on). The net result is that most programs (whose programmers haven’t had the benefit of your august advice) have exception-handling strategies that make bad situations worse. You wouldn’t disagree with that, would you?

      You note that your rules fit on one slide (not sure which of the 109 slides in your presentation you mean – 39, 40, or 41 – but OK 🙂 ). By your own admission, it’s really hard to adhere to them, and actually, it’s really hard even to know whether you’ve adhered to them in any formal or testable way. Further: by their nature, it’s impossible to test exception mechanisms in the same way that you test the primary flow, because, by their nature, exceptions are exceptional. Correct exception handling demands exceptional discipline.

      But, this is all by-the-by. Even if we follow your rules, and even if we can construct properly-failsafed exceptions, my question is: are the exceptions actually doing a useful job? Do you feel that the most sensible thing to do, when faced with a fault condition, is invariably to tear down the stack, destroy all the intervening contexts, and pick up the pieces downstream? Do you really not believe that, sometimes, there is something a bit more constructive that can be done?

  2. I think the trick is not to invent so many error states in the first place. Nowadays, if I initiate a program myself, I’ll probably get through it without this issue ever even arising.

    Let’s start with the proposed example: somebody pulled a plug in the middle of a network transfer. By what logic is this considered an error? You probably don’t even know that the plug was pulled – it might just be lag. Whether the plug was pulled or not, if it’s reinserted within a reasonable time period, then TCP will simply continue where it left off. If you really care about that plug being inserted all the time, you probably have a completely different mechanism in place to make sirens wail at anybody who pulls it out.

    I once worked for a company who were shifting huge amounts of rapidly changing data over a very large network. They’d made it harder for themselves by dividing the core functions among several separate computers when one would have done, and ironically, they thought they’d done so for performance and reliability reasons. This meant that all these core machines had to duplicate each other’s state and stay in sync. This syncing accounted for most of the traffic.

    Because they wanted to feel that this sync was working properly, at the first sign of lag, they’d panic and terminate all the connections and try to rebuild them. Having rebuilt them, they figured they had to re-sync the entire mass of data. That was where all the traffic was coming from. The final irony is that all this data was getting updated every 30 seconds, which is considerably faster than this re-sync took, so they’d have got back in sync faster if they’d done nothing at all. They were basically stitching themselves up in a self-fulfilling prophecy.

    I told them to use UDP. TCP can tell a sender if the receiver is not responding, but what was the sender going to do about it anyway? Switch to a backup server? Well in that case the backup server should have been getting informed of the data all along, perhaps with multicast. It wouldn’t be up to the sender to decide if the first receiver was behaving itself or not. That would have to be a sanity check on the receiver’s outputs, perhaps by comparing with the other one. The sender is certainly not supposed to snub the receiver based on ill-informed suspicions.

    They took my advice, and guess what they learned from it? That there never was an evil little boy hiding in the cabinet pulling cables out for fun. The world’s network infrastructure was not simply having days off without warning either. It was a figment of their imagination that they broke their system over.

    But surely there are such things as error states. Can we think of one?

    Suppose a graphics library gets a request to draw a pixel off screen. Be creative! Maybe you want wrap-around, maybe you want clip, but why call it an error? Somebody might find it useful. I once accelerated an embedded graphics library by a factor of 60, simply by deleting the range checking that was happening at every single level of the stack. Those are IFs! An IF is a pipeline break on all but the fanciest processors. It never would have done any harm to tell the hardware to draw offscreen anyway.

    Suppose the user presses the wrong button. Well what was the button doing there in the first place? Can’t you think of anything useful for it to mean?

    What about running out of memory? We’re all supposed to check our mallocs, aren’t we? Well, I try very hard not to call malloc at all. I prefer to decide at design time how much memory I want to spend on what, and I lay it out in advance. I also think about where to put it to minimise cache misses, and I might have several allocation mechanisms tailored to the particular things I need to allocate and when they’ll need to be accessed.

    That works on a box that won’t be doing anything else, but obviously if you’re writing an app for a multi-purpose machine you’ll want to malloc on demand. But in that case, the machine probably has a paging mechanism and an address space that could label every atom in the galaxy, so malloc is simply never going to say No. Imaginary ghouls again.

    I often see people write chains of subsystem initialisation functions. They check the return value of each before proceeding to the next one, and ruminate about whether some kind of exception mechanism would be better. This is the funniest example of all, because at initialisation time, you’re starting from a known state, so it’ll either always work or never work. I’d never need such a function anyway because I’d statically initialise whatever it was. If my program needs to initialise itself by talking to another computer, I just call that main().

    There are of course some genuine error states in the real world. They’re called bugs. If you show me an exception mechanism that can detect, diagnose and fix the bug in the field, especially without unwinding the stack, I’m interested. In the meantime, I don’t need any of this stuff.

    • Much to respond to, here, but let’s start with your very first point: “somebody pulled a plug in the middle of a network transfer. By what logic is this considered an error?”

      Well, here’s some suitable logic: you’re talking to a remote file, and now you can’t carry on talking to it. That would seem to be a bit erroneous.

      But your question, and my response, are missing the point of exceptions. Exceptions don’t represent error states. They actually represent things behaving exactly as they should (a network connection might go down, and it’s a perfectly valid function of the network-talker component to detect that). But as a result, it now can’t honour requests made of it (such as to read the next block of bytes). It’s not an error condition – if anything, it’s a contract violation – but it’s not what the client program is really about (the client program is thinking in terms of files, not network protocols). So, it’s not an error state, but it is an abnormal or exceptional condition. Hence the name: exception.

      The kind of approach you describe is what is used in highly-constrained or high-reliability programming – PIC programming has no need of exceptions over and above a simple watchdog (and VHDL doesn’t need even that), and AIUI exceptions, interrupts, and mallocs are banned from mil-spec programming (for exactly the reasons you outline: in a finite processing model, malloc may fail). And yet, every system has variances and interactions: even in these highly-specialised cases, some kind of abnormal-case processing is frequently necessary, and has to be hand-crafted.

      And so, modern languages, even languages like C++ which are intended for system-level programming, come with memory allocation, object construction and destruction, and exceptions built in. It just isn’t practical, in today’s programming ecosystem, to deny that these things have any utility or validity at all! (Or maybe it is practical, but I think it would take a much longer argument to establish it than you’ve given here!)

      You’re right about one thing though: with fewer abnormal states, there is much less need for exceptions. I’m planning a post about precisely that.

  3. This is one of the better discussions about exceptions I’ve encountered. In the large I agree with the author (Jules), rather than the first commenter (John).

    I also like the comments from Adrian, which move closer to the core problem of defining, or rather understanding, what an ‘error state’ is.

    Often missing in the exception debate is the context. What counts as an error is context-dependent, and part of that context is the domain being modeled or otherwise subjected to software. The most serious problem (but not the only one) with exception code is the misapplication of the exception mechanism to handle domain logic.

    Exceptions should only apply to situations that violate the abstraction which is being implemented. Given the network outage example, if the domain is to implement TCP or a similar protocol, the abstraction is one of a reliable transport, and it is ‘normal’ for this abstraction to be concerned with when the physical link ‘breaks’. That’s the whole point of TCP, so re-transmitting a packet is not ever an ‘exceptional’ situation.

    On the other hand, if some implementation was trying to present a remote resource as though it were local, perhaps network problems could be regarded as exceptions, forgiving for a moment the folly of pretending that remote and local can be treated alike.

    All too often, developers, including library and API developers, impose their subjective favored outcomes or use cases onto the abstraction, and regard others as ‘errors’ or ‘exceptions’. This is what is truly insidious, and it is a type of intellectual laziness which is increasingly common in an environment which regards programmers with 2 or 3 years’ experience as ‘senior’.

    A widely seen example of exception code involves some kind of authentication object or service which treats an incorrect password as an exception. This is so deeply and utterly wrong that anyone who doesn’t immediately get it is simply not qualified to be programming.

    Bad credentials during authentication are not an exception — they are one of the primary responsibilities of the service to handle correctly. In fact, handling them is probably more important than handling correct credentials: blocking the unauthorized is precisely why anyone would even want an authentication service in the first place.

    Basically, there are very, very few situations where the exception mechanism is appropriate. Like ‘goto’ itself, it is a bad code smell — not always wrong, but it usually deserves scrutiny and skepticism.

  4. Pingback: If statements considered harmful or Goto’s evil twin or How to achieve coding happiness using null objects. | de Programandis

  5. I’ve always had trouble deciding what a good model for handling exceptions would be. Should I handle the exception immediately, should I handle it at a higher level, should I not handle it at all, or “it depends”, which is not very helpful as a pattern.

    Part of the problem is, again, context. If I do not handle an exception immediately, and more specifically, if I do not wrap every function call – explicit or implicit, in the case of operator overloading and copy constructors and what-not – with a try/catch block, how do I know where the exception came from? An “invalid argument” exception is a great example. Did I just detect a programming error; i.e. fail an assertion, in effect, by violating a function’s contract, or did the user type in something that I had no way to know without calling the function would be an error? Which argument? What are the valid values? If I don’t wrap each line of code with try/catch, where did it come from?

    Checking an error result from a function eliminates most of those issues. Yes, the code becomes a series of “one line of desired behavior, ten lines of handling undesired behavior”, but I know it’s been handled and handled correctly for that situation.

    A program where every line of code could be aborted at any point inside it is a program that is very difficult to make behave. There are an infinity of gotos coming out of my code. I thought exceptions were supposed to *reduce* the cost of writing correct code, but the analysis required to make sure the code’s correct is much higher.

    Forget about maintaining this level of analysis in the face of schedule pressures, customer bug reports, and just general code maintenance done by multiple programmers. If a detailed analysis has to be made of every piece of code every time it’s touched, why not just code in assembly language?

    I worry about these things because the software I work on is for an “appliance”, if you will. The product is not used internally. It is placed anywhere in the world, with no control over who is using it and how, and without any hope of experienced personnel nearby or immediately reachable by phone for support. It is expected to simply work. Like a brick. If you pick up a brick, you know for certain it will always be “heavy”. If you throw a brick, you know for certain it will always “break something” it hits.

    Given the difficulties of writing exception-safe code; i.e. that it seems to be something beyond the abilities of even smart human programmers (not that they are flawed, they are *human* and have capacity, complexity, and other kinds of limitations), every time an exception is thrown, it threatens the ability of this product to continue to function reliably. I can tell you from the problems reported by the owners of this product that it fails to function reliably more often than they or we want it to, and when it does fail, it is frequently impossible to figure out why. Exceptions are not making this easier.

    On the subject of interrupts and history, the PL/I programming language designed in 1964 by IBM handled “conditions” (their terminology for exceptions) by an ON statement. This worked almost exactly like an interrupt. The ON statement specified a statement or block to execute when the given condition was raised. The “ON unit” could execute (nearly) any statement in the language, including non-local GOTOs. Initially, every condition was assigned to the system action for that condition. ON statements let you change that.

    There were a number of oddities, though. ON changed condition handling at runtime, not compile time. It acted like a stack of handlers for the condition, of which only the most recent had any effect. An ON statement ended when another ON for the same condition was executed or the block in which the ON statement was executed ended. There were conditions for things that were not, in my view, exceptions; all files have an end of file, so it’s not an exceptional condition. Some of the system actions were counterproductive; reading past end of file closed the file, but reading a file that was not opened would open the file, causing an infinite loop in naive programs. Sometimes executing past the end of the ON block would behave like RESUME in BASIC, sometimes like RESUME NEXT, depending on the condition.

    In the environment I worked in, most uses of ON simply handled ENDFILE conditions: ON ENDFILE(CUSTOMERS) DO; EOF = 1; END; where EOF was a variable that could be tested by a subsequent loop. But ON could be used for much more sophisticated recovery and patch-up if you wanted. There were variables such as ONCHAR which could be used in CONVERSION conditions to modify the character that couldn’t be converted (during a conversion from string to numeric data) so that the conversion could be retried and succeed, and ONLOC which would tell you in which procedure entry point the condition was raised.

    I’ve read a lot of complaints over the years about the ON statement. Many computer scientists complained it was too limited (part of its appeal to me) and a mish-mash of ideas (well, yes, it was that).

    Thanks for your article, and thanks for listening!
