Errors considered harmful

Python has a place for “meandering”: the Python-ideas mailing list (Mailman 3, python.org)
Typically, ideas are thrown out there, brutally dissected, and – if they survive the process – written up as a PEP. Also, these days, PEPs typically require a “sponsor”.

I keep having to think about why I don’t see this as an issue. Maybe past experience has taught me to make minimal assumptions about success and failure (and now exceptions, I suppose). Maybe it’s because I try to test almost as much for failure as for success. Maybe my problems just don’t look like yours. I just don’t know.

There’s also a distinction between two senses of “deterministic”: leaves no choicepoints, versus never expected to fail. In the latter case, I might even choose to wrap it in another predicate with a “base” clause which turns it into a “noisy” failure if that’s an unexpected event; sort of equivalent to an assertion that failure is not an option.
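Something like this (a sketch; must_translate/2 and translate/2 are made-up names):

must_translate(In, Out) :-
    translate(In, Out), !.
must_translate(In, _) :-
    format(user_error, "translate/2 unexpectedly failed on ~p~n", [In]),
    fail.    % or throw(...) if an unexpected failure should be a hard error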

Now this would scare me. When I get a predicate that’s larger than say ten clauses, I look for a way of breaking it down into smaller independent “develop&test” chunks. And doing so ensures that I know how to add extensions at some future date. I don’t quite understand why that doesn’t apply to nodes in an AST, if we’re talking about the output of a parser. Surely the parsed language would have a subset that is testable as an independent unit? But I’m talking in hypotheticals here so I shouldn’t generalize.

Examples from clpBNR:

  • narrowing_op/4, basic unit of relational interval arithmetic, 35 clauses for 24 operations, discontiguous, ~600 loc with “helpers”, deterministic (may leave incidental choicepoints), can succeed or fail by design
  • partial_derivative/3, symbolic partial derivative, 17 clauses, ~130 loc with “helpers”, deterministic, can succeed or fail by design
  • simplify/2, reduce/simplify arithmetic expressions, 10 clauses, deterministic, always succeeds (~375 loc to implement all of simplify/2)

simplify/2 looks “simple” but is anything but: several levels and recursive calls to simplify to handle subexpressions and function arguments, for less-than-perfect results. But it will always succeed (by design) even if it does nothing, i.e., the output is a reconstructed copy of the input. (I do keep thinking there must be an easier way.)
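Just to illustrate the “always succeeds” shape (a toy, nothing like the real clpBNR clauses): a few rewrite rules plus a catch-all that echoes its input.

simp(X + 0, X) :- !.
simp(0 + X, X) :- !.
simp(X * 1, X) :- !.
simp(1 * X, X) :- !.
simp(X, X).    % catch-all: no rule applies, output is just a copy of the input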

Anyway, the point is I can’t imagine doing these kinds of predicates as all-or-nothing. I’d try very hard to break them down. If that really wasn’t an option, I’d wrap the predicate and provide a base clause for exceptional circumstances, since there are bound to be lots of these early in testing.

Since I don’t have a strong functional programming background, I’ve never found much use for these; perhaps the occasional maplist. For me, they confuse more than they help; processing a list to produce another list is the least of my problems. But it is a matter of style/preference.

When an HTTP client makes a request to a server, it can offer MIME types that generalize the acceptable types as MIME alternatives. HTTP traffic is not compiled.

Well, if the problem is that code has to deal with unexpected types dynamically, maybe those types could be nondeterministic: sort of backtracking and trying out some type conversions as a built-in error recovery (and giving out warnings). And if the built-in recovery fails, then throw an exception.

Don’t get me started … oops, too late!

I think I’ve mentioned before that I’ve never read the standard since they charge me for the privilege of reading it (counterproductive IMO). So to me it’s largely irrelevant, and probably becoming more so as time goes by. What is relevant is what’s implemented (the living standard). I suppose the standard is relevant to Prolog implementors, so that dialects don’t diverge too much.

I did implement some of the arithmetic evaluation in SWIP when rationals became first-class numbers and took some care to ensure that mixed mode arithmetic produces as precise values as possible. But of course ISO probably doesn’t mention rational numbers.

Unbounded ints definitely cannot be considered a subset of floats. In fact, floats are a subset of rationals.

For the record:

I’ve done a quick experiment on using goal expansion for length/2 and arg/3:

goal_expansion( length(List,L), 
                ((var(List) -> true ; is_list(List)), length(List,L))
              ).
              
goal_expansion( arg(Arg,Term,Value), 
                ((compound(Term),((integer(Arg),Arg>=0);(var(Arg)))), arg(Arg,Term,Value))
              ).

The objective of the goal expansion is to precede the builtin with a guard (no unification) which will fail rather than generate an exception. (It’s interesting that the same approach could be used to turn builtin failures into exceptions for those concerned about silent failure.)
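For example, mirroring the pattern above (illustrative only; unexpected_failure/1 is not a standard error term):

goal_expansion( length(List,L),
                (  length(List,L)
                -> true
                ;  throw(error(unexpected_failure(length(List,L)),_))
                )
              ).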

Both of these cases require guards with unfortunate performance impacts. length/2 requires is_list/1 since malformed lists cause exceptions. is_list incurs the cost of an additional builtin call whose performance is O(L) since it has to scan to the end of the list. (Note that length is already O(L).)

arg/3 requires an arithmetic compare (Arg>=0) which is a builtin call unless the flag optimise is true, which I suspect is rarely the case, but I might be wrong.
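For reference, the test predicates are just thin wrappers around the calls being measured, roughly (with the goal_expansion clauses loaded before this module is compiled for the guarded runs):

:- module(test_exp, []).

test_null(_,_,_) :- true.                  % base case: loop + call overhead only
test_length(List,L) :- length(List,L).
test_arg(N,Term,A) :- arg(N,Term,A).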

The difference in performance can be significant. Test goals which just call the builtin (test_null is the base case, which just calls true), without expansion:

?- time((between(1,1000000,_),test_exp:test_null(_,_,_),fail)).
% 2,000,000 inferences, 0.088 CPU in 0.091 seconds (97% CPU, 22748988 Lips)
false.

?- time((between(1,1000000,_),test_exp:test_length([1,2,3],_),fail)).
% 4,000,000 inferences, 0.203 CPU in 0.207 seconds (98% CPU, 19719004 Lips)
false.

?- time((between(1,1000000,_),test_exp:test_arg(1,f(1),A),fail)).
% 3,000,000 inferences, 0.125 CPU in 0.126 seconds (99% CPU, 24023639 Lips)
false.

Same test with guarded calls:

?- time((between(1,1000000,_),test_exp:test_null(_,_,_),fail)).
% 2,000,000 inferences, 0.080 CPU in 0.080 seconds (99% CPU, 25150905 Lips)
false.

?- time((between(1,1000000,_),test_exp:test_length([1,2,3],_),fail)).
% 5,000,000 inferences, 0.245 CPU in 0.248 seconds (99% CPU, 20433100 Lips)
false.

?- time((between(1,1000000,_),test_exp:test_arg(1,f(1),A),fail)).
% 5,000,000 inferences, 0.211 CPU in 0.213 seconds (99% CPU, 23747103 Lips)
false.

So length is about 35% slower on a fairly short list; it will get somewhat worse with longer lists. arg is over 3 times slower. I would be happy if anyone could point out any flaws in this implementation that would reduce the costs (size and execution time).

So I’m not going to pursue this further. It’s too high a price to pay to guard against the general case when some or all of the guard is redundant in the local context. And it’s certainly redundant in existing code which is already guarded, due to the process of removing exception conditions from running code. (What’s a little irksome is that these are exactly the same checks the builtin already does to detect the exception condition.)

Not really the standard; read the first paragraph. A short excerpt:

The information given here is only a sketch; anyone needing definitive details is urged to consult the ISO documents themselves.

Fair point, here’s the new version:

goal_expansion( arg(Arg,Term,Value), 
                ((compound(Term),(var(Arg) -> true ; (integer(Arg),Arg>=0))), arg(Arg,Term,Value))
              ).

Retesting:

?- time((between(1,1000000,_),test_exp:test_arg(1,f(1),A),fail)).
% 4,000,000 inferences, 0.195 CPU in 0.196 seconds (99% CPU, 20558788 Lips)
false.

So now just a little over a factor of two slower (on this argument pattern).

And this ISO description for when it succeeds is just fine. Doesn’t this description pretty much describe what all the dialects do? But when it doesn’t succeed, I’d like it to (optionally) fail as efficiently as possible, not generate an exception. As you point out, ISO doesn’t seem to have much to say about that, one way or the other.

Conditional compilation is now a dead end for me - too complicated and too expensive - and even less likely to be portable.

It seems we’re just rehashing the same old arguments. So let me repeat Richard O’Keefe’s point regarding type errors (some paraphrasing this time):

The third kind of error is when a predicate is given arguments of the wrong type.

If you regard a built-in predicate as an implicit form of a possibly infinite table, it is clear that logically the response to such questions should be simple failure. If, for example, we ask ?- plus(X,a,Y), it is clear that there are no substitutions for X or Y which could ever make this true, and the appropriate response is simply to fail.

However, for debugging it may be more helpful to forget about Prolog’s “totally defined semantics” and to produce an error report. … The exact form of the error message is not specified, as a Prolog program should have no way of getting its hands on such a message …

An implementation should be able to provide both forms of behaviour at the programmer’s option. This could be a global flag set by e.g. flag(type_fail,_,fail) or flag(type_fail,_,error).

I just happen to agree with him, ISO standard and existing implementations notwithstanding.
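To make the intent concrete, a user-level sketch (the flag and the predicate names are mine; create_prolog_flag/3 is SWI-specific, and nothing like this exists as a builtin today):

:- create_prolog_flag(type_fail, error, []).

int_or_var(X) :- ( var(X) -> true ; integer(X) ).

soft_plus(X, Y, Z) :-
    (   int_or_var(X), int_or_var(Y), int_or_var(Z)
    ->  plus(X, Y, Z)
    ;   current_prolog_flag(type_fail, fail)
    ->  fail                                      % fail quietly on a type violation
    ;   throw(error(type_error(integer, plus(X,Y,Z)), _))
    ).

With the flag set to fail, ?- soft_plus(X, a, Y). simply fails; with the default, it raises a type error.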


It would be interesting to have Richard’s current opinion on that. This must be really old. It seems to predate exception handling, as “a Prolog program should have no way of getting its hands on such a message”. In the old days (C-Prolog, DEC-10 (?)), predicates typically failed while printing some error/warning on invalid input. Quintus was the first system (AFAIK) to introduce exceptions, and Richard was there. Also, in the various discussions concerning ISO and SWI-Prolog he seemed quite keen on exceptions (only disliking the name and argument order of catch/3), and I never heard him (again) about this flag. Yes, I am aware of this view on type errors. It should naturally extend to domain errors. I don’t think I ever heard anyone claiming to extend failure even further.

To me though, it goes in the wrong direction. Sure, adding such a flag is not too hard. As a result, a lot of code will get strange results, varying from failure to wrong answers, when passed data that doesn’t satisfy the type or domain constraints. The failure was expected, but the wrong answer is bad. A failure in a place where the programmer did not expect it (because the called predicate is det, so it succeeds or throws an exception) just doesn’t work in real-life Prolog with negation and cuts. That, combined with the common knowledge that global flags are poor design, is enough reason not to go there.

It surely didn’t have that when I used it :slight_smile: But yes, in addition to printing a message most (all?) errors started the debugger. SWI-Prolog did the same for a long time. I think exceptions came with the evolving ISO standard. The debug_on_error flag is still a remnant from those days, causing the system to check whether an exception is caught or not and, if uncaught, start the debugger while keeping as much as possible of the error context in place.

Not quite sure why this is so hard; perhaps I’m just delusional. For the record, I’m not against exceptions in general. What I am arguing for is exactly what Richard mentions: if a predicate, particularly a built-in, has no chance of ever succeeding given the arguments, then I’d like to at least have the option of having it (efficiently) fail. So the main culprits are type and domain errors, but there may be a few others in the same category.

(Instantiation errors are a different category. IMO it would be nice to support a mechanism whereby these could be mapped to constraints by an application.)
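(A sketch of the kind of thing I mean, using when/2 and succ/2 purely for illustration; soft_succ/2 is a made-up name:)

soft_succ(X, Y) :-
    when(( nonvar(X) ; nonvar(Y) ), succ(X, Y)).   % delay instead of raising an instantiation error

So ?- soft_succ(X, Y), Y = 3. binds X = 2 instead of throwing.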

This might be true (hard to generalize), but I find it hard to believe this is common in working code. It would require using catch expecting to get type/domain violations instead of failures. Furthermore, the handler would have to turn such violations into success to continue the forward execution (so as to generate wrong answers). Does anybody really catch type/domain errors and not fail (or abort)? Now that sounds dubious to me.

Put another way, how does failure produce a wrong answer other than through errors in programming logic? No exception in the world that I know of helps with those.

Now to me this is just wrong. It’s not logic programming, and no builtin predicate should do this. If a programmer decides to do this, that’s his choice, but it’s a questionable one IMO.

I don’t understand why this is relevant to the discussion. Surely negation and cuts are all about (the control of) failure. I do think I’ve done a fair amount of “real-life Prolog” that does work with builtins that naturally fail.

By inference SWIP is poor design; it has numerous such global flags, and no plan that I can see to get rid of them. Now I really don’t believe it is poor design, but it’s too far down the slippery slope to do much about it. Any progress towards a better LP language probably needs a clean sheet of paper.

I haven’t yet seen any strong arguments against my position - perhaps some debate on mechanisms - but I accept it’s not going anywhere, so let’s drop it. I’m going to spend my time trying to define a small “logical” core (minimal set of builtin primitives). Push everything else out to “Prolog” libraries where dialects can implement whatever policies they like. Not sure where this will end up; I’ll find out when I get there.

“Errors in programming logic” is the major concern in software development. Those errors happen in the design phase and in the implementation phase.

Run-time exceptions will catch some of the unexpected conditions caused by errors in design and implementation. Ideally, such errors should be caught by the compiler and the linter, but it is not always obvious how to do that.

But there is already something of a precedent in the “Hacker’s corner” I thought? Doesn’t exception/3 there offer a similar feature, with fail in the last argument?

Long thread! So, bottom line: if I want an efficient failure-oriented is/2, I need to be very careful and check that all the variables being fed into it are bound to numbers. Did I miss a trick?

I guess so. That is no different from just about any Prolog code that you would like to backtrack in case of errors. I think that is fine. There are surely use cases where you want to backtrack on certain errors, but IMO that is better handled using careful and controlled techniques such as type-checking and possibly using a catch/3 at places. Global flags are bad in general and this isn’t really benign. Notably, the negation of an exception is an exception, while the negation of failure is success.
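For concreteness, the type-checking part might look like this (a sketch; safe_is/2 and evaluable_expr/1 are made-up names, and current_arithmetic_function/1 is SWI-specific). It walks the expression and fails, rather than throwing, if anything in it is unbound or not evaluable, so it has the same kind of guard cost discussed earlier:

evaluable_expr(X) :- var(X), !, fail.       % unbound: fail, don't raise
evaluable_expr(X) :- number(X), !.
evaluable_expr(X) :- atom(X), !, current_arithmetic_function(X).   % constants like pi, e
evaluable_expr(X) :-
    compound(X),
    current_arithmetic_function(X),          % principal functor is evaluable
    X =.. [_|Args],
    maplist(evaluable_expr, Args).

safe_is(Result, Expr) :- evaluable_expr(Expr), Result is Expr.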

FYI

GOTO 2018
The Do’s and Don’ts of Error Handling
Joe Armstrong (YouTube)

I’ll venture to chime in with my 2 cents:

I see true and false (failure) as related to the domain semantics – it’s part of the mathematical description of relationships in the problem domain.

And exceptions are for when, during execution, something happens that is out of bounds of the mathematical description.

E.g., a predicate represents a function, i.e., it’s supposed to be determinate (det/1).

The predicate is called and concludes with a choice point – this is unexpected, given that the intended mathematical description aimed to indicate that the relationship in the problem domain is functional.

Another example – design by contract:

E.g., a predicate maps (models the relationship) from a non-borrowed book in the library to a borrowed book that is no longer in the library.

The predicate is supposed to be called with a book in the library and with a library that has at least one book. Otherwise, the meaning of the relationship is, mathematically speaking, undefined.

If it’s called with a book not in the library, or with an empty library, then the predicate contract (from a technology point of view) is violated (indicated by a guard?) and no correctness guarantees can be given for the result (i.e., all bets are off) – it thus can’t be false.

An exception, as a programming device, is used to indicate that. Failure would be inappropriate to capture a failed contract.

Dan

Edit:

From the above it seems that guards can play two roles:

  1. test for contract fulfillment by the caller – which could lead to an exception
  2. select a predicate clause when several clauses indicate, for example, different domain cases for a predicate (a sketch of both roles follows).
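A sketch of both roles side by side (names invented; borrow/3 treats a violated precondition as an error, while fine_due/2 just uses guards to pick the applicable clause):

% Role 1: contract check; raise if the caller violated the precondition.
borrow(Book, Library0, Library) :-
    (   memberchk(Book, Library0)
    ->  selectchk(Book, Library0, Library)
    ;   throw(error(domain_error(book_in_library, Book), borrow/3))
    ).

% Role 2: guards that merely select between domain cases.
fine_due(Days, 0)    :- Days =< 14.
fine_due(Days, Fine) :- Days > 14, Fine is (Days - 14) * 0.5.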

I have no problem with an application making these kinds of decisions. My issue is with built-in primitives which leave the application no real choice in the matter. The core language should provide mechanisms for implementing policy; not dictate policy.

In other words, who gets to define what a “failed contract” is?

I’m particularly focused on built-ins because they typically provide functionality that can’t be readily, or efficiently, implemented in Prolog. I always have a choice as to what libraries I use.

Sorry for perpetuating this overly lengthy discussion when I said I wouldn’t.

Thanks for clarifying – yes, that is indeed interesting – I missed that emphasis of yours.

So, I guess, having two versions could take care of this, like someone mentioned earlier: e.g., an is with the default fail behavior, and, say, a *is that would raise an exception rather than fail.