Wow. Tabling turns a non-LL(1) grammar with explosive parse times into something tractable

TL;DR: a single table directive reduced parsing time for a non-LL(1) grammar from 260 seconds to about 20 milliseconds – a speedup of four orders of magnitude. Great work! Thank you!

In connection with some work a colleague and I are doing, I have spent a little time writing a DCG grammar for first-order predicate calculus in something like the form I normally use if I have to type a formula in an email, and adapting code from Clocksin and Mellish to turn the formulas into clausal form.

Everything worked fine on the kind of small examples given by Clocksin and Mellish, like

(exists $x)(female($x) & child_mother($x, eve))

but it appeared utterly incapable of parsing samples of formulas we were actually interested in, like:

(all $edoc, $tdoc: doc)
 (e_t($e, $t) implies
  (all $e : token)(doc_tok($edoc, $e) implies
   (exists $t : token)
    (doc_tok($tdoc, $t)
      and e_t($e, $t)
      and (all $x : token)
      ((doc_tok($tdoc, $x) and e_t($e, $x))
       implies $x = $t))))

(This is a formulation of the English sentence “If document T is a transcript of document E, then for each token e in E there is exactly one token t in T such that token t transcribes token e, and e is the exemplar of t.”)

Some time working with smaller versions of this formula showed that it was not one particular construct on which the parser was going crazy: parsing time was growing much faster than linearly – polynomially, perhaps exponentially – with the length of the input. So really long formulas were going to be a serious problem. I don’t know how long the formula given took to parse – I gave up and cancelled it before it completed. A shorter version, minus the first two lines, took 260 seconds to parse.

Some time tracing (and even more time trying to figure out how to use tracing to see what was happening at a useful level of detail) helped me realize that the problem was an interaction between DCG parsing and the way the grammar was written. To keep the grammar simple (so it would be easier to make it correct), I had made no effort to factor out common prefixes, so that to distinguish among A iff B, A implies B, A and B, and A or B, the parser had to read all of A several times to get to the point where the relevant operator could be found. When “iff” is not found, the parser starts over and reads A again to see if it’s the beginning of a conditional. And so on.
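In outline, the grammar had roughly this shape – a hypothetical sketch with made-up non-terminal names and the parse-tree argument omitted, not my actual rules:

```prolog
% Each alternative begins by parsing a complete left operand, so when
% one alternative fails, backtracking re-reads A from the start for
% the next alternative.
formula --> biconditional.
formula --> conditional.
formula --> disjunction.
formula --> conjunction.
formula --> basic_formula.

biconditional --> basic_formula, [iff],     formula.
conditional   --> basic_formula, [implies], formula.
disjunction   --> basic_formula, [or],      formula.
conjunction   --> basic_formula, [and],     formula.
```

With nested formulas, each re-read of A triggers the same re-reading inside A, which is where the explosive growth comes from.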

I could rewrite the grammar to be more nearly LL(1), at the cost of time and tedium (especially in the rewriting of the rules for constructing the parse tree to be fed into the clausification routines). That would reduce the need for parsing the same portion of the input many times. Or I could write a chart parser, to remember how to parse A and not have to do it all over again.
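A left-factored version of that grammar shape might look like this – again only a sketch with hypothetical names, to show the idea:

```prolog
% The shared left operand is read exactly once; only then does the
% parser branch on the operator token.
formula --> basic_formula, formula_rest.

formula_rest --> [iff],     formula.
formula_rest --> [implies], formula.
formula_rest --> [or],      formula.
formula_rest --> [and],     formula.
formula_rest --> [].            % a basic formula standing alone
```

This shape is close to LL(1) on the operator tokens, but the rules that build the parse tree have to be reworked to match – the tedium mentioned above.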

At this point in my thinking, I remembered that I had noticed something in the manual saying that an implementation of tabling had been added to the system. I was not certain it would work in this case, since all the examples are of recursive predicates, and it was not clear to me whether the whole idea of memoization would work very well for grammar rules.

But it seemed worth a shot. So I tried tabling the ‘biconditional’ rule, having noticed that the parser spent a lot of time trying to parse things as biconditionals – pointlessly, in this case, since there are no biconditionals in the input. The biconditional rule is a dead end. If I could keep the parser from calling it 150 times (I don’t understand how it got to that many, but that’s how many calls I counted while tracing), maybe things would go faster. But that seemed like another reason it might not work: would a tabling implementation keep a record of inputs for which the answer is ‘fail’? Hm. Don’t know. So I tried

?- table biconditional//1.

and reparsed the sentence. Instead of 260 seconds, it took four seconds and a bit.

Experiments showed that tabling the (recursive) root non-terminal or the ‘basic_formula’ non-terminal got it down to 0.02 seconds. Tabling a larger number of non-terminals reduced the number of logical inferences but not the run time, suggesting that the point of diminishing returns – where the overhead of tabling outweighs the gain from more comprehensive memoization – is reached quickly.
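For a source file rather than the top level, the same thing can be written as a directive; a sketch, assuming the root non-terminal is called formula//1 with the parse tree as its one visible argument:

```prolog
% The //1 notation tells SWI-Prolog's table/1 that this is a DCG
% non-terminal with one visible argument; the two hidden
% difference-list arguments are accounted for automatically.
:- table formula//1.
```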

I do not believe I have ever seen a single directive reduce run time by a factor of 10,000 before. (The closest thing in my experience was telling a mainframe SQL database to update its statistics, which made queries run a few hundred times faster. That was good; this is even better.)

So this is to thank Jan – and the people who worked to develop the idea of tabling in Prolog in the first place. Great work! Thank you very much.

And to anyone else reading this: if you haven’t tried tabling, and if you ever care about performance issues (I mostly don’t, unless they affect me personally, but then I do care), then it’s worth knowing about tabling and how to turn it on.