Improving backtracing behavior


Its been a while I posted (and programmed), here is a little insight from this morning (likely obvious to many, but perhaps useful to some).

When testing code you can provide inputs to test the correct behavior of the program. However, sometimes its worth testing with inputs that will fail, just to observe how execution backtracks to search for other solutions — which it won’t find.

Observing the backtracking behavior may suggest ideas to reordering code to reduce backtracking in general as well – as happened to me this morning.

For what purpose? Used in the original thread, that just causes an infinite loop, when used as a generator:

?- phrase(id, Id).
ERROR: Stack limit (1.0Gb) exceeded

From a generator viewpoint, both id_chars1 and id_chars2 are not
suitable. I guess they can both not generate (id_chars1, " ", id_chars1)
respectively (id_chars2, " ", id_chars2). One crashes and the other

doesn’t give a fair enumeration. It will starve one of the two id_chars.
On the other hand I don’t get a crash with:

?- length(Id, 5), time(phrase(id_chars1, Id)).

In practice you would mostlikely parse an identifier with id_chars1 and not with id_chars2.
id_chars1 making it more greedy. You can try the two variants, greedy and lazy, with:

?- once(phrase((id_chars1, " ", id_chars1), "abc def")).

?- once(phrase((id_chars2, " ", id_chars2), "abc def")).

Which one is faster?

Edit 16.02.2023
You see a difference for large identifiers, you would see a difference more
pronounced if the phrase/2 argument would be a precompiled non-terminal,
now we measure also the overhead of phrase/2:

?- time((between(1,200000,_), once(phrase((id_chars1, " ", id_chars1), 
"xxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyy")), fail; true)).
% 37,600,000 inferences, 2.859 CPU in 2.852 seconds (100% CPU, 13149727 Lips)

?- time((between(1,200000,_), once(phrase((id_chars2, " ", id_chars2), 
"xxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyy")), fail; true)).
% 41,800,000 inferences, 3.109 CPU in 3.107 seconds (100% CPU, 13443216 Lips)

Was running the code with set_prolog_flag(double_quotes, chars):

ident.p.log (163 Bytes)