Recursion and weird benchmarking

As benchmarking seems to be a theme that many are interested in, I found this link interesting to look at: it starts from the P-99 (Ninety-Nine Problems) definition, then goes through the different ways to write it and their impact on performance.

% P02 (*): Find the last but one element of a list

% last_but_one(X,L) :- X is the last but one element of the list L
%    (element,list) (?,?)

last_but_one(X,[X,_]).
last_but_one(X,[_,Y|Ys]) :- last_but_one(X,[Y|Ys]).
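
To make "different ways to write it" concrete, here is a minimal sketch that puts the definition above next to one possible DCG formulation. The name last_but_one_dcg//1 is made up for illustration; it is not necessarily the dcg or dcgx variant from the linked page.

```prolog
% Plain recursive definition, repeated from above so this loads on its own.
last_but_one(X, [X,_]).
last_but_one(X, [_,Y|Ys]) :- last_but_one(X, [Y|Ys]).

% Illustrative DCG variant (hypothetical name, an assumption of this sketch):
% skip elements until exactly two remain, then take the first of those two.
last_but_one_dcg(X) --> [X,_].
last_but_one_dcg(X) --> [_], last_but_one_dcg(X).

% ?- last_but_one(X, [a,b,c,d]).
% X = c
% ?- phrase(last_but_one_dcg(X), [a,b,c,d]).
% X = c
```

Both versions leave a choice point behind after the first answer, so a benchmark loop would normally wrap the call in once/1 or follow it with a cut.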

It also gives a simple benchmark comparing SWI-Prolog and SICStus, as well as “classical” Prolog vs DCG, based on different coding solutions. Some of the figures look weird, especially the SWI results for dcg vs dcgx. Any explanation for the ratio growing from 2.15× to 7.89×?

          SICStus |   SWI    | ratio
           4.3.2  | 7.3.20-1 | (SWI/SICStus)
    --------------+----------+--------
    f2    0.090s  |   1.449s | 16.10×
    dcg   3.670s  |   7.896s |  2.15×
    dcgx  1.000s  |   7.885s |  7.89×

(f2 is the quickest code provided in their examples)
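
One way to read the table is to separate the two effects: how the systems compare on each variant, and how each system reacts to going from dcg to dcgx. The sketch below only redoes that arithmetic on the figures above; the predicate names t/3, ratio/2 and speedup/2 are made up for illustration.

```prolog
% Timings in seconds, copied from the table: t(Variant, SICStus, SWI).
t(f2,   0.090, 1.449).
t(dcg,  3.670, 7.896).
t(dcgx, 1.000, 7.885).

% ratio(?Variant, -R): SWI time divided by SICStus time.
ratio(Variant, R) :-
    t(Variant, Sicstus, Swi),
    R is Swi / Sicstus.

% speedup(?System, -S): how much faster dcgx runs than dcg on one system.
speedup(sicstus, S) :- t(dcg, A, _), t(dcgx, B, _), S is A / B.
speedup(swi,     S) :- t(dcg, _, A), t(dcgx, _, B), S is A / B.
```

Querying speedup/2 gives roughly 3.67 for SICStus and roughly 1.00 for SWI. So, going by these numbers alone, the ratio grows because SICStus gets much faster on dcgx while SWI stays essentially flat, not because SWI gets slower.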

See link here for details

I tried this on current versions and got a ratio of about 6× (both systems got significantly faster). Also note that the way SWI-Prolog is built (tools, options) makes a difference of up to 50%. SICStus has binary releases and I’d suppose they use the best tools. For SWI-Prolog, build with GCC 9 or later and use profile-guided optimization (PGO). Some more tweaking can probably improve this further without any changes to the sources.
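
To repeat such a measurement on your own build, a small harness along these lines may help. It is only a sketch written for SWI-Prolog: bench/3 and run/2 are hypothetical helpers (not the code from the linked page), numlist/3 is taken from SWI-Prolog's library(lists), and the list length and repetition count are arbitrary.

```prolog
:- use_module(library(lists)).   % numlist/3 (SWI-Prolog)

% P02 definition from the first post, repeated so the file loads on its own.
last_but_one(X, [X,_]).
last_but_one(X, [_,Y|Ys]) :- last_but_one(X, [Y|Ys]).

% bench(+N, +Reps, -Ms): run last_but_one/2 Reps times on the list 1..N
% (N >= 2) and return the CPU time used, in milliseconds.
bench(N, Reps, Ms) :-
    numlist(1, N, L),
    statistics(runtime, [T0|_]),
    run(Reps, L),
    statistics(runtime, [T1|_]),
    Ms is T1 - T0.

% run(+K, +L): call last_but_one/2 K times, discarding choice points.
run(0, _) :- !.
run(K, L) :-
    last_but_one(_, L),
    !,
    K1 is K - 1,
    run(K1, L).

% ?- bench(100000, 100, Ms).
```

Running the same query on two builds of the same system (for example with and without PGO) gives a rough idea of how much the build options matter on your machine.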