A faster (and safe?) reverse

reverse/2 uses a variable named Bound, to prevent going off to infinity - as a performance penalty on each iteration.

However, since the arguments in reverse/2 can be swapped, we can instead use the more usual version, and check whether to swap the arguments to remain safe:

reverse_faster(Lst, Rev) :-
    nonvar_first_preferably(Lst, Rev, L, R),
    reverse_faster_(L, [], R).

reverse_faster_([], Ys, Ys).
reverse_faster_([X|Xs], Rs, Ys) :-
    reverse_faster_(Xs, [X|Rs], Ys).

nonvar_first_preferably(L1, L2, F1, F2) :-
    (   nonvar(L1)
    ->  (F1, F2) = (L1, L2)
    ;   nonvar(L2)
    ->  (F1, F2) = (L2, L1)
    % Preferring not to change the order, if no benefit
    ;   (F1, F2) = (L1, L2)
    ).

This is still safe, and has a surprisingly significant performance improvement:

?- numlist(1, 12_000_000, L), garbage_collect, time(reverse(L, R)).
% 12,000,002 inferences, 1.227 CPU in 1.229 seconds (100% CPU, 9775973 Lips)

% Slightly longer, with arguments swapped
?- numlist(1, 12_000_000, L), garbage_collect, time(reverse(R, L)).
% 12,000,002 inferences, 1.400 CPU in 1.401 seconds (100% CPU, 8572785 Lips)

vs

?- numlist(1, 12_000_000, L), garbage_collect, time(reverse_faster(L, R)).
% 12,000,004 inferences, 0.703 CPU in 0.703 seconds (100% CPU, 17078104 Lips)

% Arguments get swapped back, so no performance penalty
?- numlist(1, 12_000_000, L), garbage_collect, time(reverse_faster(R, L)).
% 12,000,004 inferences, 0.704 CPU in 0.704 seconds (100% CPU, 17055462 Lips)

Presumably this is a candidate to replace the built-in reverse/2?

2 Likes

Did you try measuring reversing a short list many times, instead of a very long list once? On my computer, if we reverse in the normal way, it is still faster. Probably the lack of conditional does it?

Something like this:

?- forall(between(1, 1 000 000, _), reverse([a, b], _).

You’d need to have a program that reverses very long lists, or reversing in the opposite direction; then your version would be indeed an improvement.

reverse_faster is slower with length 2, around the same with length 4, then starts being faster with length 5:

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse([a, b], _))).
% 20,000,001 inferences, 0.841 CPU in 0.842 seconds (100% CPU, 23780499 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_faster([a, b], _))).
% 28,000,001 inferences, 0.987 CPU in 0.988 seconds (100% CPU, 28376784 Lips)


?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse([a, b, c, d], _))).
% 28,000,001 inferences, 2.384 CPU in 2.386 seconds (100% CPU, 11742584 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_faster([a, b, c, d], _))).
% 36,000,001 inferences, 2.350 CPU in 2.351 seconds (100% CPU, 15320793 Lips)


?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse([a, b, c, d, e], _))).
% 32,000,001 inferences, 2.768 CPU in 2.770 seconds (100% CPU, 11562202 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_faster([a, b, c, d, e], _))).
% 40,000,001 inferences, 2.567 CPU in 2.569 seconds (100% CPU, 15581728 Lips)

Can be simplified to:

reverse_fast(Lst, Rev) :-
    nonvar(Lst),
    !,
    reverse_fast_(Lst, [], Rev).
reverse_fast(Lst, Rev) :-
    reverse_fast_(Rev, [], Lst).

reverse_fast_([], Ys, Ys).
reverse_fast_([X|Xs], Rs, Ys) :-
    reverse_fast_(Xs, [X|Rs], Ys).

… which starts being faster at length 2:

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse([a, b], _))).
% 20,000,001 inferences, 0.836 CPU in 0.837 seconds (100% CPU, 23910204 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_fast([a, b], _))).
% 20,000,001 inferences, 0.709 CPU in 0.709 seconds (100% CPU, 28213667 Lips)
1 Like

To also prevent:

?- reverse_fast([a,b|T], [c,b,a]).
T = [c] ;
ERROR: Stack limit (1.0Gb) exceeded

… have to check using is_list/1 and then nonvar/1, resulting in:

reverse_fast(L, R) :-
    % For reverse_fast([a,b|T], [c,b,a]).
    (   (   is_list(L)
        % For reverse_fast(L, [a,b]).
        ;   \+ is_list(R),
            nonvar(L)
        )
    ->  reverse_fast_(L, [], R)
    ;   reverse_fast_(R, [], L)
    ).

reverse_fast_([], Ys, Ys).
reverse_fast_([X|Xs], Rs, Ys) :-
    reverse_fast_(Xs, [X|Rs], Ys).

… which is faster than the current reverse/2 in all of:

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse(L, [a]))).
% 16,000,001 inferences, 0.866 CPU in 0.867 seconds (100% CPU, 18465102 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_fast(L, [a]))).
% 28,000,001 inferences, 0.743 CPU in 0.743 seconds (100% CPU, 37702855 Lips)


?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse([a], _))).
% 16,000,001 inferences, 0.761 CPU in 0.762 seconds (100% CPU, 21014563 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_fast([a], _))).
% 20,000,001 inferences, 0.625 CPU in 0.625 seconds (100% CPU, 32004282 Lips)


?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse(L, [a|T]))).
% 16,000,001 inferences, 0.957 CPU in 0.958 seconds (100% CPU, 16721772 Lips)

?- garbage_collect, time(forall(between(1, 4 000 000, _), reverse_fast(L, [a|T]))).
% 32,000,001 inferences, 0.910 CPU in 0.911 seconds (100% CPU, 35153104 Lips)

As helpfully pointed out by @false:

?- freeze(L,false), reverse(L,R).
false.

?- freeze(L,false), reverse_fast(L,R).
ERROR: Stack limit (1.0Gb) exceeded

… so it’s not general enough to replace reverse/2.

I thought of a different method - starting at the empty tail of the reversed list, then repeatedly adding a head:

reverse_better([], []).
reverse_better([H|T], [RH|RT]) :-
    % Prevent infinity after 1st solution: reverse_better(L, [a])
    (   is_list(T)
    ->  reverse_better_(T, H, RH, [], RT)
    ;   reverse_better_(RT, RH, H, [], T)
    ).
    
reverse_better_([], P, P, RT, RT).
reverse_better_([H|T], P, RH, RT, RTF) :-
    reverse_better_(T, H, RH, [P|RT], RTF).

This fails as intended with the freeze/2 test above, and is faster than the built-in reverse/2:

?- garbage_collect,
time(forall(between(1, 4 000 000, _), reverse([a, b], _))).
% 20,000,001 inferences, 0.907 CPU in 0.908 seconds (100% CPU, 22055617 Lips)

?- garbage_collect,
time(forall(between(1, 4 000 000, _), reverse_better([a, b], _))).
% 20,000,001 inferences, 0.794 CPU in 0.795 seconds (100% CPU, 25182747 Lips)
?- garbage_collect,
numlist(1, 100, L), time(forall(between(1, 1_000_000, _), reverse(L, _))).
% 103,000,001 inferences, 4.653 CPU in 4.656 seconds (100% CPU, 22137726 Lips)

?- garbage_collect,
numlist(1, 100, L), time(forall(between(1, 1_000_000, _), reverse_better(L, _))).
% 103,000,001 inferences, 3.362 CPU in 3.369 seconds (100% CPU, 30637854 Lips)
?- garbage_collect,
numlist(1, 12_000_000, L), time(reverse(L, R)).
% 12,000,002 inferences, 0.628 CPU in 0.629 seconds (100% CPU, 19115214 Lips)

?- garbage_collect,
numlist(1, 12_000_000, L), time(reverse_better(L, R)).
% 12,000,002 inferences, 0.366 CPU in 0.367 seconds (100% CPU, 32757865 Lips)

Edit: With swi-prolog 9.0.4, the performance benefit has gone :upside_down_face:

2 Likes