Logically pure/impure predicate implementations in the std lib

Just wanted to vent quickly. Working on something and was disappointed to see that clumped/2 is not bidirectional, as another commenter “LogicalCaptain” pointed out in a comment aptly titled “Missed opportunity”.

I understand that implementations are often a matter of tradeoffs and I’m guessing there was a reason, such as performance, for the predicate to have this limitation, but for every predicate in std lib that’s not logically pure, it would be nice if there was a corresponding pure one to give the user the option. ie. there should also be a clumped_pure/2.

I did find this SO post where the checked answer claims to have a pure version of clumped, but I have not been able to test it out because it uses scryer which I don’t have.

This is one of the holy wars of Prolog, with all kinds strong opinions getting thrown around. In general, if integers are involved it becomes quite difficult to keep it “logically pure” without getting very fancy or hacky. Either way:

You should implement this and make a PR. I assume there are other ways to get what you want out of SWI-Prolog but this one is the purest.

nowhere in the SO post you linked I see scryer mentioned at all. The accepted answer explicitly claims that it was tested with SWI-Prolog. Maybe I am seeing things.

Great answer!

just curious, why not

runs(R, L) :- phrase(runs(R), L).

?

I could be mistaken. It uses :- use_module(library(reif)). which I was not able to find in swi but was in scryer.

library(reif) is available as a pack now: "reif" pack for SWI-Prolog

Sorry for the perhaps silly question: I’m still learning. How do you work with these packages? For :- use_module(library(clpfd)). nothing really needed to be done except import from top of your source. How do I import reif?

It has been a while since I wrote this topic

Installing a SWI-Prolog pack

but it should still work or at least give you enough of the details to get you headed in the right direction.

1 Like

short answer:

?- pack_install(reif).
1 Like

Awesome, will try this. Appreciate the support.

Writing a run-length encoder is a classical rabbit hole with endless efforts at solving it… I am not sure when it would be needed, ever, but maybe it is so prominent because of the “99 Prolog Problems”. It appears there in different variations starting with P08 and maybe that’s why it keeps popping up on different forums.

If I weren’t so lazy I would make a study about how often it gets asked, what solutions are proposed, how these solutions have been maturing over time…

1 Like

It’s more elegant with CLP(FD), but unfortunately leaves a choicepoint; on the other hand, the inverse solution is not unique (although it appears that @j4n_bur53 's solution gets the minimal clumping):

:- use_module(library(clpfd)).

runs(R, L) :- phrase(runs(L), R).

runs([X-N|R]) --> {N #> 0}, run(X, N), runs(R).
runs([]) --> [].

run(X, N) --> {N #> 0, M #= N-1}, [X], run(X, M).
run(_, 0) --> [].
?- forall(runs(L, [a-2,b-1,c-3]), writeln(L)).
[a,a,b,c,c,c]
true.
?- forall(runs([a, a, b, c, c, c], L), writeln(L)).
[a-2,b-1,c-3]
[a-2,b-1,c-2,c-1]
[a-2,b-1,c-1,c-2]
[a-2,b-1,c-1,c-1,c-1]
[a-1,a-1,b-1,c-3]
[a-1,a-1,b-1,c-2,c-1]
[a-1,a-1,b-1,c-1,c-2]
[a-1,a-1,b-1,c-1,c-1,c-1]
true.
1 Like

And we don’t need clp(fd); freeze/2 and when/2 work just fine:

runs(R, L) :- phrase(runs(_, L), R).
% runs(X, R, L) :- phrase(runs(X, L), R).

runs(X, [X-N|R]) --> {freeze(N, N > 0)}, run(X, N), {dif(X,Y)}, runs(Y, R).
runs(_, []) --> [].

run(X, N) --> {freeze(N, N > 0)}, {when_plus1(M, N)}, [X], run(X, M).
run(_, 0) --> [].

when_plus1(A,C) :-
    when((nonvar(A);nonvar(C)), plus(A,1,C)).
?- runs([a, a, b, c, c, c], L).
L = [a-2, b-1, c-3] ;
false.

?- runs(L, [a-2, b-1, c-3]).
L = [a, a, b, c, c, c] ;
false.
1 Like

The below is mildly deterministic:

:- use_module(library(reif)).

clumped2([H|T], [H-N|Pairs]) :-
    clumped2_(T, H, 1, [H-N|Pairs]).

clumped2_([], P, U, [P-U]).
clumped2_([H|T], P, U, [P-N|Pairs]) :-
    if_(H = P,
        clumped2_same_([P-N|Pairs], T, H, P, U), 
        clumped2_dif_([P-N|Pairs], T, H, P, U)
    ).

clumped2_same_([P-N|Pairs], T, P, P, U) :-
    (integer(N) -> N @> U ; var(N)),
    U1 is U + 1,
    clumped2_(T, P, U1, [P-N|Pairs]).

clumped2_dif_([P-N|Pairs], L, H, P, N) :-
    clumped2_(L, H, 1, Pairs).

Results:

?- clumped2([X,X,Y,Y,Y], C).
X = Y,
C = [Y-5] ;
C = [X-2, Y-3],
dif(Y, X).

?- clumped2([a,b,b,c,c,c], C).
C = [a-1, b-2, c-3].

?- clumped2(L, [a-1, b-2, c-3]).
L = [a, b, b, c, c, c] ;
false. % Unwanted choicepoint

The pack for library(reif) doesn’t have any documentation, but I found this – is it the best place to start?

Yes, plus usages on stackoverflow.

Seems a bit more elegant, but still an unwanted choicepoint:

clumped2([]) --> [].
clumped2([Elem-Len|Pairs]) -->
    elem_same_dl(Elem, Len),
    clumped2(Pairs).

elem_same_dl(Elem, Len, [Elem|L], R) :-
    elem_same_dl_(L, R, Elem, 1, Len).

elem_same_dl_(L, R, Elem, U, Len) :-
    (   integer(Len) ->
        Len @>= U, 
        (   L = R, U = Len -> true
        ;   elem_same_dl_inc_(L, R, Elem, U, Len)
        )
    % Non-deterministic
    ;   elem_same_dl_inc_(L, R, Elem, U, Len)
    ;   elem_same_dl_dif_(L, R, Elem, U, Len)
    ).

elem_same_dl_inc_([Elem|T], R, Elem, U, Len) :-
    U1 is U + 1,
    elem_same_dl_(T, R, Elem, U1, Len).

elem_same_dl_dif_(L, L, Elem, Len, Len) :-
    (   L = []
    ;   L = [H|_],
        dif(H, Elem)
    ).

Results:

?- phrase(clumped2(Ps), [a,b,X,c]).
Ps = [a-1, b-2, c-1],
X = b ;
Ps = [a-1, b-1, c-2],
X = c ;
Ps = [a-1, b-1, X-1, c-1],
dif(X, b),
dif(X, c) ;
false.

?- phrase(clumped2([a-1,b-2,c-3]), L).
L = [a, b, b, c, c, c].

?- phrase(clumped2(Ps), [a,b,b,c,c,c]).
Ps = [a-1, b-2, c-3] ;
false. % Unwanted choicepoint

Improved determinism:

clumped2([], []).
clumped2([E|T], [E-Len|R]) :-
    (   integer(Len)
    ->  Len @>= 1
    ;   var(Len)
    ),
    clumped2_(T, R, E, 1, Len).

clumped2_(L, R, E, U, Len) :-
    % Possibilities:
    % U lt Len
    % U eq Len
    (   integer(Len),
        U @< Len
    % Improve determinism
    ->  C = lt
    ;   U == Len
    ->  C = eq
    ;   L == []
    ->  C = eq
    % Allow empty list
    ;   L = []
    ;   L = [H|_]
    ->  (   H \= E
        % U eq Len
        ->  C = eq
        ;   H == E
        % U lt Len
        ->  C = lt
        % Allow uncertain
        ;   true
        )
    % Allow uncertain
    ;   true
    ),
    clumped2_comp_(C, L, R, E, U, Len).

clumped2_comp_(eq, L, R, E, Len, Len) :-
    clumped2_after_eq_(L, E, R).
clumped2_comp_(lt, [E|T], R, E, U, Len) :-
    U1 is U + 1,
    clumped2_(T, R, E, U1, Len).

clumped2_after_eq_([], _, []).
clumped2_after_eq_([H|T], E, [RH|RT]) :-
    % Ensure next element is different
    dif(H, E),
    clumped2([H|T], [RH|RT]).

Results:

?- time(clumped2([a,X], C)).
% 33 inferences, 0.000 CPU in 0.000 seconds (92% CPU, 790438 Lips)
C = [a-1, X-1],
dif(X, a) ;
% 7 inferences, 0.000 CPU in 0.000 seconds (78% CPU, 705787 Lips)
X = a,
C = [a-2].
% Is general

?- time(clumped2([a,b,b,c,c,c], C)).
% 30 inferences, 0.000 CPU in 0.000 seconds (81% CPU, 2251407 Lips)
C = [a-1, b-2, c-3].
% No unwanted choicepoint

?- time(clumped2(L, [a-1, b-2, c-3])).
% 134 inferences, 0.000 CPU in 0.000 seconds (96% CPU, 1638262 Lips)
L = [a, b, b, c, c, c].
% No unwanted choicepoint

… and after a bit more refinement:

clumped2([], []).
clumped2([E|T], [E-Len|R]) :-
    (   integer(Len)
    ->  Len @>= 1
    ;   var(Len)
    ),
    clumped2_(T, R, E, 1, Len).

clumped2_(L, R, E, U, Len) :-
    % Possibilities:
    % U lt Len
    % U eq Len
    (   U == Len
    ->  C = eq
    ;   clumped2_l_(L, E, C)
    ),
    clumped2_comp_(C, L, R, E, U, Len).

% Trying to determine C by looking ahead
clumped2_l_([], _, eq).
clumped2_l_([H|_], E, C) :-
    (   H \= E
    ->  C = eq
    ;   H == E
    ->  C = lt
    % Unsure
    ;   true
    ).

clumped2_comp_(eq, L, R, E, Len, Len) :-
    clumped2_after_eq_(L, E, R).
clumped2_comp_(lt, [E|T], R, E, U, Len) :-
    U1 is U + 1,
    clumped2_(T, R, E, U1, Len).

clumped2_after_eq_([], _, []).
clumped2_after_eq_([H|T], E, [RH|RT]) :-
    % Ensure next element is different
    dif(H, E), 
    clumped2([H|T], [RH|RT]).

Results:

?- time(clumped2([a,X], C)).
% 35 inferences, 0.000 CPU in 0.000 seconds (89% CPU, 1582207 Lips)
C = [a-1, X-1],
dif(X, a) ;
% 8 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 779727 Lips)
X = a,
C = [a-2].

?- time(clumped2([a,b,b,c,c,c], C)).
% 36 inferences, 0.000 CPU in 0.000 seconds (89% CPU, 1160616 Lips)
C = [a-1, b-2, c-3].

?- time(clumped2(L, [a-1, b-2, c-3])).
% 137 inferences, 0.000 CPU in 0.000 seconds (91% CPU, 1483921 Lips)
L = [a, b, b, c, c, c].

Simplified further:

clumped2([], []).
clumped2([E|T], [E-Len|R]) :-
    (   integer(Len)
    ->  Len @>= 1
    ;   var(Len)
    ),
    clumped2_(T, R, E, 1, Len).

clumped2_(L, R, E, U, Len) :-
    % Improve determinism
    (   U == Len
    ->  !
    ;   U = Len
    ),
    (   L == []
    ->  !
    ;   L = [H|_],
        H \= E
    ->  !
    ;   true
    ),
    clumped2_after_eq_(L, E, R).
clumped2_([E|T], R, E, U, Len) :-
    U1 is U + 1,
    clumped2_(T, R, E, U1, Len).

clumped2_after_eq_([], _, []).
clumped2_after_eq_([H|T], E, [RH|RT]) :-
    % Ensure next element is different
    dif(H, E),
    clumped2([H|T], [RH|RT]).

Results:

?- time(clumped2([a,X], C)).
% 31 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 1510059 Lips)
C = [a-1, X-1],
dif(X, a) ;
% 6 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 711238 Lips)
X = a,
C = [a-2].

?- time(clumped2([a,b,b,c,c,c], C)).
% 33 inferences, 0.000 CPU in 0.000 seconds (91% CPU, 1974629 Lips)
C = [a-1, b-2, c-3].

?- time(clumped2(L, [a-1, b-2, c-3])).
% 125 inferences, 0.000 CPU in 0.000 seconds (95% CPU, 2545203 Lips)
L = [a, b, b, c, c, c].

Just noticed that the keys should be unique (depending on purpose, I suppose) - this version enforces that:

clumped2(L, C) :-
    clumped2_(L, [], C).

clumped2_([], _Ks, []).
clumped2_([E|T], Ks, [E-Len|R]) :-
    (   integer(Len)
    ->  Len @>= 1
    ;   var(Len)
    ),
    % Ensure keys are unique
    maplist(dif(E), Ks),
    clumped2_chk_(T, R, E, [E|Ks], 1, Len).

clumped2_chk_(L, R, E, Ks, U, Len) :-
    % Improve determinism
    (   U == Len
    ->  !
    ;   U = Len
    ),
    (   L == []
    ->  !
    ;   L = [H|_],
        H \= E
    ->  !
    ;   true
    ),
    clumped2_after_eq_(L, Ks, R).
clumped2_chk_([E|T], R, E, Ks, U, Len) :-
    U1 is U + 1,
    clumped2_chk_(T, R, E, Ks, U1, Len).

clumped2_after_eq_([], _Ks, []).
clumped2_after_eq_([H|T], Ks, [RH|RT]) :-
    clumped2_([H|T], Ks, [RH|RT]).

Results:

?- time(clumped2(L, [a-1, b-2, X-3])).
% 77 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 1790240 Lips)
L = [a, b, b, X, X, X],
dif(X, b),
dif(X, a).
1 Like