History dependent semi_lex_compare/3

Splitting off the topic. The predicate appeared in this
thread here and in this thread here. But mostlikely it
deserves it own thread. I was asked by @kuniaki.mukai

to do some testing. And I found this strange failure:

/* SWI-Prolog 9.1.7 */
?- b_setval(compare_history, []), A=(B,2), B=(A,1), C=(C,1), 
semi_lex_compare(X, A, B), semi_lex_compare(Y, B, C), semi_lex_compare(Z, A, C).

Now that there are many proposals, there is also the one from
@ridgeworks not to forget, for cyclic term comparison, of course
having them all in place and working might be beneficial, to gain

some insight into the problem. But first make them working!
Was using this version, that @kuniaki.mukai had posted last:

semilex.pl.log (1,5 KB)

Sorry, I migrated a typo.

compare_args_with_stack(=, _, _, _).    
compare_args_with_stack(=, _, _, _, _).
% ?- b_setval(compare_history, []), A=(B,2), B=(A,1), C=(C,1), 
% semi_lex_compare(X, A, B), semi_lex_compare(Y, B, C), semi_lex_compare(Z, A, C).
%@ A = ((A, 1), 2),
%@ B = (A, 1),
%@ C = (C, 1),
%@ X = (<),
%@ Y = Z, Z = (>).

To make this note short, some symbols are used without explanation.

  1. \Sigma is a signature \Sigma=\{a/0, b/0, \ldots, f/2, g/2, \ldots\}.
  2. H is the Herbrand universe over \Sigma.
  3. R is the set of rational trees over \Sigma.

Being inspired by the Matts Carllson’s pair (x, y), where x=f(y, a), y=f(x, b), I have found a simple operation \Gamma on the powerset of R\times R which defines the set of incomparable pairs like (x,y).

Define an endo operation \Gamma: for Q\subset R\times R,

\Gamma(Q) = \{(g(p,a_2,...,a_n), g(q, b_2, ..., b_n)) \mid g \in \Sigma, (p, q) \in Q, a_2,\ldots,a_n, b_2,\ldots,b_n \in R\}.

Clearly \Gamma is monotone increasing on the powerset of R\times R. Hence from fixpoint theorem, there exists the largest fixpoint D of \Gamma.

Let E be the fixpoint of the endo operation F on the powerset R\times R, where
the definition of F is the same one that was posted already, (which is that I said too simple. )

It should be easy to see that D has following properties.

  1. Matts Carllson’s pairs (x, y) and (y, x) are elements of D.
  2. E is an extension of the standard lexical order on H.
  3. There exists some subset D' of D such that E\setminus D' is the total linear order on R.

1 and 2 are trivial, and thinking for a while, it should be easy to show 3.

J. Burse pointed out that SWI-Prolog sort/2 has a bug on rational trees. If it is a really bug, I guess without any evidence, I am afraid it is caused by wrong selection w.r.t 3.

I hope my semi_lex_sort is bug free around 3 because of “history dependent comparing”. Anyway, this is only a speculation without reading the code of the standard compare. I will have to aplogize if it is not the case.

Finally, to be honest, IMHO, sort needs not care of rational trees. I have several technical reasons, for that, but it is difficult for me to express in English for now. My interest was Matts Carllson’s pair itself.
I maybe wrong on this opiniton. I always encourage any effort for efficient sorting on rational trees.

Nope, sort/2 mostlikely doesn’t have a bug. I nowhere claimed that for
SWI-Prolog! The primary source of bugs is compare/3, which is used during
sort/2. Whereby there are many possible types of bugs in compare/3,

that might give strange results in sort/2. To mention a few, there are transitivity
violation, lexical order violation and substitution of equals violation. The Matt
Carlsson pair shows that lexical order violations cannot be removed:

  • Transitivity violation:
    A test case of the form A @< B, B @< C, \+ A @< C.

  • Lexical order violation:
    A test case of the form s(A,B) @< s(C,D), A \== C, \+ A @< C.

  • Substitution of equals violation:
    A test case of the form A == B, A @< C, \+ B @< C.

  • What else?
    There are a couple more constraints.

You can also use the fuzzer to find the bugs in native compare/3. They are
quite rare, but they are there. If you use the fuzzer, unlike your semi_lex_less/2,
which currently crashes, the SWI-Prolog native compare doesn’t crash,

but it doesn’t pass all random test cases, if you use a larger number of tests:

/* SWI-Prolog 9.1.7 */
?- aggregate_all(count, (between(1,1000000,_), random_cyclic(A),
random_cyclic(B), random_cyclic(C), A @< B, B @< C, \+ A @< C), F).
F = 7.

So out of 1’000’000 test cases, there were 7 faulty test cases.

Edit 01.04.2023
Scryer Prolog passes the same test, no transitivity violation so far.
Although the test needs a little bit more time to execute than in SWI-Prolog:

/* Scryer Prolog 0.9.1-207 */
?- between(1,1000000,_), random_cyclic(A),
random_cyclic(B), random_cyclic(C), A @< B, B @< C, \+ A @< C.

For Trealla Prolog, I guess it doesn’t pass the above test at
this time. I get a run that stops responding to Ctrl-C.

semi_lex_compare is for only ground terms.
Inserting ground check, transitivity test seems passed.

% ?- b_setval(compare_history, []), aggregate_all(count,
% (between(1,1000000,_), random_cyclic(A), ground(A), random_cyclic(B), ground(B),
% random_cyclic(C), ground(C), semi_lex_less(A,B), semi_lex_less(B,C),
% \+ semi_lex_less(A,C)), F).
%@ F = 0.

Thank you for correction. BTW, is this a bug of @< ?

Borrowing the Monkey test, I have tested @<.

% ?- b_setval(compare_history, []), aggregate_all(count,
% (between(1,1000000,_), random_cyclic(A), ground(A), random_cyclic(B), ground(B),
% random_cyclic(C), ground(C), A @<B, B @<C, \+ A @< C), F).
%@ F = 2.

Native compare/3 has violations for both ground and non-ground terms.

If you filter ground/1 you get fewer tested cases, and you get a smaller count.
So that it drops from F≈7 to F≈2 is not unlikely. Also every test run might show
a different number, since the random number generator might not hit

the same test cases in each run. But for use cases such as term_factorized/3,
which uses rbtrees in turn, I guess you need a compare/3 that can also
deal with non-ground terms.

Do you have a semi_lex_compare/3 that also works for non-ground terms?
The proposals of @ridgeworks and my proposal are supposed to both work
for non-ground terms as well, but they depend on term_factorized/3,

which seems to be buggy at the moment:

Glitch in term_factorized/3, stack overflow

Edit 01.04.2023
Currently the fuzzer does not sample non-ground terms very well, for
example it would not generate s(X,X) as a test case, i.e. multiple occurence
of the same variable. Maybe should work on a new version of the fuzzer.

Variables are rather generated by accident as a special case of a circular term.
X = X, is a circle of length zero, if I remove randomly generated subterms,
that are variables, the fuzzer would generate only ground terms.

I did not think non_ground term case. Assuming no unification performed, it should work for
non_ground case as well, but I am not sure.

I got lost in the math but monkey testing caught my attention again. Because my version of logical compare defers on variables, I made a small mod to random_cyclic/1:

% random_cyclic(-Term)
random_cyclic(T) :-
   random_cyclic([T], T),
   numbervars(T,0,_).  % ground version

I believe comparable/3 basically uses the same strategy as compare_with_stack so I ran the follwing query several times:

between(1,1000000,_), random_cyclic(T1), random_cyclic(T2), random_cyclic(T3),
comparable(Ct,T1,T2), comparable(Ct,T2,T3), comparable(C,T1,T3), C \= Ct.

I actually had a couple of runs with no failures but most had 1 and occasionally 2; never saw more than 2.

Here’s the output from 1 run of 4,000,000 with 5 fails:

T1 = s(s(T1, A), T1),
T2 = s(_S1, 1), % where
    _S1 = s(s(_S1, 1), _S1),
T3 = s(s(_S2, A), B), % where
    _S2 = s(s(_S2, A), _S2),
Ct = (<),
C = (>) ;
T1 = s(_S1, 0), % where
    _S1 = s(_S1, 1),
T2 = _S2, % where
    _S2 = s(s(_S2, s(_S3, 1)), 0),
    _S3 = s(_S3, s(_S3, 1)),
T3 = s(T3, s(0, s(0, A))),
Ct = (>),
C = (<) ;
T1 = s(s(_S1, 0), A), % where
    _S1 = s(s(_S1, 0), _S1),
T2 = s(T2, 1),
T3 = s(s(T3, 0), T3),
Ct = (>),
C = (<) ;
T1 = _S2, % where
    _S1 = s(_S1, s(0, _S2)),
    _S2 = s(_S1, _S2),
T2 = s(_S3, A), % where
    _S3 = s(s(_S3, A), _S3),
T3 = _S4, % where
    _S4 = s(_S5, s(_S4, _S6)),
    _S5 = s(s(_S4, 0), s(_S5, 1)),
    _S6 = s(0, _S6),
Ct = (>),
C = (<) ;
T1 = s(_S1, 0), % where
    _S1 = s(s(_S1, 0), _S1),
T2 = s(_S2, B), % where
    _S2 = s(_S2, _S3),
    _S3 = s(s(s(_S2, B), A), _S3),
T3 = s(T3, s(T3, 1)),
Ct = (<),
C = (>) ;

Failures are quite rare but haven’t been able to figure out what triggers them.

I’ve lost sight more than once. comparable/3 is a logical compare that uses the stack strategy to manage the comparison of cyclic terms. It’s current incarnation:

:- module(comparable, [comparable/3]).

comparable(Op,X,Y) :- 
comparable(Op,X,Y,_) :- Op == =, !,                   % Op = =, unify X and Y
	X = Y.
comparable(=,X,Y,_)  :- X == Y, !.                    % X and Y identical, Op = =
comparable(Op,X,Y,P) :- compound(X), compound(Y), !,  % comparing two unequal compound terms
	(memberchk_pair(P,X,Y)        % cyclic check ? 
	 -> Op = =                    % cyclic, Op = = to continue (shallow) compare
	 ;  functor(X,XF,XA), functor(Y,YF,YA),
	    compare(FOp,(XA,XF),(YA,YF)),  % compare arities and functors, as per standard order
	    ( (FOp == =)
	     -> compound_name_arguments(X,_,XArgs),       % arities,functors match,
	 	    comparableArgs(XArgs,YArgs,Op,[(X,Y)|P])  % arg lists are same length
	     ;  Op = FOp                                  % arities unequal, we're done
comparable(Op,X,Y,P) :-                               % else X and/or Y are vars
	(nonvar(X), nonvar(Y)
	 -> compare(Op,X,Y)                               % comparing 2 other nonvars
	 ;  when((nonvar(X),nonvar(Y)), comparable(Op,X,Y,P))  % var(s), defer
comparableArgs([X|XArgs],[Y|YArgs],Op,P) :-
	(nonvar(IOp)  % optimization: is arg compare Op defined?
	 -> comparableArg(IOp,Op,XArgs,YArgs,P)                     % yes, test
	 ;  when(nonvar(IOp), comparableArg(IOp,Op,XArgs,YArgs,P))  % no, defer

% arg compare Op defined, continue if =, else exit
comparableArg(IOp,Op,XArgs,YArgs,P) :-
	(IOp = =) -> comparableArgs(XArgs,YArgs,Op,P) ; Op = IOp. 

% memberchk pair based on identity
memberchk_pair([(A,B)|P],X,Y) :- 
	(X==A, Y==B -> true ; memberchk_pair(P,X,Y)).

No, it’s meant to implement a logical compare (defer when a var encountered) to address the the OP here How to compare/3 without surprises on non-ground terms? and ran into the cyclic term comparison issue as an extension.

I tried to work with term_factorized but abandoned it when compare_with_stack seemed to be the better option and I haven’t seen a reason to revisit it.

Did you stress test compare_with_stack/3?
It seems to pass the transitivity violation test:

?- between(1,4000000,_), random_cyclic(A), ground(A),
random_cyclic(B), ground(B), random_cyclic(C), ground(C),
less_with_stack(A, B), less_with_stack(B, C), \+ less_with_stack(A, C).

Why do you find failures in comparable/3? Maybe nevertheless
a steadfastness problem and the way you formulated the test?
Or some other problem? I used this bootstrapping:

less_with_stack(X, Y) :- compare_with_stack(C, X, Y), C = (<).

Edit 01.04.2023
The reason to revise it is a) its speed, and b) if one would
like to also compare non-ground terms. But so far term_factorized/3
didn’t beat yet compare_with_stack/3:

I have now and it seems to be as bad if not worse (using semilex.pl.log file and and original random_cyclic):

?- between(1,1000000,_),random_cyclic(T1),random_cyclic(T2),random_cyclic(T3),
T1 = _S2, % where
    _S1 = s(s(_S1, _S1), 0),
    _S2 = s(s(_S1, _S1), _S2),
T2 = s(s(_S1, _S1), 0),
T3 = s(s(T3, 0), 1),
Ct = (>),
C = (<) ;
T1 = s(_S1, _S2), % where
    _S1 = s(s(_S1, 0), _S1),
    _S2 = s(s(s(_S1, _S2), _S2), _S3),
    _S3 = s(0, _S3),
T2 = s(_S1, 0),
T3 = s(T3, 1),
Ct = (>),
C = (<) ;
T1 = s(_S1, 1), % where
    _S1 = s(s(_S1, 1), _S1),
T2 = s(T2, s(0, 1)),
T3 = s(s(T2, s(0, 1)), 0),
Ct = (>),
C = (<) ;
T1 = s(s(T1, 1), T1),
T2 = _S1, % where
    _S1 = s(_S1, _S2),
    _S2 = s(1, _S2),
T3 = s(s(_S1, _S2), T3),
Ct = (<),
C = (>) ;
T1 = s(_S1, 0), % where
    _S1 = s(s(_S1, 0), _S1),
T2 = s(s(s(_S1, 0), _S1), 1),
T3 = s(T3, 1),
Ct = (<),
C = (>) ;
T1 = s(_S1, 1), % where
    _S1 = s(_S1, s(_S1, s(1, _S1))),
T2 = s(_S1, s(_S1, s(1, _S1))),
T3 = s(_S2, 1), % where
    _S2 = s(s(_S2, 1), _S3),
    _S3 = s(_S2, _S3),
Ct = (<),
C = (>) ;
T1 = s(T1, 1),
T2 = _S1, % where
    _S1 = s(s(_S1, _S2), 0),
    _S2 = s(s(1, _S2), 0),
T3 = s(s(_S1, _S2), 1),
Ct = (<),
C = (>) ;
T1 = _S2, % where
    _S1 = s(s(_S1, 0), _S1),
    _S2 = s(_S1, s(_S2, 0)),
T2 = s(_S1, 0),
T3 = s(T3, 1),
Ct = (>),
C = (<) ;
T1 = s(s(T1, 0), 1),
T2 = s(s(T2, T2), 0),
T3 = s(s(T2, T2), s(T3, T3)),
Ct = (<),
C = (>) ;
T1 = s(s(_S1, 1), 1), % where
    _S1 = s(s(_S1, 1), _S1),
T2 = s(s(_S1, 1), _S1),
T3 = s(s(T3, T3), 1),
Ct = (<),
C = (>) ;

Definitely a problem, but I’m having difficulty “reverse-engineering” this test output (with %where’s) back to Prolog to re-construct the fail cases.

Thanks, don’t know why that didn’t occur to me.

I’ve lost track; has an implementation that doesn’t violate transitivity been found yet?

The representation based comparison does this feat. Only I originally
based it on term_factorized/3, which gave me a stack overflow, so I
couldn’t complete my testing. But when I use my custom factorize/3,

even the version of the predicate that doesn’t minimize, I easily get:

?- aggregate_all(count, (between(1,1000000,_), random_cyclic(A),
random_cyclic(B), random_cyclic(C), rep_less(A,B),
    rep_less(B,C), \+ rep_less(A,C)), F).
F = 0.

The source code is here:

rep2.p.log (2,2 KB)

I can find nothing wrong with this as a solution to cyclic term compare. Now what (for SWIP)? Possibly some combination of:

  1. Fix builtin compare/3 to implement the same semantics. The problem is nobody is currently able and willing to take this on.
  2. Subject to possible renaming, add factorize/3 and rep_compare/3 SWI-Prolog’s library(terms). This looks like minimal effort to provide a workaround with the necessary semantics pending a compare/3 fix.
  3. Document the problem (like Sicstus).
  4. Do nothing and reference this thread for anybody else with the same question.
  5. ??

Is minimizing a requirement for comparison of unequal cyclic terms? (Builtin == appears to be correct for identical terms.)

And suppose you only factored cyclic_terms and took additional steps to make them quasi-unique, e.g.,

rep(X, '$\a'(Y,Z)) :- cyclic_term(X), !,
	factorize(X, Y, Z),
	numberassoc(Z, 0).
rep(X, X).

Looking for practical solutions here.

At this point I would settle for correct before cheap; leave any tradeoffs up to the user.

And at least in SWIP, cyclic_term doesn’t look very expensive:

?- time((between(1,1000000,_), random_cyclic(T),fail;true)).
% 14,564,700 inferences, 2.039 CPU in 2.062 seconds (99% CPU, 7143694 Lips)

?- time((between(1,1000000,_), random_cyclic(T),cyclic_term(T),fail;true)).
% 15,565,099 inferences, 2.126 CPU in 2.141 seconds (99% CPU, 7321559 Lips)

I am lost an appropriate thread to post this. But, as it seems relevant, please allow me posting to this thread.

I borrowed the date T1, T2, T3, in the query below from J.Burse somewhere recently. Thanks.

T1, T2, T3 in the query are perhaps incomparable with each other.
The query shows compare_with_stack/5 with history forces linear ordering T1 > T2, T2< T3, T1 < T3
(T2 < T1 < T3) due to the history of comparing.

BTW, my compare_with_stack/3 was changed silently(?) to return <, >, =, and ‘incomparable’.
When compare_with_stack/3 returns ‘incomparable’ compare_with_stack/5 forces order in a conservative way and memorize the order in the history for later use (like tabling).
Note that it keeps consistent orderering, which is my idea for such incomparable pairs like
Matts Carllson’s pair.

I hope this query is relevant to this thread.

% ?- T1 = _S2,
%	 _S1 = s(s(_S1, _S1), 0),
%    _S2 = s(s(_S1, _S1), _S2),
%	T2 = s(s(_S1, _S1), 0),
%	T3 = s(s(T3, 0), 1),
%	compare_with_stack(C1,T1,T2, [], A),
%	compare_with_stack(C2,T2,T3, A, B),
%	compare_with_stack(C3,T1,T3, B, C),
%	b_setval(compare_history, C),
%	predsort(semi_lex_compare, [T1, T2, T3], R),
%	R = [T2, T1, T3].

%@ T1 = _S2, _S2 = _S2, % where
%@     _S1 = s(s(_S1, _S1), 0),
%@     _S2 = s(s(_S1, _S1), _S2),
%@ _S1 = T2, T2 = s(s(_S1, _S1), 0),
%@ T3 = s(s(T3, 0), 1),
%@ C1 = (>),
%@ A = [],
%@ C2 = C3, C3 = (<),
%@ B = [s/2-[s(s(_S1, _S1), 0), s(s(T3, 0), 1)]],
%@ C = [s/2-[s(s(_S1, _S1), _S2), s(s(_S1, _S1), 0), s(s(T3, 0), 1)]],
%@ R = [s(s(_S1, _S1), 0), s(s(_S1, _S1), _S2), s(s(T3, 0), 1)].

As far as comparable pairs are concerned it works as the standard compare/3,
For incomparable pairs like Matts Carllson’s one the order is not defined.