Cyclic terms unification `X=f(f(X), X), Y=f(Y, f(Y)), X = Y. `

icl_compare/3 below compares rational terms. So far, fuzzy test fails to find a counter example for transitivity. So I am posting the codes as I posted to do so.

The codes relies on ==/2 heavily, and I borrowed idea from @janW’s post using length of the initial cycle on the path of subterm series, which I saw vaguely a couple of years ago. It is appreciated if the code is tested by others. In fact, I’m in doubt.

% ?- icl_compare(C, a, b).
% ?- icl_compare(C, b, a).
% ?- X=(Y,0), Y=f(X,1), icl_compare(C, X, Y).

icl_compare(C, I, J):- icl_order(C, [0*0, I-J], []).

% icl: initial cycle length based ordering
icl_order(=, [], []):-!.
icl_order(C, [], [[_,_|Ps]|A]):-!, icl_order(C, Ps, A).
icl_order(C, [_, I-J|Ps], A):-
	(	I == J -> icl_order(C, Ps, A)
	;	\+ (compound(I), compound(J)) -> compare(C, I, J)
	;	compare_arity(D, I, J),
		(	D = (=) ->
			initial_cycle_length(I-J, A, L*R, 1),
			(	nonvar(L), nonvar(R) ->
				compare(E, L, R),
				(	E = (=) ->
					A = [[_,_|Qs]|A0],
					icl_order(C, Qs, A0)
				;	C = E
				)
			;	I =.. [_|Is],
				J =.. [_|Js],
				zipper(Is, Js, Zip),
				(	var(L) -> L = 0
				;	true
				),
				(	var(R) -> R = 0
				;	true
				),
				icl_order(C, Zip, [[L*R, I-J|Ps]|A])
			)
		;	C = D
		)
	).

% ?- initial_cycle_length(a-b, [[0*0, a-b]], L*R, 1).
% ?- initial_cycle_length(a-b, [[0*0, a-c], [0*0, d-b]], L*R, 1).
initial_cycle_length(_, [], _, _):-!.
initial_cycle_length(_, _, L*R, _):- nonvar(L), nonvar(R), !.
initial_cycle_length(X-Y, Cs, L*R, K):-
	Cs=[[L0*R0, A-B|_]|Cs0],
	(	nonvar(L) -> true
	;	L0>0 -> L = L0
	;	X == A -> L = K
	;	true
	),
	(	nonvar(R) -> true
	;	R0>0 -> R = R0
	;	Y == B -> R = K
	;	true
	),
	K0 is K + 1,
	initial_cycle_length(X-Y, Cs0, L*R, K0).
%
compare_arity(C, A, B):- functor(A, F, J),
		functor(B, G, K),
		compare(C, J/F, K/G).

% ?- zipper([a,b],[1,2], Z).
zipper([], [], []).
zipper([A|As], [B|Bs], [A-B|Cs]):-
	zipper(As, Bs, Cs).

(Nothing of importance)

Fixed. Thanks. Fortunately, it’s careless bug as usual for me.

icl_order(=, [], []):-!.
icl_order(C, [], [[_,_|Ps]|A]):-!, icl_order(C, Ps, A).
icl_order(C, [LPair, I-J|Ps], A):-       %% was bug.
	(	I == J -> icl_order(C, Ps, A)
	;	\+ (compound(I), compound(J)) -> compare(C, I, J)
	;	compare_arity(D, I, J),
		(	D = (=) ->
			initial_cycle_length(I-J, A, L*R, 1),
			(	nonvar(L), nonvar(R) ->
				compare(E, L, R),
				(	E = (=) ->
					A = [[_,_|Qs]|A0],
					icl_order(C, Qs, A0)
				;	C = E
				)
			;	I =.. [_|Is],
				J =.. [_|Js],
				zipper(Is, Js, Zip),
				(	var(L) -> L = 0
				;	true
				),
				(	var(R) -> R = 0
				;	true
				),
				icl_order(C, [L*R|Zip], [[LPair, I-J|Ps]|A])  % was bug.
			)
		;	C = D
		)
	).

(Nothing of importance)

Did you mean this ?

?- icl_compare(C, X, 1).
%@ C = (<).

(Nothing of importance)

(Nothing of importance)

Is this what is said that ==/2 is not compatible

icl_compare/3 is not related to the “problem”, I hope.

% ?-  X = s(X, 1),
% Y = s(Y, _),
%	icl_compare(C, X, Y),
%	\+ X == Y.
%@ X = s(X, 1),
%@ Y = s(Y, _),
%@ C = (>).

Do you have more corrections that you didn’t show us?
I was using what you posted , plus the correction you posted:

% ?- icl_compare(C, a, b).
% ?- icl_compare(C, b, a).
% ?- X=(Y,0), Y=f(X,1), icl_compare(C, X, Y).

icl_compare(C, I, J):- icl_order(C, [0*0, I-J], []).

% icl: initial cycle length based ordering
icl_order(=, [], []):-!.
icl_order(C, [], [[_,_|Ps]|A]):-!, icl_order(C, Ps, A).
icl_order(C, [LPair, I-J|Ps], A):-       %% was bug.
	(	I == J -> icl_order(C, Ps, A)
	;	\+ (compound(I), compound(J)) -> compare(C, I, J)
	;	compare_arity(D, I, J),
		(	D = (=) ->
			initial_cycle_length(I-J, A, L*R, 1),
			(	nonvar(L), nonvar(R) ->
				compare(E, L, R),
				(	E = (=) ->
					A = [[_,_|Qs]|A0],
					icl_order(C, Qs, A0)
				;	C = E
				)
			;	I =.. [_|Is],
				J =.. [_|Js],
				zipper(Is, Js, Zip),
				(	var(L) -> L = 0
				;	true
				),
				(	var(R) -> R = 0
				;	true
				),
				icl_order(C, [L*R|Zip], [[LPair, I-J|Ps]|A])  % was bug.
			)
		;	C = D
		)
	).

% ?- initial_cycle_length(a-b, [[0*0, a-b]], L*R, 1).
% ?- initial_cycle_length(a-b, [[0*0, a-c], [0*0, d-b]], L*R, 1).
initial_cycle_length(_, [], _, _):-!.
initial_cycle_length(_, _, L*R, _):- nonvar(L), nonvar(R), !.
initial_cycle_length(X-Y, Cs, L*R, K):-
	Cs=[[L0*R0, A-B|_]|Cs0],
	(	nonvar(L) -> true
	;	L0>0 -> L = L0
	;	X == A -> L = K
	;	true
	),
	(	nonvar(R) -> true
	;	R0>0 -> R = R0
	;	Y == B -> R = K
	;	true
	),
	K0 is K + 1,
	initial_cycle_length(X-Y, Cs0, L*R, K0).
%
compare_arity(C, A, B):- functor(A, F, J),
		functor(B, G, K),
		compare(C, J/F, K/G).

% ?- zipper([a,b],[1,2], Z).
zipper([], [], []).
zipper([A|As], [B|Bs], [A-B|Cs]):-
	zipper(As, Bs, Cs).

No. The two lines in the icl_compare/3 indicated by comment " % was bug" are all.
(macOS 15.6 iMac intel 2020, m4 MacBook pro, latest swipl via GitHub source).

I wrote my ==/2 (named cong/2) using same_term/2 and a list for book keeping bisimulation. Thus replacing ==/2 with my cong/2, I saw nothing different with icl_compare/3 you have.

Surely it is strange that when the \+ _ ==_ is used in fuzzy test, something strange happens.

You may be right. I reviewed more on the codes, and have found one of clauses still to be reviewed in deep.

Roughly, icl_order is left-most derivation of a context free grammar, very roughly. But left most non terminal has extra information on cycle lengths. It depends on context whether dropping or keeping the extra information. I am explaining in analogy. But needs time to fix it in clear mind.

Following the above naive analogy, I rewrote icl_compare so that my confusion should be minimized. There were surely confusion on L*R information (pair of lengths of initial cycles of Left and Right paths from the top). It should be of the parent node which is shared with siblings. This update fixes all pointed out so far. Thanks.

icl_compare(C, I, J):- icl_order(C, [[0*0, I-J]]).
%
icl_order(=, []):-!.
icl_order(C, [[_] | Anc]):-!, icl_order(C, Anc).
icl_order(C, [[LPair,I-J|Ps] | Anc]):-!,
	(	I == J -> icl_order(C, [[LPair|Ps] | Anc])
	;	\+ (compound(I), compound(J)) -> compare(C, I, J)
	;	compare_arity(D, I, J),
		(	D = (=) ->
			initial_cycle_length(I-J, Anc, L*R, 1),
			(	nonvar(L), nonvar(R) ->
				compare(E, L, R),
				(	E = (=) ->
					icl_order(C, [[LPair|Ps] | Anc])
				;	C = E
				)
			;	I =.. [_|Is],
				J =.. [_|Js],
				zipper(Is, Js, Zip),
				(	var(L) -> L = 0
				;	true
				),
				(	var(R) -> R = 0
				;	true
				),
				icl_order(C, [[L*R|Zip], [LPair, I-J|Ps] | Anc ])
			)
		;	C = D
		)
	).

Sorry to hear that. Nevertheless I must thank you for testing old icl_compare/3. BTW, I posted already the full codes of new icl_compare/3. The new version fixes all which were pointed as bugs.

I do not intend to waste your valuable time any more, but I would like to continue to develop this approach by myself, which is inspired by idea to use length of first cycle of branch as a character for comparing two “incomparable” branches. This idea was posted in this forum a couple of years ago, I believe. I don’t know exact reason why it discontinued. In this direction, we will finally get an order which is conservative for the standard order on finite terms, total, transitive. If we would achieve such goal, it would never successful without the fuzzy test tool.

Thanks again. Bye.

% ?- X=f(Y,0), Y=f(X,1), icl_compare(C, X, Y), \+ X==Y.
%@ X = f(f(X, 1), 0),
%@ Y = f(X, 1),
%@ C = (>).

% ?-  call_with_time_limit(100, (  repeat,fuzzy(X), fuzzy(Y), icl_compare(C, X, Y), C = (=),
%	\+ X == Y)).
%@ ERROR: Unhandled exception: Time limit exceeded

% ?-  X = s(_, 1), Y = s(_, 1), icl_compare(C, X, Y).
%@ X = s(_, 1),
%@ Y = s(_, 1),
%@ C = (<).

% ?-  X = s(A, 1), Y = s(B, 1), icl_compare(C, X, Y).
%@ X = s(A, 1),
%@ Y = s(B, 1),
%@ C = (<).

% ?-   call_with_time_limit(100, (repeat,fuzzy(X), ground(X),  fuzzy(Y), ground(Y), icl_compare(C, X, Y), C = (=),
%	\+ X == Y)).
%@ ERROR: Unhandled exception: Time limit exceeded

(Nothing of importance)

EDIT 2025/8/24
I found difficuly in icl_compare/3 on transitivity. The following
might explain the difficulty. Suppose four terms 1, 2, 3, 3’

1 A = f(a, b, c)
2 B = f(a, d, e)
3 C = f(a, d, f)
3’ C’= f(a, d’,f)

by 1,2 we have A < B
by 2,3 B < C
by 1,3 A < C

However, if d and d’ are “incomparable” like Mats Carlsson’s pair, we can not have A<C’ from A<B and B<C’.

If isomorphic terms share unique index, then transitivity works exactly like that of finite terms. Details will be posted elsewhere in the near future. Fortunately, using ==/2, it is quite easy to give such unique index. In fact, I already posted using minimum coalgebra. I will post revised version with minimum coalgebra generated by ==/2, not same_term/2.

Original Post
icl-compare.pl (1.9 KB)

I am posting a draft of icl_compare/3.

icl(-C, +X, +Y) is det.

Given rational terms X, Y,
C is unified with =, <, or > as follows:

  • = if X == Y (isomorphic)
  • < if Y is greater than X in lexicographic order
  • > otherwise

icl_compare/3 is meant as an extension of the standard compare/3 on finite terms. It tries to provide a total linear ordering on rational trees, i.e. reflexive, antisymmetric, and transitive. I believe it can be used as the compare predicate in predsort/3.

Roughly speaking, given a pair of rational terms X, Y,
icl_compare(C, X, Y) behaves as if it builds a bisimulation starting from X and Y, and walks on both trees in parallel in a depth-first, left-to-right manner. Since X and Y are assumed rational, each path has finite depth, so the overall walk is finite. This seems to guarantee termination for any X, Y. I think these two properties form the basis for correctness, in particular for transitivity, though I may be missing something.

For guaranteeing that each branch depth decreases, the ordering on branches turned out to be critical. I tried to note some details on that part.

The uploaded version includes a small change, which looks tiny but I believe is essential for keeping depths finite.

It still feels a bit strange to me that no counterexamples for transitivity appeared in the older version, even though I ran a fuzzer with a time limit of 2,000 seconds. In earlier buggy versions, counterexamples used to show up within 10 seconds. Although I still have this unanswered question in mind, I decided to post the current version as “final” for now.

It relies heavily on ==/2 and otherwise works just like the standard compare/3, without any preprocessing step.

Note on ordering on sub-term sequences.

\def\arg{\mathrm{arg}} \def\prefix{\mathrm{prefix}} \def\cycle{\mathrm{cycle}}

If A is a compound term of Prolog, \arg(i, A) denotes i-th argument of A. For example if A = f(a,g(b),c) then \arg(1, A)=a, \arg(2,A)= g(b), \arg(3,A)=c. B is called a subterm of A if B=\arg(i, A) for some i.

A sequence \gamma:= A_1, A_2, \ldots, A_n (n>1)
is called a subterm sequence if A_{i+1} is a subterm of A_i (1\leq i<n). n is called the length of the sequence, and written
n=|\gamma|=|A_1, A_2, \ldots, A_n|.

Let \Gamma be the set of subterm sequences
\gamma := A_1, A_2, \ldots, A_n such that A_i\neq A_j (i\not= j) for 1\leq i, j<n and A_k=A_n for some 1\leq k<n.

The subsequence A_i,A_{i+1}\ldots,A_n is called the cycle of the sequence, and written \cycle(\gamma). i is called the start of the cycle of \gamma, and n-i+1 is called the length of the cycle of the sequence. The subsequence A_1, A_2, \ldots, A_{i-1} (i>1) is called the prefix of \gamma, and is written \prefix(\gamma). When i=1, by convention, \prefix(\gamma)=\varepsilon (empty sequence), and the length of \varepsilon is defined to be 0.

For example, suppose \gamma is an sequence \gamma:=A, B, C, D, C
in \Gamma, where A, B, C, D are distinct. Then |\gamma|=5, the start of the cycle of \gamma is 3, |\cycle(\gamma)|=2.

For \gamma, \delta \in \Gamma, we define \gamma < \delta to mean that the following conditions holds:

  • |\cycle(\gamma)| < |\cycle(\delta)|, or
  • |\cycle(\gamma)| = |\cycle(\delta)| and the start of \cycle(\gamma) is less than that of \cycle(\delta).

We define \gamma\sim\delta to mean that neither \gamma<\delta nor \delta<\gamma holds.

It should be clear that for \gamma, \delta\in\Gamma, exactly one of the following holds.

  • \gamma<\delta.
  • \delta<\gamma.
  • \gamma\sim\delta.

Also clearly it follows from \gamma\sim\delta that |\gamma|=|\delta|, |\cycle(\gamma)|=|\cycle(\delta)|, and |\prefix(\gamma)|=|\prefix(\delta)|.

Thanks for reporting. Fixed. I saw transitivity test passed for 255152 terms within 200 seconds thanks to fuzzy. I am ashamed to confess that trace, spy, and inserting snapshot (gtrace, … not at all) are still all I can use for debugging Prolog codes.

Please wait. I am going to read the SWI-Prolog manual on testing codes. It takes one week for me, who is slow to read manual.

I wrote codes (mcoa_vec/2) to compute the minimum coalgebra for given a rational term using ==/2. Output is of a vector form:

Using mcoa_vec/2, I defined a comparing predicate mcomp/3

mcomp(C, X, Y):- 
       mcoa_vec(X, U), 
       mcoa_vec(Y, V), 
       compare(C, U, V).

Surprisingly, no counter example against transitivity is found so far, though not tested extensively. But I am interested in
this experiment, which seems suggest that minimum coalgebra of a (rational) term represents some lexical ordering of rational terms. This is unexpected happening. I take time for a while to see what happened.

% ?- mcoa_vec(f(g(a), b), V). 
%@ V = m(f(2, 4), g(3), a, b). 
% ?- X=f(Y,0), Y=f(X,1), mcoa_vec(X, U), mcoa_vec(Y, V). 
%@ X = f(f(X, 1), 0), 
%@ Y = f(X, 1), 
%@ U = m(f(2, 4), f(1, 3), 1, 0), 
%@ V = m(f(2, 4), f(1, 3), 0, 1). 

(Nothing of importance)

No counter example against transitivity found for acyclic terms, though not extensively tested yet. I used your fuzzer with acyclic_term inserted for restriction to Herebrand term with time limit 600 seconds
EDIT
You may be right. I forgot to insert @< test. I check later soon.