EDIT 2025/8/24
I found difficuly in icl_compare/3
on transitivity. The following
might explain the difficulty. Suppose four terms 1, 2, 3, 3’
1 A = f(a, b, c)
2 B = f(a, d, e)
3 C = f(a, d, f)
3’ C’= f(a, d’,f)
by 1,2 we have A < B
by 2,3 B < C
by 1,3 A < C
However, if d and d’ are “incomparable” like Mats Carlsson’s pair, we can not have A<C’ from A<B and B<C’.
If isomorphic terms share unique index, then transitivity works exactly like that of finite terms. Details will be posted elsewhere in the near future. Fortunately, using ==/2, it is quite easy to give such unique index. In fact, I already posted using minimum coalgebra. I will post revised version with minimum coalgebra generated by ==/2
, not same_term/2
.
Original Post
icl-compare.pl (1.9 KB)
I am posting a draft of icl_compare/3
.
icl(-C, +X, +Y) is det.
Given rational terms X, Y
,
C
is unified with =
, <
, or >
as follows:
=
if X == Y
(isomorphic)
<
if Y
is greater than X
in lexicographic order
>
otherwise
icl_compare/3
is meant as an extension of the standard compare/3
on finite terms. It tries to provide a total linear ordering on rational trees, i.e. reflexive, antisymmetric, and transitive. I believe it can be used as the compare
predicate in predsort/3
.
Roughly speaking, given a pair of rational terms X, Y
,
icl_compare(C, X, Y)
behaves as if it builds a bisimulation starting from X
and Y
, and walks on both trees in parallel in a depth-first, left-to-right manner. Since X
and Y
are assumed rational, each path has finite depth, so the overall walk is finite. This seems to guarantee termination for any X, Y
. I think these two properties form the basis for correctness, in particular for transitivity, though I may be missing something.
For guaranteeing that each branch depth decreases, the ordering on branches turned out to be critical. I tried to note some details on that part.
The uploaded version includes a small change, which looks tiny but I believe is essential for keeping depths finite.
It still feels a bit strange to me that no counterexamples for transitivity appeared in the older version, even though I ran a fuzzer with a time limit of 2,000 seconds. In earlier buggy versions, counterexamples used to show up within 10 seconds. Although I still have this unanswered question in mind, I decided to post the current version as “final” for now.
It relies heavily on ==/2
and otherwise works just like the standard compare/3
, without any preprocessing step.
Note on ordering on sub-term sequences.
\def\arg{\mathrm{arg}}
\def\prefix{\mathrm{prefix}}
\def\cycle{\mathrm{cycle}}
If A is a compound term of Prolog, \arg(i, A) denotes i-th argument of A. For example if A = f(a,g(b),c) then \arg(1, A)=a, \arg(2,A)= g(b), \arg(3,A)=c. B is called a subterm of A if B=\arg(i, A) for some i.
A sequence \gamma:= A_1, A_2, \ldots, A_n (n>1)
is called a subterm sequence if A_{i+1} is a subterm of A_i (1\leq i<n). n is called the length of the sequence, and written
n=|\gamma|=|A_1, A_2, \ldots, A_n|.
Let \Gamma be the set of subterm sequences
\gamma := A_1, A_2, \ldots, A_n such that A_i\neq A_j (i\not= j) for 1\leq i, j<n and A_k=A_n for some 1\leq k<n.
The subsequence A_i,A_{i+1}\ldots,A_n is called the cycle of the sequence, and written \cycle(\gamma). i is called the start of the cycle of \gamma, and n-i+1 is called the length of the cycle of the sequence. The subsequence A_1, A_2, \ldots, A_{i-1} (i>1) is called the prefix of \gamma, and is written \prefix(\gamma). When i=1, by convention, \prefix(\gamma)=\varepsilon (empty sequence), and the length of \varepsilon is defined to be 0.
For example, suppose \gamma is an sequence \gamma:=A, B, C, D, C
in \Gamma, where A, B, C, D are distinct. Then |\gamma|=5, the start of the cycle of \gamma is 3, |\cycle(\gamma)|=2.
For \gamma, \delta \in \Gamma, we define \gamma < \delta to mean that the following conditions holds:
- |\cycle(\gamma)| < |\cycle(\delta)|, or
- |\cycle(\gamma)| = |\cycle(\delta)| and the start of \cycle(\gamma) is less than that of \cycle(\delta).
We define \gamma\sim\delta to mean that neither \gamma<\delta nor \delta<\gamma holds.
It should be clear that for \gamma, \delta\in\Gamma, exactly one of the following holds.
- \gamma<\delta.
- \delta<\gamma.
- \gamma\sim\delta.
Also clearly it follows from \gamma\sim\delta that |\gamma|=|\delta|, |\cycle(\gamma)|=|\cycle(\delta)|, and |\prefix(\gamma)|=|\prefix(\delta)|.