On sorting non-ground terms and the standard order of terms

Merry Christmas everyone,

When a list is sorted by sort/[2,4] and friends and according to the standard
order of terms, non-ground terms that are alphabetic variants of each other
are sorted according to the age of their variables.

For instance, if I sort the list Ls = [p(A,B), p(C,D), p(E,F)] the result will
be Ls itself, because although the three terms in Ls are identical up to
renaming of variables, their variables have different ages. If the list
contained three instances of the term p(A,B) instead, it would be sorted to
discard the duplicates:

?- _Ls = [p(A,B), p(C,D), p(E,F)], sort(_Ls, Ss).
Ss = [p(A, B), p(C, D), p(E, F)].

?- _Ls = [p(A,B), p(A,B), p(A,B)], sort(_Ls, Ss).
Ss = [p(A, B)].

I don’t mean to say that my intuition is correct, but I have always found that
very counterintuitive. In my mind, if I sort a list with non-ground terms and
it’s true that, for some pair of terms {T1, T2}, T1 = T2 succeeds (or even
unifiable(T1,T2,U) with U \= []), then I expect to see only T1 or T2 in the
sorted list, or perhaps a new term T3 that is the result of unification of T1
and T2.

I understand why my expectation is wrong. The thing is, sometimes I really
need to sort a list to get rid of alphabetic variants of non-ground terms. In
that case, I resort to what feels like hackery, of the form, e.g.:

copy_term(Ts, Ts_),
numbervars(Ts_, 0, _),
sort(Ts_, Ss_),
varnumbers(Ss_, Ss)

But, like I say, that sounds hacky and I’m also concerned about the overhead
of skolemising everything.

Note that I expect any terms that unify to be treated as identical. So for
instance, I am also surprised when, say, p(A,B) and p(a,B) are sorted as
different terms.

This has bit me multiple times in the last few weeks (months, even) and I
wanted to hear what people think.

To clarify: I’m not raising a complaint of any sort. It’s just something that
I still find very confusing and like I say I keep bumping my head on it all
the time.

Thanks for your help!

I think you could “roll your own” to get what you are after. With the following comparison predicate:

unify_compare(Delta, A, B) :-
    (   unifiable(A, B, _)
    ->  A = B, Delta = (=)
    ;   compare(Delta, A, B)
    ).

… and using predsort/3:

?- predsort(unify_compare, [p(A,B), p(C,D), p(E,F)], Sorted).
A = C, C = E,
B = D, D = F,
Sorted = [p(E, F)].

?- predsort(unify_compare, [p(A,B), p(a, B)], Sorted).
A = a,
Sorted = [p(a, B)].

But, as you see, now you have unified the variables. You could also not unify inside your comparison predicate:

unifiable_compare(Delta, A, B) :-
    (   unifiable(A, B, _)
    ->  Delta = (=)
    ;   compare(Delta, A, B)
    ).

And now:

?- predsort(unifiable_compare, [p(A,B), p(C,D), p(E,F)], Sorted).
Sorted = [p(A, B)].

?- predsort(unifiable_compare, [p(A,B), p(a, B)], Sorted).
Sorted = [p(A, B)].

But now I am really not sure what this is supposed to mean :slight_smile: Which term would you like to see in the “sorted” list? p(a, B) or p(A, B)? The current definition of predsort/3 takes the first element if they are equivalent, so changing the order of the original list will change the result:

?- predsort(unifiable_compare, [p(a, B), p(A, B)], Sorted).
Sorted = [p(a, B)].

What would your intuition do with:

?- _Ls = [p(A,B), p(C,D), p(E,F)], sort(_Ls, Ss), A=1, B=2, C=3, D=4, E=5, F=6.
_Ls = Ss, Ss = [p(1, 2), p(3, 4), p(5, 6)].


?- _Ls = [p(A,B), p(C,D), p(E,F)], sort(_Ls, Ss), A=1, B=2, C=A, D=B, E=A, F=B.
_Ls = Ss, Ss = [p(1, 2), p(1, 2), p(1, 2)].


?- _Ls = [p(A,B), p(C,D), p(E,F)], sort(_Ls, Ss), A=6, B=5, C=4, D=3, E=2, F=1.
_Ls = Ss, Ss = [p(6, 5), p(4, 3), p(2, 1)].

Thanks, I hadn’t noticed predsort/3 before. I usually go directly to sort/4 to remind myself of the strange symbols.

But now I am really not sure what this is supposed to mean :slight_smile: Which term would you like to see in the “sorted” list?

I’m guessing, if alphabetic variants are considered identical you can just take the one that is found first in the list. I think, as long as it’s clearly explained what the predicate does it’s OK if its semantics are a bit hairy. I’d just have to be careful when documenting the code.

Edit: My latest confusion was in the context of trying to use library(ordsets) to take “set” intersections and differences (e.g. with ord_intersection/3 etc.). For that purpose, it doesn’t really matter which term comes first, because each “set” should already be sorted and without duplicates. Unfortunately, when two terms that are alphabetic variants are in different sets, things become more complicated. The library(lists) set operations (intersection/3 etc.) perform unification, but they don’t assume ordering, so they’re expensive. I’ll probably have to bodge something together by copying bits off library(ordsets) and replacing compare/3 with something like your unifiable_compare/3 predicate to get what I want.
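The bodged version might look something like this; a sketch only, since the merge logic around unbound variables would need careful testing (a variable precedes all other terms in the standard order but can unify with any of them):

```prolog
% Ordered-set intersection that treats unifiable elements as equal.
% Assumes both lists are sorted in the standard order of terms and
% duplicate-free, as library(ordsets) does.
unifiable_intersection([], _, []).
unifiable_intersection([_|_], [], []).
unifiable_intersection([X|Xs], [Y|Ys], Zs) :-
    (   unifiable(X, Y, _)
    ->  Zs = [X|Zs_],
        unifiable_intersection(Xs, Ys, Zs_)
    ;   compare(<, X, Y)
    ->  unifiable_intersection(Xs, [Y|Ys], Zs)
    ;   unifiable_intersection([X|Xs], Ys, Zs)
    ).
```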

I don’t know. Sorry, it’s a bit late and my brain is not working very well. What do your examples mean?

My point is: if you sort a list with variables, and two of the list items can unify, then it’s not clear what the result should be. A subsequent unification could leave the list sorted, or it could make the list unsorted (or make it contain duplicates). So, for sort/2 to always be meaningful, the list elements need to be “sufficiently” ground (they can contain variables, but the list elements should be pairwise non-unifiable).

I just want to add a bit more confusion. ISO/IEC 13211-1 says:

7.2.1 Variable

If X and Y are variables which are not identical then X term_precedes Y shall be implementation dependent except that during the creation of a sorted list (7.1.6.5, 8.10.3.1 j) the ordering shall remain constant.

Corrigendum 2 from 2012, which adds sort/2 to the standard, gives this interesting example:

    Succeeds, unifying L with [U,V,f(U),f(V)] or [V,U,f(V),f(U)].
    [The solution is implementation dependent.]

Obviously, most implementations like SWI-Prolog will be kind enough not to change the order every other call, but I think it’s a good idea not to depend on any particular order in your code.

If I understand correctly (now that I had a good night’s sleep) you mean that sorting in the standard order of terms is not well-defined for lists that include alphabetic variants.

If so, I agree and I think this is one thing that confuses me, because I’m not sure what the result should be of sorting a list with alphabetic variants. I generally want to treat unifiable terms as identical, but that may mean I have to roll my own comparison predicate in some situations.

@michaelby Yes, that could be very confusing.

In other words, don’t do that. Only sort lists of sufficiently ground terms; or re-sort them after the terms have become sufficiently ground.

(I am curious why you would want to “treat unifiable terms as identical”, and how your code might be changed so that sorting doesn’t surprise you.)


Weeell, if you’re curious then it’s your fault :slight_smile:

The specific use case that’s been troubling me in the past few days is an
evaluation module for an Inductive Logic Programming algorithm. The algorithm
learns Prolog programs from examples and background knowledge and the
evaluation module measures the accuracy of the learned program with respect to
a partition of the examples unseen during training (the “testing” partition).

To evaluate a learned program the evaluation module generates its success set.
There are two options for this that are selected by a configuration option in
the application: one is to generate the program’s Least Herbrand Model by
bottom-up evaluation with a TP operator; the other is to resolve the learned
program against the background knowledge nondeterministically and collect all
the resulting atoms.

In both cases the program is evaluated (as in executed) in the context of the
background knowledge, because the purpose of learning is to find a program
that entails the positive examples and no negative examples given the
background knowledge or, in typical notation, B ∪ H |= E⁺ and for all e⁻ in
E⁻, B ∪ H \|= e⁻, where E⁺ the positive, E⁻ negative, examples, B the
background knowledge and H the learned program -the “hypothesis”- and “|=” and
“\|=” are my best attempts at ASCII for “entails” and “does not entail”.

To measure the accuracy of H, the evaluation module performs set operations on
the success set of H, SS(H), and the positive and negative examples. So, for
example, to measure the rate of false positives (a component of accuracy) the
evaluation module takes the set intersection of SS(H) and E⁻. So a “false
positive” is an atom of the predicate of H, that is in the intersection of
the set of negative examples and the success set of H. And so on for false
negatives, true positives and true negatives.

For reference, the following is the current bit of code that measures false
positives. The ground/1 check is to avoid the issues I’ve been discussing.
Rs is the success-set, Neg is the negative examples and NP are the false


The problems begin when the background knowledge a) is non-ground, b) cannot
be ground, or c) is not datalog. By “datalog” I mean Horn clauses with no
function symbols of arity more than 0 and with no literals negated by negation
as failure. If (a), (b), or (c), the TP operator can’t be used or will return
nonsensical results, so that option is out.

The second option, using ordinary SLD resolution to generate SS(H) does not
care about grounding, but that doesn’t mean it will always generate ground
terms. One recent experiment that’s been giving me trouble is learning a DCG
from a set of example strings (taken from a language used in a popular card
game). The background knowledge is also a DCG. Note that examples are always
ground. In fact, examples must be ground for the algorithm to work as
intended.

Here is an example string, generated by calling phrase/2 on a “true” grammar
(hey, it’s an experiment):
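It is of this form (borrowing one of the terms that appears further down in this thread):

```prolog
ability([destroy, target, 'Serpent'], []).
```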


Here’s a clause in the learned grammar covering that example:


And here’s the relevant atom of the success set of ability/2:
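It is of this form (one of the terms that appears further down in this thread; note the unbound difference-list tail, which makes the atom non-ground):

```prolog
ability([destroy, target, 'Serpent'|A], A).
```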


This should explain the ground/1 check in my false_negatives/3, above.
ord_intersection/3 will consider the example string and the success set atom
above to be distinct and it will not add them to the intersection of SS(H) and
examples. intersection/3 will unify them and add them to the intersection.
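To make the difference concrete with singleton “sets” (a sketch of the behaviour, using the terms above):

```prolog
?- ord_intersection([ability([destroy,target,'Serpent'|A],A)],
                    [ability([destroy,target,'Serpent'],[])], Is).
Is = [].

?- intersection([ability([destroy,target,'Serpent'|A],A)],
                [ability([destroy,target,'Serpent'],[])], Is).
A = [],
Is = [ability([destroy, target, 'Serpent'], [])].
```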

However, intersection/3 expects no duplicates, and generating SS(H) cannot be
guaranteed not to generate duplicates (e.g. two DCG rules may entail the
same examples), so SS(H) must be sorted to remove duplicates. Below is the bit
in my code that currently does the generation and grounding:

	,S = (table(user:F/A)
	     % ... further setup goals elided ...
	     )
	,G = (findall(H
	             % ... the goal proving H against the BK, elided ...
	             ,Rs_)
	     ,sort(Rs_, Rs_S)
	     ,varnumbers(Rs_S, Rs)
	     )
	,C = (erase_program_clauses(Refs_Ps)
	     % ... further cleanup goals elided ...
	     )

Above, F/A is the predicate of H, Ps is the learned program, _BK is
the background knowledge, used when the “tp” option is selected, and Rs is
the resulting success set.

(Note the code actually uses SLG resolution to avoid going infinite when the
hypothesis is left-recursive.)

While it is possible to twist things into loops and ground the background
knowledge and the success set of the learned grammar, in this case, I
personally don’t see how this can be done in a general enough manner that will
work for different datasets where non-ground terms crop up for different
reasons. My preferred method to ground terms is to Skolemise them with
numbervars/1 but, for example, in the case of the DCG learning experiment that
would cause false_negatives/3 to fail to return an empty intersection of
SS(H) and examples ('$VAR'(1) in ground H atoms would not unify with
[] in example strings).

My code is on github, you can have a look if you’re still curious:

The evaluation module is in lib/evaluation/evaluation.pl. The dataset is in
data/examples/mtg_fragment.pl. Have fun and please report bugs :slight_smile:


While I find what you are doing interesting, about half of it is new to me, e.g. Least Herbrand Model and TP operator. When I read about your problem, the idea that keeps popping into my head is that you should give types to your variables, and then sort. OK, so how would you go about giving types to variables that are supposed to be typeless? Use the Hindley–Milner type system. AFAIK HM doesn’t have to give concrete types; it just has to infer that different arguments have different types even if those types are not concrete, and that the types are consistently used. In functional languages the types can even be functions themselves, so with your system, why can the types not be predicates? Don’t know myself, just thinking out loud.

I have not found the Hindley-Milner type system implemented in SWI-Prolog but did find Package “type_check” but as I have noted before have received no response from an e-mail to the author.

Don’t know if the idea of using HM will help you, but if I didn’t mention it you might not have considered it. :smiley:

If you need to see the code for package type_check then

  1. Go to package page: type_check package
  2. Locate Prolog source code, e.g. type_check.pl, and click on file name.
  3. On the page that appears mostly empty, click on the image (link)

Hi Eric,

I wouldn’t dismiss using types out of hand- it’s an interesting suggestion! Thanks.

Apologies for the use of unfamiliar terms. A Least Herbrand Model is (very crudely) what you get when you call a predicate nondeterministically at the Prolog top-level with all variables unbound. The TP operator (or “immediate consequences operator”) is used for bottom-up evaluation in Datalog.
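As a toy illustration, here is a naive sketch of the TP operator for a ground program, with my own representation of clauses as Head-Body pairs (nothing standard):

```prolog
:- use_module(library(ordsets)).

% One application of the TP (immediate consequences) operator to a
% ground program P, represented as a list of Head-Body pairs where
% Body is a list of atoms: derive every head whose body atoms are
% all in the current interpretation I (an ordered set of atoms).
tp_step(P, I, I1) :-
    findall(H
           ,(member(H-B, P)
            ,forall(member(A, B), memberchk(A, I))
            )
           ,Hs),
    sort(Hs, Hs_),
    ord_union(I, Hs_, I1).

% Iterate TP to its least fixpoint which, for a ground,
% function-free program, is its Least Herbrand Model:
lfp(P, I, M) :-
    tp_step(P, I, I1),
    (   I1 == I
    ->  M = I
    ;   lfp(P, I1, M)
    ).
```

For instance, lfp([p-[], q-[p]], [], M) gives M = [p, q].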

A lot of it is quite new to me too, but I had to learn it all quickly when my supervisor made an off-hand comment like, “why don’t you use a TP operator”? :slight_smile:

Some of it is explained in this handy reference:

There’s plenty more in Lloyd’s book on Logic Programming (Foundations of Logic Programming, J. W. Lloyd 2nd ed). (WorldCat)



So in reading more of what you have here, in the references and GitHub, another idea has popped into my head. I don’t even know if this one relates to what you are doing so this is more of a question than a straight up suggestion:

Knuth–Bendix completion algorithm

Does this relate to what you are doing? Not expecting an answer but would be nice to know if my thoughts are in the ball park given that half of this is still a blur.

Also notice the partial similarity of rules



Syntactic unification


Hi Eric,

Ah, I see you’re already down the rabbit hole having fun :slight_smile:

I confess to not being very familiar with the Knuth-Bendix algorithm itself.
Although if I understand correctly you ended up reading about it after finding
it in one of the references on my project’s github, the algorithm as such is
not used in the papers I cited. Rather, a Knuth & Bendix style lexicographic
ordering of the set of predicate symbols in a domain (the “predicate
signature”) is used in Metagol (the algorithm described in the papers I cite)
to avoid attempts to prove left-recursive clauses during learning. Learning in
Metagol is a proof, performed by a Prolog meta-interpreter, so basically
trying to prove left-recursive clauses would cause the meta-interpreter to
enter an infinite recursion. A second kind of total ordering, this time over
the set of constants in the domain, what we call an interval inclusion
ordering (taken from a 2005 paper, by Zhang and Manna, cited in the same
reference that cites Knuth & Bendix) is used to allow the proof of
left-recursive clauses while guaranteeing the termination of the proof by
precluding an infinitely descending evaluation of left-recursive (and
mutually recursive) calls.

Incidentally, one reason I’m very enthusiastic about SWI-Prolog’s
implementation of SLG resolution is that it means my own project (a new MIL
algorithm and system) doesn’t have to bother with total orderings of the
predicate and constant signature to guarantee termination when evaluating
left-recursive clauses. I consider this a good thing because it alleviates the
burden on the user to provide a total ordering, plus ordering constraints,
which I always felt is a lot to ask on top of providing background knowledge
for every learning problem.

But I digress. If I understand correctly (wikipedia is my source here) the
Knuth-Bendix algorithm can be used to determine the equality of terms in
arbitrary algebras (although the problem is apparently undecidable in the
general case). This does indeed sound like a more general case of the problem
I’ve been having and the similarities between the steps of the algorithm and
Robinson’s unification algorithm that you point out are interesting. However,
I think that for my particular use case, sorting Prolog lists and performing
“set operations” on Prolog lists, unification should be enough to decide
whether two Prolog “terms” are equal.

So my problem is what to do with a list that represents the elements of a set
and that may include terms that unify (so they are “alphabetic variants”). My
instinct is to treat those as “equal” for the purposes of sorting, but then
there’s a second decision that has to be made, in how to choose one of those
“equal” terms for inclusion in the “set” resulting from the sorting of the
list (as Boris and Peter pointed out). My instinct here, too, is to recognise
that a set (or a “bag”) of alphabetic variants can be ordered according to
generality under subsumption, so the trick is to decide whether the most
general or the most specific term should be kept in the sorted “set”. For my
specific use case, of the evaluation module I described above, I think the
most general term should be kept. This is so that we can then define set
operations, between Prolog “sets” that may include alphabetic variants
across two sets, that also take into account unification. In the end we’d be
comparing “sets” with the most general elements possible and considering them
equal if they unify, which is what we want to happen when evaluating possibly
non-ground results of a program against ground examples.

Like I say, all this is only in the context of my evaluation module and it
wouldn’t make sense to extend it more broadly to sorting and set operations
for general cases. But, perhaps it’s possible to have list sorting and set
operations with an option to ignore the standard order of terms and treat
alphabetic variants as equal or not under unification.
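To sketch what I mean, here are two passes with predsort/3, relying on the fact that it keeps the first of two elements it compares equal. This is only a sketch: subsumption is a partial order, so the first comparison is not strictly total and the result would need checking.

```prolog
% Keep, of each group of pairwise-unifiable terms, the most general
% one: first bring more general terms to the front, then deduplicate
% with a comparison that calls unifiable terms equal.
most_general_set(Ts, Ss) :-
    predsort(generality_compare, Ts, Ts_),
    predsort(unifiable_compare, Ts_, Ss).

% Order terms by generality under subsumption, falling back to the
% standard order of terms when neither strictly subsumes the other.
generality_compare(Delta, A, B) :-
    (   subsumes_term(A, B), \+ subsumes_term(B, A)
    ->  Delta = (<)
    ;   subsumes_term(B, A), \+ subsumes_term(A, B)
    ->  Delta = (>)
    ;   compare(Delta, A, B)
    ).

% Treat unifiable terms as equal, without binding anything.
unifiable_compare(Delta, A, B) :-
    (   unifiable(A, B, _)
    ->  Delta = (=)
    ;   compare(Delta, A, B)
    ).
```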

Note also that I think that the standard order of terms is a very pragmatic
concept that should help avoid such confusion in most cases. I’m just very
prone to being confused :slight_smile:

Not expecting an answer but would be nice to know if my thoughts are in the
ball park given that half of this is still a blur.

Aw. Why are you not expecting an answer? Did I sound so mean in my earlier
posts?


I went down the rabbit hole way too long ago (decades), but it is definitely fun there.

No, actually found it many years ago when trying to solve another problem. It is one of those things that once you learn about, you never forget.

While I am familiar with proof systems (Coq), I have yet to actually work with them to a level I would like, but the concepts and burdens you noted are what keep me from adding them to my tool box because of the steep learning curve.

Every time that part of the problem enters my mind, the axiom of choice comes to mind.

Confusion is also my companion who, together with serendipity, introduces me to ideas. So having one without the other makes life dull.

I didn’t know how great the cognitive distance was between Unification and Knuth-Bendix for your work, and if it was too great, an answer would have required more time than necessary.

It is like when I see some questions on StackOverflow and the OP needs a full class in programming just to explain Hello, World!; just not something I can devote the time to answer.

Not at all. It is actually nice to converse with others who know about areas I partially know, so that I can learn more.

While what you are working on is past my ability to give precise help, one place that actively helps with these types of questions, and where I would ask if this were my problem, is https://cs.stackexchange.com/. This is not the well known StackOverflow site, but the sister site where many of the more Computer Science questions are posed, as opposed to programming questions, and typically answered by university professors and PhD students.

I don’t know if you read further in the Wikipedia article on Unification, but they have a section Order-sorted unification which looks a lot like your problem and links to Many-sorted logic.

However the reason I am noting this is that IIRC one of the proof systems has the concept of a sort built-in, but I can not find which one. I never really understood why they called the concept sort even though they explained it, but in reading about your problem it makes more sense.

Perhaps the examples in Many-sorted logic could serve as test cases.

@EricGT Minimal models and Tp are defined on pp. 104 and 105 of The Art of Prolog (2nd ed.) if you’re looking for an open access reference.

@stassa.p If you’re still looking for a solution, maybe we could play an ILP game? Give us the predicate from the Louise code you’re not happy with (is it false_positives/3?) along with a negative example of behavior you don’t want and a positive example of what you do want, and we can see if we can “induce” the right definition. There are a lot of tricks one can do with double negation when it comes to computing with unground terms, but it’s not yet clear (to me, at least) what you are looking for here.
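One such double-negation trick, as a sketch (not necessarily what is meant here): \+ \+ Goal succeeds exactly when Goal has a solution, but undoes any bindings Goal made, so one can test unifiability without instantiating anything:

```prolog
% Succeeds if X unifies with some element of Ys; the double negation
% undoes the bindings made by memberchk/2, so X and Ys are untouched.
member_dn(X, Ys) :- \+ \+ memberchk(X, Ys).

% Intersection up to unification that binds no variables in either
% input list: keeps each element of the first list that could unify
% with some element of the second.
intersection_dn([], _, []).
intersection_dn([X|Xs], Ys, Zs) :-
    (   member_dn(X, Ys)
    ->  Zs = [X|Zs_]
    ;   Zs = Zs_
    ),
    intersection_dn(Xs, Ys, Zs_).
```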


Hi Michael,

That sounds like a fun game indeed, however keep in mind that partial
specifications by example can be very inaccurate. Most likely, I’ll end up
giving some examples that I think are correct, you’ll come up with some edge
cases I hadn’t considered, then I’ll realise there’s a better example, and so
on.

Also, the predicates I’m unhappy about are actually four: true_positives/3,
false_positives/3, true_negatives/3 and false_negatives/3. true_positives/3 and
false_positives/3 are defined in terms of ord_intersection/3 and intersection/3
whereas true_negatives/3 and false_negatives/3 are defined in terms of
ord_subtract/3 and subtract/3. They are similar enough otherwise, so I’ll just
show the definition of true_positives/3, which is as follows:
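Schematically, it is of this form (a sketch of the shape, not the code verbatim):

```prolog
% True positives: atoms in the success set Rs that are also positive
% examples in Pos. The ground/1 check picks the cheap ordered
% intersection only when the success set is ground.
true_positives(Rs, Pos, TPs) :-
    (   ground(Rs)
    ->  ord_intersection(Rs, Pos, TPs)
    ;   intersection(Rs, Pos, TPs)
    ).
```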


Note that this actually works fairly well unless you remove the ground/1 check.
Now, below is a positive example, followed by a negative example:

% I give the arguments separately for ease of reading:
_Rs = [ ability([destroy,target,'Serpent'|_A],_A), ability([exile,all,'Spellshapers'|_B],_B), ability([exile,target,'Leviathan'|_C],_C), ability([return,an,artifact,to,'its\'','owner\'s',hand|_D],_D)]
_Pos = [ ability([destroy,target,'Serpent'],[]), ability([exile,all,'Spellshapers'],[]), ability([exile,target,'Leviathan'],[]), ability([return,an,artifact,to,'its\'','owner\'s',hand],[])]

% Positive example
E_plus = true_positives(_Rs,_Pos,_Rs).
% Negative example
E_minus = true_positives(_Rs,_Pos,[]).

Like I say, this currently works fine, for example:

?- _Rs = [ ability([destroy,target,'Serpent'|_A],_A), ability([exile,all,'Spellshapers'|_B],_B), ability([exile,target,'Leviathan'|_C],_C), ability([return,an,artifact,to,'its\'','owner\'s',hand|_D],_D)].

?- _Pos = [ ability([destroy,target,'Serpent'],[]), ability([exile,all,'Spellshapers'],[]), ability([exile,target,'Leviathan'],[]), ability([return,an,artifact,to,'its\'','owner\'s',hand],[])].

?- true_positives($_Rs,$_Pos,$_Rs).

?- true_positives($_Rs,$_Pos,[]).

However, I’d like to avoid the ground/1 check. I’d be happy to hear your suggestions for this.

Have a Happy New Year!

Re many-sorted logic and order-sorted unification in particular. That’s interesting. I should probably check it out.

One question is how such formalisms translate into Prolog in practice. But let me have a look to see if I can understand the concepts first. Thanks for the pointers! :slight_smile:

Have a Happy New Year,