Order and sorting

Disclaimer: I seem to have a preoccupation with ordering and sorting; I have managed to not let it influence my daily life too badly. I lived in Germany once but I left the country.


This topic pops up periodically. This time it is the library(ordsets) and multisets discussion. This is not an exclusive list.

Back from 2015, when discussing the then new library(solution_sequences), on the old forum:

On 02/13/2015 02:54 PM, Boris Vassilev wrote:

On Fri, Feb 13, 2015 at 3:45 PM, Jan Wielemaker <J.Wielemaker@vu.nl
mailto:[J.Wielemaker@vu.nl](mailto:J.Wielemaker@vu.nl)> wrote:

As defined, predsort/3 removes duplicates, so stable has no meaning …

You can trick it into doing a stable sort. Just unify Delta with <
when the keys are equivalent, instead of =. This relies on predsort/3
keeping the original order of pairs of elements when comparing them. The
current implementation of predsort/3 does that, but of course this is
not a requirement, so I know it should not be done.

Didn’t know that :slight_smile:

Cheers — Jan

I provided the trivial solution in this post:

equivalence_compare(Order, X, Y) :-
    compare(Order0, X, Y), 
    (   Order0 == '='
    ->  Order = '<'
    ;   Order0 = Order
    ).

At that time I had accepted that keeping the original order of equivalent elements is not in the specification of predsort/3. The following still holds true:

  • The predsort/3 implementation uses a merge sort.
  • A merge sort can be made stable in respect to the order of the input list without any additional logic (just don’t swap the order of the inputs unnecessarily).
  • In Prolog (and for linked lists in general), merge sort is a better algorithm than for example quicksort (which is not stable).

Both sets and multisets have a well-defined representation as a sorted list; the difference is whether it is the result of sort/2 (set) or msort/2 (multiset).

Right now I suspect that the widely accepted, “textbook” algorithms that work on Prolog “sets” are by definition correct also for “multisets”. It will take me some days or weeks to convince myself that this is true.

I am posting this in the hope that someone on the forum already knows that as a fact and can provide a proof, for example in the form of a reference to an external source; or correct my misunderstanding.

I am definitely not suggesting to change the spec or implementation of any library predicate.

1 Like

There are some predicates in it that deduplicate but keep
the input order. So they don’t use sort/2. For example distinct/1
predicate, which is more akin to JavaScript Set():

?- findall(X, distinct(member(X, [e,c,c,b,c,d])), L).
L = [e, c, b, d].

Natural sorting, using the natural order compare/3 and
either sort/2 or msort/2, has the advantage that the representation
has equality via (==)/2 and and you can also use (=)/2,

for both set and multiset semantics. With JavaScript Set()
since SameValue is only object identity, if you have atomic
elements, one can call !isDisjointFrom() to check equality:

Welcome to Node.js v25.8.1.
Type ".help" for more information.
> x = new Set([3,1,2])
Set(3) { 3, 1, 2 }
> y = new Set([1,3,2])
Set(3) { 1, 3, 2 }
> Object.is(x, y)
false
> !x.isDisjointFrom(y)
true

Unfortunately ISO core standard has only added sort/2, and
lacks msort/2. One way is to bootstrap it via predsort/3 as
you showed. But predsort/3 is neither in the ISO core standard.

An other way to bootstrap it, is to use keysort/2:

my_msort(X, T) :-
   my_dummy_values(X, Y),
   keysort(Y, Z),
   my_dummy_values(T, Z).

my_dummy_values([], []) :- !.
my_dummy_values([X|L], [X-dummy|R]) :- 
   my_dummy_values(L, R).

Seems to work:

?- my_msort([e,c,c,b,c,d], X).
X = [b, c, c, c, d, e].

keysort/2 can be implemented with stable sorting and
a key comparator under the hood, but you can also implement
it with sorted maps that have key values pairs where

the values are list, paying extra effort to pack and unpack.

I went on wikipedia to remind myself of the meaning of a “stable sort” and I found this gem:

Stability: stable sorting algorithms maintain the relative order of records with equal keys (i.e., values).

Keys, i.e. values. Oh, wikipedia.

To be fair, the sorting algorithm sorts the keys for key-sort of key-value pairs but sorts values for a list of values. :slight_smile: (The latter is something of a tautology)

This is a matter of terminology. I will now write a quote that I can’t be bothered to cite but it should give you an idea:

A total ordering is a weak ordering, and a key function f on a set T together with a total ordering r on the codomain of f, define a weak ordering wr(x, y) \Leftrightarrow r(f(x), f(y)).

“Codomain” is apparently the domain of the function return value.

The way I understand this is that if a list of Things can be sorted (somehow), this also means that you can define a function that takes any one Thing and returns a key that has a well-defined order.

(Edit: “sorted somehow” is of course hand-waiving. What it really means is that there is a relation function, wr above, which is defined for any pair of elements. Relation here means something like “smaller than”.)

One consequence of the above is that if you can define a comparison predicate and give it to predsort/3, you can achieve the exact same result by applying a function to each of your elements (values), sort on the key with keysort/2, and retrieve the sorted elements.

Python does that:

functools.cmp_to_key(*func*)

But you might overestimate Prologs keysort/2, you cannot supply
objects to keysort/2 in standard Prolog. It only sorts with keys which
are Prolog terms with their natural order.

On the other hand Python can synthesize little objects, and set or
overwrite __equal__ and __less__ callbacks, or whatever name they
have, from the supplied comparator func, creating collation keys.

So predsort/3 in standard Prolog, unlike Python allows, cannot be
automatically bootstrapped from keysort/2. I have nowhere seen
such a bootstrapping for standard Prolog, and I think it is not

possible to realize. But msort/2 can be bootstrapped from keysort/2,
as shown with my_msort/2, since it still uses natural order. And you
can nevertheless switch from predsort/3 to keysort/2, on a case by

case basis, if you know a collator key for your comparator.

I am not sure how this matters, because I am not sure what you mean when you say object. Do you mean a thing which has a lifetime and a state?

Doesn’t cmp_to_key() do the exact opposite? It is not cmp_FROM_key… You provide a key function (the f above) and you use that key to define the relation (the wr above) so that you can pass it to a predsort/3-like sorted()? This is trivial to implement it seems.

When you say “bootstrapped” I assume you mean “predsort defined in terms of keysort”? But keysort/2 is just a special case of sort with a predefined key function. So, you can define keysort in terms of predsort:

my_keysort(Pairs, Sorted) :-
    predsort(key_cmp, Pairs, Sorted).
key_cmp(Order, A-_, B-_) :-
    compare(Order0, A, B),
    (   Order0 == '='
    ->  Order = '<'
    ;   Order = Order0
    ).

Seems to work but I’d be glad to be proven wrong:

104 ?- my_keysort([z-bar, a-foo, a-bar], Sorted).
Sorted = [a-foo, a-bar, z-bar].

[EDIT] I agree that generating the key function from the comparison function/predicate is non-trivial. I have to think about it for a moment.

The phrase boostrap predsort/2 from keysort/2,
means predsort in terms of keysort and not vice versa,
as you assumed, keysort in terms of predsort.

This is unlike Python, where functools.cmp_to_key returns
a little key function. Which creates on the fly objects with the
following state and behaviour:

  • State:
    The particular key function from functools.cmp_to_key
    creates an object that is just a wrapper to the given value.

  • Behaviour:
    The particular key function from functools.cmp_to_key
    defines an object that compares to other objects, by
    extracing the wrapped value and apply func between them.

Disadvantage of the Python approach: Its awfully slow,
since you have to apply this key function. So if you sort
1 million objects (1’000’000 objects) with a custom comparator,

it will first create 1’000’000 wrappers, or even create more
obejcts on the fly, who knows. So its a very bad design.
Now back to Prolog, trying to bootstrap predsort/2 from

keysort/2, making use of pairs_keys_values/3 and pairs_values/2.

predsort(C, L, R) :-
   cmp_to_key(C, F),
   maplist(F, L, K),
   pairs_keys_values(H, K, L),
   keysort(H, J),
   pairs_values(J, R).

This is impossible in my opinion for standard Prolog, because you
have no way to realize a generic key function F in Prolog from
a given comparator C.

OK, I now understand what you mean. But take a look at this. To me it seems like they are cheating :smiley: this is not the same as defining predsort/3 in terms of keysort/2, right?

[EDIT] I see your edits. Yes, this is not the same. I will still have to think for a moment for bootstrapping predsort/3 from keysort/2.

Its just behaviour by inner functions. There are no inner functions
in standard Prolog, not to speak of objects, there are no objects
in standard Prolog. That it defines an object, is seen by the use

of conventional self, so all inner functions are methods. The
interesting thing is that Python has inner classes (sic!) . This is
used to have a reference to the given comparator mycmp.

So ultimately when it returns the inner class K, it returns the
constructor for instances of this inner class. This creates a big
bottelneck in Python, making sorting slow, since instances are

created. Very slow in the end. But elegant since only sort()
and cmp_to_key() is realized. Not sort() and predsort().

Yes, I think I understand how Python works (but then again who knows :smiley: )

I am not sure why would there be “no objects in standard Prolog”. It really depends on your definition of object. Just because it doesn’t use the same techniques as Java or Python doesn’t mean you cannot achieve the same outcomes.

Nothing stops you from having state and behavior embedded within compound terms. It is quite useless but surely not difficult?

Well some objects that would be undestood by
sort/2 or keysort/2. ISO core standard defines sort/2 or
keysort/2 based on compare/3, respectively the

term-preceeds relationship. It doesn’t define hey
take the things you get, and call:

        def __lt__(self, other):
        def __gt__(self, other):
        def __eq__(self, other):
        def __le__(self, other):
        def __ge__(self, other):

I didn’t say you cannot model objects in standard Prolog.
But sort/2 and keysort/2 are black boxes, you cannot
change the behaviour, they just use the natural order.

Yes, but this is most definitely not what the definition of cmp_to_key() does. It does not generate a key function, it calls the comparison function when the keys are compared, unless I am completely misreading this???

Yes, fully agree. There is no misunderstanding here on my side at least.

Please don’t use bold, my cat thinks there is a mouse on
the screen. It is defined as such by Python, I didn’t invent
the notion of key functions.

You first go here:

functools.cmp_to_key(func )
Transform an old-style comparison function to a key function.
https://docs.python.org/3/library/functools.html

And then follow the link to here:

key function
A key function or collation function is a callable that
returns a value used for sorting or ordering.
https://docs.python.org/3/glossary.html#term-key-function

I use bold for headings only, or important items. But
not for exclamations, I am not deaf. That it is a callable,
matches nicely maplist/3 here:

The language used by Python also makes me think
whether Python is not simply an object oriented Prolog?

Well well I thought we live in a free country. I use bold differently from you but I can surely accommodate you. If I want emphasis, is italics OK with you?

Yes, I already did those things. I am starting to think we are getting stuck in semantics here.

Well OK is bad, its captilized! Thats definitively
screaming in usual nettiquette.

my oh my. ok then, i drop the capitals too. i used to think that it’s ok to type ok with two big letters. in fact, when i type it with small letters the over-eager spell-checker on my browser underlines it red but this doesn’t bother me too much.

we drifted veeery far off-topic (i hope letter repeats are acceptable) and i am afraid we might get a warning. i at least really appreciate the discussion so far.

just to get this straight: it seems to me that cmp_to_key() takes the comparison function and applies it whenever the sort() needs to compare two keys. this seems different from taking a comparison function and somehow generating a key function from it. it feels as if this happens but “under the hood” it is just applying the comparison function. unless of course i am misreading the code.

I was also taking a deep breadth when I read key function
is a callable today. The struggle is real, the Python thingy
semantics is such that it makes like a dozen of indirections.

While a predsort/3 is more easier to understand. You can
just capitalize, not all caps, but the first letter, and you
understand whats going on. Like sort/2 implementation,

for example quicksort:

sort(L, R) :-
     ...
     compare(C, X, Y)
     ...

And then predsort/3 implementation:

predsort(Compare, L, R) :-
     ...
     call(Compare, C, X, Y)
     ...

I think this abstraction by “function pointers” has a brighter
future, than the object-orientented-ness of Python. But
“function pointers” are often frowned upon as well.

1 Like

BTW, the Python wrapper and Python sort do something like this:

pysort(L, R) :-
     ...
     X = 'K'(F,A), Y = 'K'(_,B), call(F, C, A, B),
     ...

Was modelling the class K instances as compounds 'K'/2.

With keys that have compareTo() callbacks there is always
an asymmetry, since one object is active and decides what
comparison to apply, providing both state and behaviour,

while the other object is rather passive only provides state.