Order and sorting

From the other thread:

Not necessarily. Multisets should be able to contain equivalent elements that do not compare equal. Equivalence is of course meaningless for compare/3; but we have a precedent in keysort/2, where two pairs are equivalent if the keys are equal. Same with sort/4, so we can do:

?- keysort([x-y,a-foo,a-bar], R).
R = [a-foo, a-bar, x-y].

?- sort(1, @=<, [x-y,a-foo,a-bar], R).
R = [a-foo, a-bar, x-y].

?- sort([x-y,a-foo,a-bar], R).
R = [a-bar, a-foo, x-y]. % this flipped foo with bar!

This cannot be done if equivalent subsets are represented as a term and a count.

But it also cannot be done easily if the set predicates are implemented on top of compare/3.

For this, the user should be able to provide a comparison predicate; there is a precedent for this, too, in predsort/3.

Currently predsort/3 needs a first argument Pred:

Pred(-Delta, +E1, +E2) . This call must unify Delta with one of <, > or =.

Side-note: some existing algorithms on sorted sequences only require a semi-det relation predicate p/2 (usually, “less than”). The comparison can be bootstrapped from it:

compare_p(Order, X, Y) :-
    (   p(X, Y)
    ->  Order = (<)
    ;   p(Y, X)
    ->  Order = (>)
    ;   Order = (=)
    ).

Either way, I tried to address my own suspicion from the original post:

Right now I suspect that the widely accepted, “textbook” algorithms that work on Prolog “sets” are by definition correct also for “multisets”.

This was not a useful way to formulate it. Instead:

An algorithm that is correct for multisets represented as ordered lists with repeats will also be correct for sets represented as lists without repeats.

I won’t prove that, it follows from the fact that sets in that representation are a special case of multisets. (ie, repeats are at most 1 long…)

Here it got interesting! @stassa.p already observed that the current implementation “just works”. I went ahead and did the following:

  • I copied from here some of the algorithms that work on sorted or partitioned ranges, and translated them to Prolog.
  • For the algorithms that were implemented in library(ordsets) already:
    • I compared the existing implementation;
    • I also tested the existing implementation along with my own (mechanic) translation.

I am attaching two files:
msorted.pl (6.6 KB)
test_msorted.pl (6.3 KB)

msorted.pl contains the algorithms I copied, and test_msorted.pl contains tests for some of the predicates and the corresponding library(ordsets) predicates. You can do for example

?- use_module(test_msorted).
true.

?- run_tests.

Some results:

  • The library(ordsets) implementation of union, intersection, difference (subtract), and symmetric difference (symdiff) is very close to the mechanical translation I did from procedural code to Prolog.
  • The implementation in library(ordsets) seems to be correct for multisets, and so are my mechanical translations in msorted.pl. Some of the implementations in library(ordsets) however do not follow the strict specifications for the msorted_* predicates (see the file msorted.pl for details).

All of this is in practice useless :smiley: however, I can now allow the user to provide the comparison predicates to the algorithms in msorted.pl. For example, with this:

key_compare(Order, A-_, B-_) :- compare(Order, A, B).

… the msorted_* predicates will be correct for multisets, as produced by keysort/2.

Any thoughts? Sorry for just pasting the files here, I am not sure about the names of things so I would like to see if there is any interest or comments before proceeding.