How to check for unique "primary keys"

I wanted to verify that some predicates have their “primary keys” (in the SQL sense) set up properly … this is the result of 15 minutes of work (with some more time, I could make it a bit easier to use, but I’m lazy). It turns out that subtract/3 doesn’t do quite what I want (it assumes unordered sets but I wanted unordered multisets), so I rolled my own.

%! no_dups(?Tmpl, ^Goal) is semidet.
% Looks for duplicate Tmpl's when backtracking over Goal.
% Fails if Goal has no results.
% Throws an error if duplicate Tmpls.
% e.g.: verify that X,Y results are all unique:
%       no_dups(a(X,Y), Z^pred(X,Y)).

:- meta_predicate no_dups(?, ^).
no_dups(Tmpl, Goal) :-
    bagof(Tmpl, Goal, Results), % fails if Goal fails
    msort(Results, ResultsSorted),
    sort(Results, ResultsSortedNoDups),
    (   ResultsSorted == ResultsSortedNoDups
    ->  true
    ;   list_diff(ResultsSorted, ResultsSortedNoDups, ResultsDiff),
        throw(error(dups(Goal:ResultsDiff), _))
    ).
no_dups(Goal) :-
    throw(error(dups(Goal), _)).

%! list_diff(+List, +Delete, -Result) is det.
% A bit different from library(lists):subtract/3
list_diff(List, [], List) :- !.
list_diff(List, [D|Ds], Result) :-
    (   select(D, List, List2)
    ->  list_diff(List2, Ds, Result)
    ;   list_diff(List, Ds, Result)
    ).
2 Likes

Just for the fun

mk(N) :-
    forall(between(1, N, I),
           assertz(p(I))).

pk :-
    no_dups(X, p(X)).

:- meta_predicate no_dups(?, ^).

no_dups(Tmpl, Goal) :-
    findall(Tmpl, Goal, Results), % fails if Goal fails
    msort(Results, ResultsSorted),
    sort(Results, ResultsSortedNoDups),
    (   ResultsSorted == ResultsSortedNoDups
    ->  true
    ;   list_diff(ResultsSorted, ResultsSortedNoDups, ResultsDiff),
        throw(error(dups(Goal:ResultsDiff), _))
    ).

:- meta_predicate no_dups2(?, ^).

no_dups2(Tmpl, Goal) :-
    findall(Tmpl, Goal, Results), % fails if Goal fails
    msort(Results, ResultsSorted),
    \+ append(_, [X,X|_], ResultsSorted).

pk2 :-
    no_dups2(X, p(X)).

pk3 :-
    clause(p(X),true,R),
    clause(p(X),true,S), R \== S.

Now

101 ?- time(mk(1 000 000)).
% 2,000,004 inferences, 0.305 CPU in 0.305 seconds (100% CPU, 6551035 Lips)
true.

102 ?- time(pk).
% 1,000,016 inferences, 0.200 CPU in 0.200 seconds (100% CPU, 5004813 Lips)
true.

103 ?- time(pk2).
% 2,000,117 inferences, 0.296 CPU in 0.296 seconds (100% CPU, 6764841 Lips)
true.

104 ?- time(pk3).
% 2,000,002 inferences, 1.004 CPU in 1.004 seconds (100% CPU, 1991680 Lips)
true.

105 ?- time(pk3).
% 2,000,001 inferences, 0.804 CPU in 0.805 seconds (100% CPU, 2486325 Lips)
true.

The repeated pk3 gives a difference because clause handles are blobs (~ atoms), so the first run creates a lot of them. The second reuses some of these that are not yet garbage collected.

I was a little bit surprised that pk2 did worse than @peter.ludemann’s first solution. SWI-Prolog’s built-in sort is pretty quick thanks to the work of Richard O’Keefe.

edit using nextto/3 instead of append/3 is a little faster, making pk2 only 10% slower than pk.

This doesn’t detect the following duplicated fact:

pred(a,b).
pred(a,b).

but it is detect using my code: no_dups(p(X,Y), pred(X,Y)).

Hmmm … changing bagof/3 to findall/3 means that there’s no need for ^s in _Goal) (The meta_predicate declaration should also be changed from ^ to 0 and the “fails if Goal fails” comment removed).
Nice usability improvement!

My apologies – a typo in my code. :frowning:
(I blame pre-coffee-itis.)