O(1) set containment test?

I’m running into an efficiency issue. In order to keep a search tractable, I’d like to exclude previously visited solutions:

search_without_revisit(StateID, StatesVisited, StatesVisited_) :-
    set_not_contains(StateID,StatesVisited),
     ... more constraints ...,
    set_add(Path, StateID, StatesVisited_).

How should Path be represented so that the cost to check whether StateID is in Path is O(1). I don’t see a set datastructure in Prolog (which is surprising since so much of logic depends on reasoning about sets).

How big are your sets?

library(rbtrees) and library(assoc) are order O(log N), which often is good enough. arg/3, setarg/3, nb_setarg/3 can be used on terms, and they’re O(1). I’m not sure of the limit on term arity; it should be at least 1024.

1 Like

No artificial limit in SWI-Prolog. I have a faint memory of reading this in the manual somewhere, but now I cannot find it any longer.

But lookup in a flat term is not O(1) as far as I know? Using arg/3, you can get the value at an index, but you cannot look up for a value in constant time. Adding elements to a term is probably also not constant time.

In addition to the suggestions by @peter.ludemann, there is also library(ordsets).

There are also hashtables. (Edit: now that I think about it, this is the only obvious data structure with constant-time lookup. Or am I wrong?)

I guess you would be looking for a set that is relatively efficient at both insertion and lookup?

For a set, you can take any of the key-value representations in Prolog and just set the value to the same atom, for example true.

1 Like

There is no arity limit on terms, but there is one on predicates (indeed 1024). That makes terms fixed-size arrays, which doesn’t seem to solve the OP’s problem unless the states are represented as an integer with a known upper limit that is small enough to make the array not ridiculously large.

O(1) for a dynamic set representation basically does not work using only logical variables. Best you get is O(log(N)) using balancing trees (e.g., library(assoc) and library(rbtrees)). Properly rescaled hash tables using a suitable hash function indeed realize (in practice) O(1), but there is (AFAIK) no realistic cheap implementation possible using Prolog. Its hash table library uses destructive assignment using setarg/3. Destructive data structures are fine, but one should be pretty careful using them as unintended copying can lead to strange results. For the original problem they are probably fine, but I recall some measurements (by @peter.ludemann?) that trees out perform hash tables up to pretty large set sizes.

1 Like