Effective implementation of a set

Currently I’m aware of 2 set implementations in SWI:

  1. library(ordset), which is quite inefficient for my current use case (adding and removing items in a tight loop with a maximum size of a million terms)
  2. library(nb_sets), which seems not to have an option to remove an element from the set.

So are there any implementations of sets available that I have not yet found that are fully functional (in the sense that I can do all common set operations on it, adding, removing elements as well as creating unions, intersections and differences) while still beeing effective?

I do not care if they are “mutating” a value nb_sets does or if they do it similar to ordsets with a relationship.

library(lists): List Manipulation

has

list_to_set/2
is_set/1
intersection/3
union/3
subset/2
subtract/3

You have to decide if it meets your criteria.

As those listbased sets do not even need to be ordered, its probably even less efficient than library(ordset). Most of the set manipulation predicates even link to ordset equivalents.

There are many alternatives. What is best depends mostly on what you need to do (frequently) with the sets and what is in the sets. Two options you misses are library(assoc) (or library(rbtrees)), both providing balanced tree based sets. The advantage is that these perform O(log(N)) insert/delete both in terms of time and space. They do not provide the usual intersection, union, etc. operations though, but they can be mapped efficiently to/from ordsets and are thus interesting if you need to add/delete lots of elements.

If your sets are (small) integers you can use integers as bitmaps and express the set operations as simple bit operations. Although bit operations, adding an element is still O(N) as it creates a copy of the bitmap.

Some of the issues are limitations of (pure) Prolog which states that the only thing you can change to a term is binding variables. Using the non-backtrackable primitives you can create data structures with destructive assignment, but this can hardly be called Prolog programming anymore.

Could you a bit clearer on what you are after?

My “use case” currently is just solving Advent of Code 2015 Day 6 (part 1).

It provides a list of instructions to switch on, off or toggle lights on a 1000 by 1000 grid.

The most efficient solutions I’ve seen so far across languages use some implementation of a set to keep track of a lights state and use the sets size at the end as the solution to the problem.

Part 2 is a little bit different and will require some kind of key-value store, as the instructions then will be re-interpreted to increase/decrease a lights brightness.

Currently I use a set of coords (X-Y), though I could “hash” them into X * 1000 + Y. I don’t think though that it will help me to solve the O(n) problem of ordsets.

I’ll take a look though at assoc and rbtree this evening.

Thank you!

Using an integer bitmap and the X1000Y encoding might b e worth trying :slight_smile:

Yeah, it might be worth a try.