Thinking about this a bit more …
The persistent predicates project currently allows only a single key but it shouldn’t be too difficult to modify it allow multiple keys (similar to just-in-time indexing of clauses).
And a similar approach should work for rbtrees. I’ll try implementing it - and after doing this, I’d do something similar for persistent predicates.
Let’s call the new data structure “mkrbtree” (for multi-key-rbtree"). With each mkrbtree, there would be a list of keys and an associated predicate that defines which key to use. For example, if we’re storing triples node(From,Edge,To)
, then we might define
data_index(node(From,_,_), Index, Key), ground(From) => Index = 1, Key=From.
data_index(node(_,Edge,To), Index, Key), ground(Edge-To) => Index = 2, Key=Edge-To.
data_index(node(_,_,To), Index, Key), ground(To) => Index = 3, Key=To.
data_index(node(_,Edge,To), Index, Key), ground(Edge) => Index = 2, Key=EdgeTo.
data_index(node(_,_,_), Index, _) => Index = 0.
which would define 3 indexes: on From
; on a combination of Edge
and To
; and on To
.
So, with this data:
[node(n1, road, n2),
node(n1, train, n3),
node(n2, road, n3),
node(n3, road, n4)]
there would be three keyed lists (stored as rbtrees) plus one unkeyed list:
1-[n1-[node(n1,road,n2), node(n1,train,n2)],
n2-[node(n2,road,n3)],
n3-[node(n3,road,n4)]]
2-[(road-n2)-[node(n1,road,n2)],
(train-n2)-[node(n1,train,n2)],
(road-n3)-[node(n2,road,n3),
(road-n4)-[node(n3,road,n4)]]
3-[n2-[node(n1,road,n2],
n3-[node(n2,road,n3),node(n2,train,n2)],
n4-[node(n3,road,n4)]]
0-[node(n1,road,n2), node(n1,train,n2), node(n2,road,n3), node(n3,road,n4)]
mkrb_lookup/3
would call data_index/2
to determine which index to use, then call rb_lookup/3
on the appropriate keyed list; if no key matches, then use member/3 on the unkeyed list. Siimlarly, mkrb_insert/4
would add to all the keyed lists (using rb_insert/4) plus to the unkeyed list.
This looks like a lot of data duplication, but the nodes would be shared amongst all the lists, so the only extra space would be for the indexes.
Anyway, this is a rough idea of how multiple indexes could be used with key-value data to do lookup as efficiently as clause lookup with indexing. Does this seem reasonable? (Some small details would change, of course; in particular, data_index/3
would be a bit different, to allow using both for lookup and insert, probably using once/1 instead of SSU.)