Sorry for the long post, but it seems like getting the background down here would be helpful:
I’ve got a database of about 3000 triples that represent what I think is a relatively classic “Knowledge Representation and Reasoning” database that describes a world and allows the user to walk around and examine it. It’s an interactive fiction type game. The data is stored in triples like rel(player, isa, person), à la RDF.
The current performance is not great and I also need the knowledge base to be able to be much bigger. So I’m looking for ways to limit the raw data the queries need to chug through since that is one of the limiting factors.
One thing I’ve noticed is that, analogously to rendering graphics in a 3D environment, a lot of the knowledge is invisible depending on the user’s location. By this I mean that if I were to completely remove those “invisible” triples from the database, the queries would all return the same results and the experience would be identical. However, right now these triples are just sitting there being retrieved, running through queries that fail, and being subsequently ignored…and burning performance.
For example:
- The player is in the Conservatory and asks any question about the kitchen. Since that is another room the player can’t see, every such query will fail.
- There is a box of 50 different toys in the bedroom. When the player is in the living room, we shouldn’t have to cycle through all of these toys when they ask “what can I see?”, since they are “out of scope” and nothing you can say or ask about them will succeed.
etc.
To reduce the amount of irrelevant information SWI-Prolog has to fish through, I’m experimenting with changing the lowest-level data interface predicate I have, rel(Subject, Relationship, Object), to remove some of the obviously useless information based on the player’s location. I do this by:
- Adding an extra Shard argument to the raw data predicate I’m using: store(X, Rel, Y, Shard). It acts like “sharding” in other data stores. If a piece of information should be visible everywhere, Shard is _. Otherwise, it is an atom that only matches when the user is in that Shard.
- Before running a query, the caller calls setShard(Name), where Name is an atom representing the location of the user. (I realize I could be passing an argument through, but that involves a ton of code changes at this point and I’m trying to keep the experiment simple…)
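% Remember the current shard in a backtrackable global variable.
setShard(Name) :-
    b_setval(shard, Name).

% Fetch the current shard; Name stays unbound if none has been set.
getShard(Name) :-
    (   nb_current(shard, Name)
    ->  true
    ;   true
    ).

% Public interface: only consults facts that match the current shard.
rel(X, Rel, Y) :-
    getShard(Shard),
    store(X, Rel, Y, Shard).

% Data to be "sharded": store(X, Rel, Y, Shard)
store(toy1, locatedIn, bedroom, bedroom).
store(toy2, locatedIn, bedroom, bedroom).
store(table, locatedIn, livingRoom, livingRoom).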
Then you could use it like:
?- setShard(livingRoom), rel(X, locatedIn, livingRoom).
X = table.
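The wildcard case mentioned in the list above can be sketched the same way. This is only an illustration: the store(player, isa, person, _) fact is an assumed example of information that should be visible from every room, and the shard is set in the same conjunction as each query because b_setval/2 is undone on backtracking.

```prolog
% Sketch: a wildcard shard (_) next to room-specific shards.

setShard(Name) :-
    b_setval(shard, Name).

% Leaves Name unbound when no shard has been set.
getShard(Name) :-
    (   nb_current(shard, Name)
    ->  true
    ;   true
    ).

rel(X, Rel, Y) :-
    getShard(Shard),
    store(X, Rel, Y, Shard).

% store(X, Rel, Y, Shard)
store(player, isa, person, _).                    % assumed "global" fact
store(toy1, locatedIn, bedroom, bedroom).
store(toy2, locatedIn, bedroom, bedroom).
store(table, locatedIn, livingRoom, livingRoom).

% ?- setShard(livingRoom), findall(X, rel(X, locatedIn, _), Xs).
% Xs = [table].
%
% ?- setShard(livingRoom), rel(player, isa, What).
% What = person.
%
% ?- setShard(livingRoom), rel(toy1, locatedIn, Where).
% false.
```

The wildcard fact unifies with whatever shard is current, while the bedroom toys are skipped when the shard is livingRoom.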
My problem is this: some pieces of information need to show up in a couple of locations, but I’d like to keep the benefits of indexing on store/4. Setting Shard to _ for that data is unnecessarily broad, but setting it to a single atom is incorrect (too narrow). I’d like to logically set it to something like this pseudocode:
store(visibleInBoth, locatedIn, once((Shard == livingRoom; Shard == bedroom)) ).
… but I can’t figure out a way to do that using just unification so I can keep the JIT indexing of store/4.
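One fallback that stays within plain unification is to repeat the fact once per shard it belongs to. The following is only a sketch under assumptions, not a recommended design: the object value hallway is made up for illustration (the pseudocode above does not name one), and kitchen is just an example of a shard with no data.

```prolog
% Sketch: one store/4 clause per shard for data visible in several rooms.

setShard(Name) :-
    b_setval(shard, Name).

getShard(Name) :-
    (   nb_current(shard, Name)
    ->  true
    ;   true
    ).

rel(X, Rel, Y) :-
    getShard(Shard),
    store(X, Rel, Y, Shard).

% store(X, Rel, Y, Shard)
store(toy1, locatedIn, bedroom, bedroom).
store(table, locatedIn, livingRoom, livingRoom).
% Same triple stored twice, once per shard that should see it.
% "hallway" is a hypothetical object value.
store(visibleInBoth, locatedIn, hallway, livingRoom).
store(visibleInBoth, locatedIn, hallway, bedroom).

% ?- setShard(livingRoom), findall(X, rel(X, locatedIn, _), Xs).
% Xs = [table, visibleInBoth].
%
% ?- setShard(bedroom), findall(X, rel(X, locatedIn, _), Xs).
% Xs = [toy1, visibleInBoth].
%
% ?- setShard(kitchen), findall(X, rel(X, locatedIn, _), Xs).
% Xs = [].
```

Every clause is still a plain fact with an atomic fourth argument, so it should remain a candidate for JIT indexing on Shard. The trade-off is duplication: each multi-room fact is stored once per room, and a query run with the shard left unbound returns such a fact once per copy.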
Any ideas?