Persistent predicates based on RocksDB

While library(persistency) allows for tracking and restoring clauses associated with a predicate, it doesn’t scale all that well. After some recent scalability discussions I thought to explore the option of storing clauses in RocksDB. The intermediate results are on GitHub. Below is a copy of the README. The current module provides this interface:

:- module(rocks_preds,
          [ rdb_open/2,                 % +Dir, -DB
            rdb_assertz/1,              % +Clause
            rdb_assertz/2,              % +Dir, +Clause
            rdb_clause/2,               % +Head, -Body
            rdb_clause/3,               % +Dir, +Head, -Body
            rdb_clause/4,               % +Dir, +Head, -Body, ?CRef
            rdb_nth_clause/3,           % +Head, ?Nth, ?Reference
            rdb_nth_clause/4,           % +Dir, +Head, ?Nth, ?Reference
            rdb_load_file/1,            % +File
            rdb_load_file/2,            % +Dir, +File
            rdb_current_predicate/1,    % ?PI
            rdb_current_predicate/2,    % +Dir, ?PI
            rdb_predicate_property/2,   % :Head, ?Property
            rdb_predicate_property/3,   % ?Dir, :Head, ?Property
            rdb_index/2,                % :PI, +Spec
            rdb_index/3,                % +Dir, :PI, +Spec
            rdb_destroy_index/2,        % :PI, +Spec
            rdb_destroy_index/3         % +Dir, :PI, +Spec
          ]).
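As an illustration (not from the README), a session could look like this; the directory name and the parent/2 facts are made up:

?- rdb_open('predicates.db', _DB),
   rdb_assertz('predicates.db', parent(tom, bob)),
   rdb_assertz('predicates.db', parent(bob, ann)).

?- rdb_clause('predicates.db', parent(X, bob), Body).
X = tom,
Body = true.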

A lot is missing, such as rdb_abolish, rdb_retract and rdb_retractall, combined argument indexes, updating the index on assert and, notably, actually running the predicates rather than only providing clause/3 access. Performance can probably be boosted a couple of times by moving the encoding and decoding of database keys and values from Prolog to C++.

I know persistent predicates have been part of several Prolog systems in the past. I’m wondering what the experiences are with these and what functionality is crucial?

Thanks --- Jan

Readme from the repo

Store predicates in RocksDB

This library builds on top of the rocksdb add-on. Right now it requires the Git version of SWI-Prolog and recompiling the rocksdb pack from source.

The idea of this library is to see whether we can build a persistent predicate store on top of RocksDB. The current implementation exists to try different database organizations and assess how realistic this is and what kind of performance and scalability are achievable.

Database organization

The database maps strings to Prolog terms. The key is a string because this allows for a RocksDB prefix seek (using rocks_enum_from/4) to enumerate objects. Keys:

  • <PI>\u0001<0-padded hex clause no> → (Head :- Body)
    Key for the Nth clause of PI.
  • meta\u0001<PI> → true
    Key we can enumerate to find the defined predicates
  • <PI>\u0002<Property> → Term
    Key for properties of PI
  • <PI>\u0003<Index> → Status
    Key for a clause index created on PI
  • <PI>\u0004<Index>\u0002<Hash> → list(ClauseRef)
    If arguments related to Index have term_hash/4 Hash, return list
    of candidate clauses.
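As an illustration, the key for the Nth clause could be composed along these lines; clause_key/3 is a hypothetical helper and the exact encoding used by the library may differ:

% Compose the key for the Nth clause of PI following the layout above.
% The 8-digit hex width is an assumption.
clause_key(PI, Nth, Key) :-
    format(atom(Hex), '~`0t~16r~8|', [Nth]),    % 0-padded hex clause number
    format(atom(Key), '~w\u0001~w', [PI, Hex]).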

First results:

Measured on an AMD 3950X based system with 64GB memory and an M.2 SSD drive.

Wordnet 3.0

  • 21 files, 21 predicates, 821,492 clauses, 34MB of source text
  • Load time: 11.7 sec.
  • RocksDB size: 99MB
  • rdb_index(hyp/2, 1) (89,089 clauses) → 1.35 sec.
  • random query time on hyp(+,-): 10usec

RDF (Geonames)

  • From a 1.7GB HDT file (+ 850MB index)
  • Load 123,020,820 triples in 2088 sec (CPU)
    • Load performance (clauses/sec) is constant.
  • RocksDB size: 5.2GB
  • Count triples: 165 sec (152 sec for HDT).
  • rdb_index(rdf/3,1) → 1383 sec.
  • random query time on rdf(+,-,-): 30usec (HDT: 7.8usec)

Future

For now, the access predicates are rdb_clause/2 and rdb_clause/3. This is fine for facts. We could execute the code by calling the body as below.

p(X) :-
    rdb_clause(p(X), Body),
    call(Body).

The disadvantage of this is that the cut is scoped to Body in that case. This can be fixed by interpreting the body or by having a call/1 variation that does not scope the cut.
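A minimal sketch of the interpreter option, using the classical exception-based treatment of cut; rdb_call/1 and interp/2 are hypothetical names and the interpreter is simplified (no \+/1, findall/3, etc.):

rdb_call(Head) :-
    Barrier = barrier(_),               % fresh term identifies this activation
    catch(( rdb_clause(Head, Body),
            interp(Body, Barrier)
          ),
          cut(B),
          (   B == Barrier
          ->  fail                      % backtracked over !: drop remaining clauses
          ;   throw(cut(B))             % belongs to an outer activation
          )).

interp(V, _)    :- var(V), !, instantiation_error(V).
interp(true, _) :- !.
interp(!, B)    :- !,
    (   true
    ;   throw(cut(B))                   % backtracking over ! cuts to the barrier
    ).
interp((G1,G2), B) :- !,
    interp(G1, B),
    interp(G2, B).
interp((C->T;E), B) :- !,
    (   call(C)                         % cut inside the condition stays local
    ->  interp(T, B)
    ;   interp(E, B)
    ).
interp((G1;G2), B) :- !,
    (   interp(G1, B)
    ;   interp(G2, B)
    ).
interp(G, _) :-
    call(G).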

Cached execution

  • If a predicate is small, simply extract it from the database and call
    it (see the sketch after this list).

  • If it is larger, create a predicate

    p(Index, Hash, Arg1, … ArgN) :-
        Body.
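For the small-predicate case, a minimal sketch; cache_and_call/1 and cached/1 are hypothetical, and a real implementation would also need to invalidate the cache on updates:

:- dynamic cached/1.

cache_and_call(Head) :-
    functor(Head, Name, Arity),
    (   cached(Name/Arity)
    ->  true
    ;   functor(Gen, Name, Arity),          % most general head
        forall(rdb_clause(Gen, Body),
               assertz((Gen :- Body))),     % copy the clauses into memory
        assertz(cached(Name/Arity))
    ),
    call(Head).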



Just to give some idea of what I am doing with library(persistency).

The use case is for parsing biological databases into facts. Once the fact file is created and the data is coherent, it is converted into a SWI-Prolog Quick Load File (QLF), at which point there is no need for a non-Prolog datastore. Since QLF loads large Prolog files fast (ref), there is no need for further optimizations for this use case.
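For reference, the basic QLF workflow is just (file name made up):

?- qcompile('facts.pl').    % writes facts.qlf next to facts.pl
?- [facts].                 % later loads use the faster .qlf when it is up to date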

As these databases are typically only updated, or need their data reloaded, on a monthly basis, transactional processing can be avoided.

Currently library(persistency) combined with QLF fits well with my use case.


Personal notes

As of 07/10/2022 the only way to effectively work with persistent predicates based on RocksDB is to use specific Git commits of the sources.

Commits · SWI-Prolog/swipl-devel · GitHub
Commits · JanWielemaker/rocksdb · GitHub
Commits · JanWielemaker/rocks-predicates · GitHub
Commits · facebook/rocksdb · GitHub


Current build specifics:

07/10/2022
OS: Ubuntu 22.04 LTS
SWI-Prolog: Install via PPA development version 8.5.14
RocksDB pack commit: e253458
RocksDB commit: a9565ccb2
rocks-predicates commit: 3071007

I would really, really like to try reasoning over SemMedDB with Prolog. There has been huge research success with this database in the past. I recently tried to put it into QLF files, but it’s too huge to fit in memory, unfortunately. The RocksDB solution would make it easy to achieve my objective!

Is the size of the database limited by the amount of RAM? Or is RAM used only as a cache?

Given modern hardware with a lot of RAM and pretty fast SSDs, using RAM as a cache gives pretty decent performance. RDF storage using HDTs simply uses mapped files. RocksDB uses classical read/write (at least, examining the process shows no maps and a lot of open files).

It appears that this can index only over a single argument. How hard would it be to index over two arguments (the way JITI does)? This is useful for storing “triples” or node/edge graphs.
From reading the code, I infer that rdb_clause_index and rdb_candidates pick an index only if the arg is nonvar, so I’m guessing that it would be fairly easy to add indexes over multiple arguments, and that the programmer should define the indexes in descending order of how much speedup they give (it appears that there’s a “choose the first one” strategy rather than a “choose the best one”).
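A combined-argument index could presumably hash the selected arguments along these lines; index_hash/3 and spec_args/3 are hypothetical, and the depth and range passed to term_hash/4 are arbitrary choices:

index_hash(Spec, Head, Hash) :-
    spec_args(Spec, Head, Args),
    term_hash(Args, 4, 2147483647, Hash).

spec_args(A+B, Head, Args) :- !,            % JITI-style spec such as 3+5
    spec_args(A, Head, Args0),
    arg(B, Head, Arg),
    append(Args0, [Arg], Args).
spec_args(N, Head, [Arg]) :-
    integer(N),
    arg(N, Head, Arg).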

Here are the indexes I have right now, according to jiti_list/0 (the nodes and edges are triples, but with compound keys comprising 5 items). The raw facts are 150MB (1 million rows), but I want to scale this by a factor of 100x or more. (My current code uses atoms, but I presume it would be better to use strings with rocksdb?)

Predicate     Indexed  Buckets  Speedup  Flags
color_line/6  3+5       65,536  43646.6
              3            256     65.2
node/7        1        262,144  63330.2
              4+7      262,144   4589.7
              7        131,072   1061.1
edge/11       7        131,072   9494.9
              1        131,072   9494.9
              6             16      5.4

Adding more complicated indexing is fairly easy. The ordering is indeed pick first. That is also used for JIT indexes, but there the ordering is done automatically so first implies best. A complete implementation requires assessing the quality of the index. The JIT indexing uses a combination of the average number of clauses per hash bucket and the standard deviation, preferring uniform distribution over non-uniform.
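As a sketch of such a quality measure (hypothetical code, not what SWI-Prolog actually computes): the mean number of clauses per bucket, penalised by the standard deviation of the bucket sizes, where lower is better:

index_quality(BucketSizes, Quality) :-
    length(BucketSizes, N),
    N > 0,
    sum_list(BucketSizes, Total),
    Mean is Total / N,
    foldl([S,A0,A]>>(A is A0 + (S - Mean)**2), BucketSizes, 0, SumSq),
    StDev is sqrt(SumSq / N),
    Quality is Mean + StDev.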


This is a bit off-topic since I have no first-hand experience with RocksDB. Looking at all the open source projects that use it, it appears to be a good choice of underlying library. My perspective is that of a Prolog hacker worried about how I’m going to access it.

I haven’t used the Redis library, but I know enough to understand it’s a key-value store aimed at RAM, rather than a persistent key-value store, which is RocksDB’s niche.

I foresee a lot of confusion over which is the correct tool among SWI-Prolog application developers like myself, and my concern is more about simplifying the Prolog interface and making it easy to switch as needs change and understanding of the problem evolves.

I’ve got a little experience with Erlang Term Storage (ETS), where tables can be one of four kinds: no duplicate keys allowed (sets), ordered sets (sets which can be iterated over in a predictable way), duplicate keys allowed provided they have different values (bags), and duplicate keys and values allowed (duplicate bags).

I find this need for four different types of table a drawback that Prologish languages have when used as databases. I don’t think the language itself needs to be altered, but conventions need to be established.

I tend to combine SWI-Prolog with PostgreSQL, using the predicate name as the table name and the first argument as the key, usually a unique integer automatically generated by PostgreSQL. That’s my house rule, and it would be interesting to hear some ideas on best practices and on how to create a general interface to key-value stores.
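As an illustration of that house rule, a table person(id, name) could be exposed as a predicate like this, using library(odbc); the connection alias and table are made up:

:- use_module(library(odbc)).

person(Id, Name) :-
    odbc_query(my_db,
               'SELECT id, name FROM person',
               row(Id, Name)).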


I doubt there is an easy answer. Key-value stores differ significantly in their capabilities. Redis, which you refer to, has a rich set of data types. It is great for dealing with cooperating (micro)services. RocksDB provides a really simple Blob → Blob mapping. Depending on the functionality and properties of a specific key-value store, one probably needs to make different decisions.

The setup I am experimenting with should eventually (if we decide to pursue it) provide (near) complete transparency, i.e., a predicate maintained in the database has nearly all the properties of a normal Prolog predicate. Eventually that may lead to something like

:- persistent p/1, q/2, ... .

Using a relational database is of course another option. You can opt for creating a predicate view on a normal relational table (I think that is what you do). The main price is probably that you cannot store arbitrary Prolog terms. Often that is fine. As a bonus, you may populate or use the same database from other languages.

In various similar exercises we see one decision that needs to be made each time: do we design the interface to (also) make the data accessible from other languages, or is this a pure Prolog tool where we want to be able to deal with dynamic Prolog data structures, sometimes including variables?


Seconding what Jan W. noted

One case where being able to store variables and then bind them later is useful is when converting data and committing it before the key is known. This is similar to red-black trees with variables.

Once one learns of this and how to effectively use it, one does not want to give it up. :slightly_smiling_face:

I don’t think that this would be possible with a persistent database. What does it mean to have a variable in a persistent database over multiple sessions? (I can see a possible use for variables in a single term that’s stored as a serialized value in the database, but I can’t see how you’d share variables across multiple values.)

I don’t know at the moment. However, the more I learn to embrace the use of variables in places where they would normally not be considered, the more possible ways of solving a problem start to appear.

The approach that Google takes (and I think that Facebook does something similar) is that the values are protobufs, which you can think of as similar to binary encoded “plain old data structures” (numbers, strings, enums, vectors, nested structures). Often, the keys are hashes, to ensure that the processing load is spread evenly over multiple machines. Bigtable is a layer on top of a key-value store like RocksDB that stores the meta-data, such as the protobuf type for a column. But there’s no requirement that the values be protobufs - it would be up to the table owner whether the values are Prolog terms or data that can be processed by non-Prolog systems. (And, of course, you could also store Prolog terms as strings, using term_string(Term, String, [ignore_ops(true), quoted(true), quote_non_ascii(true)]).)
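A round-trip through a string is easy to sketch (roundtrip/2 is hypothetical); note that it creates fresh variables, which ties into the discussion above about sharing variables across values:

roundtrip(Term0, Term) :-
    term_string(Term0, S,
                [ignore_ops(true), quoted(true), quote_non_ascii(true)]),
    term_string(Term, S).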

Dear @joeblog,

Mapping db tables to predicate facts, it is a really useful abstraction.

pack(db_facts) provides a basic common interface for mapping ODBC and (pro)sqlite tables to facts.

pack(bio_db) uses hot-swappable code to decide on the fly which back-end the user wants for specific predicates, choosing from: in-memory, Berkeley DB, prosqlite and RocksDB.

Regards,

Nicos Angelopoulos

db_facts: https://stoics.org.uk/~nicos/sware/db_facts/
bio_db: https://stoics.org.uk/~nicos/sware/bio_db/


In applications I also needed persistent data that can be updated fast and safely. In many cases, for example for contact-moment data or calendar data, you can append the data to a file without using an index.
When an index is needed, I append it to a separate index file, which stores the index together with the file position of the corresponding record in the record file. In the record file you can choose fixed-length records if you want to be able to update them fast. All of this is programmed in SWI-Prolog code; an important SWI-Prolog predicate is seek/4, which moves to a given position in the record file. When I found out that databases like MySQL or PostgreSQL can also crash, I chose to use this:
dbutil.pl (51.5 KB)
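The approach can be sketched as follows; these predicates are hypothetical (not the actual dbutil.pl code) and assume byte-accurate stream positions:

% Append Term to the record file and store Key with the record's
% file position in the index file.
add_record(RecordFile, IndexFile, Key, Term) :-
    setup_call_cleanup(
        open(RecordFile, append, Rec),
        (   seek(Rec, 0, current, Pos),     % start position of the new record
            format(Rec, '~q.~n', [Term])
        ),
        close(Rec)),
    setup_call_cleanup(
        open(IndexFile, append, Idx),
        format(Idx, '~q.~n', [index(Key, Pos)]),
        close(Idx)).

% Fetch the record stored at file position Pos.
record_at(RecordFile, Pos, Term) :-
    setup_call_cleanup(
        open(RecordFile, read, In),
        (   seek(In, Pos, bof, _),
            read(In, Term)
        ),
        close(In)).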


For indexed predicates, rdb_clause/3 does two accesses to RocksDB: the first uses the index to get a CRef, and then the CRef is used as a key to get the clause. Wouldn’t it be faster to simply store the clause with the index entry, so that there’s only one access? (This would use a bit more space, of course.) And, for facts, is there any reason to store the “:- true” part?

PS: Shouldn’t there be an rdb_close/1 predicate? (Maybe rdb_close/0 also?)


It might make sense to simply duplicate the clause. You can see I typically live in memory :slight_smile: Whether saving “:- true” is better or worse, one would need to check; the code is now a bit simpler. All in all, the Prolog overhead isn’t too much, and saving some cycles on the database end quickly outweighs a bit of extra work in Prolog. RocksDB is closed on exit AFAIK, but an explicit close can of course do no harm. Just play with it and send PRs :slight_smile: For anyone using it, it is probably not wise to store giant amounts of data yet (except for tests) because further changes in the representation are likely.


From reading the code, it appears that rocks_pred works equally well for atoms or strings, so it seems to be better to use strings as args in a predicate (unless the args are also used for non-rocks_pred predicates). That is, it’s better to store foo("bar") in the database than foo(bar).

(This avoids the overhead of creating unique atom-table entries when doing the equivalent of read_term/2 from the database.) Maybe this isn’t an important optimization, but it could be substantial if there are millions of facts (for example, I have some code that does exhaustive validation checks of the facts; the created atoms would be garbage collected, I think, but it would be more efficient to never create the atoms in the first place).

Also, it might be good to have versions of the “assert” and “clause” predicates that make no guarantees about ordering. This could allow for more interesting optimizations with multiple indexes and hashing. (In the database world, it’s considered bad style to rely on implicit ordering; in general, queries need an ORDER BY clause if they want the results in a particular order.)

Perhaps the persistent data could have some options information stored with it? It would probably also be a good idea to store the version of the rocks_pred module, so that we can at least detect incompatible changes.

From looking at the examples, it appears that db_facts associates predicate arguments with table columns, which means that the arguments are limited to SQL datatypes (strings, numbers). Is this a correct understanding?

OK. :wink:

It’ll take me a bit of time to make a switchable version of my code, so that I can compare the performance (memory, CPU) of keeping all my facts in memory with using rocksdb for the facts.