Thanks.
I have been eyeballing that feature since I first saw it, but it requires that the data be presented in key-sorted order (ref). It would also be great for the initial load of the data, but regular updates would then be back to business as usual.
Another option under consideration, though not on my priority list, is bulk loading by ingesting external SST files, but that obviously requires the SST files to exist first, so it is a chicken-and-egg problem. If many users load these databases (biological in my case) into RocksDB and don't mind a delay of a few days while the SST files are rebuilt and compacted for each monthly release, then the datasets could be distributed as SST files. There is also no requirement that there be a single RocksDB; one could put each dataset into a separate RocksDB and, instead of reloading the data, simply replace the RocksDB with the fresh SST files as needed. Pointer and handle manipulation is becoming one of my favorite swords of choice.
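In case it helps to see what that route involves, here is a minimal C++ sketch (paths and keys are made up) of building an SST file with RocksDB's SstFileWriter; note that Put must be called in ascending key order according to the comparator the target database uses, which is exactly the key-sorted-order requirement mentioned above.

#include <rocksdb/options.h>
#include <rocksdb/sst_file_writer.h>
#include <iostream>

int main() {
  // Must use the same comparator (and other relevant options) as the target DB.
  rocksdb::Options options;
  rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options);

  // Hypothetical output path for one monthly release of a dataset.
  rocksdb::Status s = writer.Open("/tmp/dataset_2024_01.sst");
  if (!s.ok()) { std::cerr << s.ToString() << '\n'; return 1; }

  // Keys must be added in ascending comparator order; made-up biological keys.
  writer.Put("gene:BRCA1", "...");
  writer.Put("gene:BRCA2", "...");
  writer.Put("gene:TP53", "...");

  // Finish() seals the file so it can be ingested later.
  s = writer.Finish();
  if (!s.ok()) { std::cerr << s.ToString() << '\n'; return 1; }
  return 0;
}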
Also, I have not checked with Windows Process Monitor to see whether a virus checker or some other process is hooking file updates and needs to be configured correctly for this (ref).
Does SWI-Prolog really need it?
Since RocksDB is meant to be an embedded DB, the files should be local, or at least accessible, and if one can access the files then one could just write some simple C++ and do it that way. Also, since the code should be pretty much boilerplate plus a few options, I would not be surprised if such code is easily found in a Git repository somewhere.
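As a rough illustration of how boilerplate the ingestion side is, something along these lines (paths are placeholders) opens a RocksDB and pulls in a pre-built SST file with DB::IngestExternalFile:

#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <iostream>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/example_rocksdb", &db);
  if (!s.ok()) { std::cerr << s.ToString() << '\n'; return 1; }

  // Ingest one or more externally created SST files in a single call.
  rocksdb::IngestExternalFileOptions ifo;
  s = db->IngestExternalFile({"/tmp/dataset_2024_01.sst"}, ifo);
  if (!s.ok()) { std::cerr << s.ToString() << '\n'; }

  delete db;
  return 0;
}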
What do you get from the following query after you load your data and do your queries?
?- forall(predicate_property(rocks_preds:rdb_clause_index(_,_,_), P),
          writeln(P)).
EDIT 1 of just this answer
The second attempt almost finished cleanly. It seems an RRF file was missing, so the code threw an exception and entered trace mode, at which point pressing the space bar just completed the goals on the stack, many of them failing, including rdb_close. I plan to use ldb and other RocksDB tools to inspect the files.
While your desired query did work this time, the result is not what I think you seek.
?- forall(predicate_property(rocks_preds:rdb_clause_index(_,_,_),P),writeln(P)).
interpreted
visible
static
file(/home/eric/.local/share/swi-prolog/pack/rocks-predicates/rocks_preds.pl)
line_count(348)
number_of_clauses(1)
number_of_rules(1)
last_modified_generation(7084)
defined
tabled
tabled(variant)
size(488)
true.
Obviously something is amiss.
Second attempt clearly loaded more data.
Cumulative writes: 671M writes, 671M keys, 671M commit groups, 1.0 writes per commit group, ingest: 60.01 GB, 0.50 MB/s
That part of the code is still a mystery to me.
I do know that rocks_preds actually creates two RocksDBs, one for the data and then one in a folder called predicates; I have yet to work out the details.
I did find (ref) for inspecting the RocksDB files, but the precompiled version is not free, and building it for Windows resulted in an error. I might try a Linux build if I am hard pressed for a way to understand the RocksDB files.
See: RocksDB Tools.
Read the issues:
8081 was surprising.
You really are busy. You asked me to open that one.