Memory use for database

The overall idea I once discussed is to take the ideas behind database column stores, as implemented by MonetDB, as a starting point. A column is basically a simple native array of objects of the same type (with some more complexity for string columns). That is also what e.g. R uses for representing vectors. The Apache Arrow project is also related.
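To make the column idea concrete, here is a minimal sketch in Python (chosen only for brevity; a real implementation would be C/C++). The `StringColumn` class and its offsets-plus-bytes layout are my own illustrative assumptions, similar in spirit to Arrow's variable-length layout, and not how MonetDB or R necessarily store strings.

```python
from array import array

# Fixed-width column: one contiguous native array of 64-bit floats.
prices = array("d", [9.5, 12.0, 7.25])


class StringColumn:
    """Hypothetical variable-width column.

    All strings are stored back to back as UTF-8 bytes; a native array
    of offsets marks where each value starts and ends.
    """

    def __init__(self, values):
        self.offsets = array("q", [0])   # offsets[i] .. offsets[i+1] bound value i
        self.data = bytearray()
        for v in values:
            self.data += v.encode("utf-8")
            self.offsets.append(len(self.data))

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, end = self.offsets[i], self.offsets[i + 1]
        return self.data[start:end].decode("utf-8")


names = StringColumn(["apple", "kiwi", "fig"])
```

The point is that both kinds of column reduce to flat buffers, which is what makes them easy to keep in memory or map from a file.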

The nice thing is that you can keep such a column in memory, or store it in a file and access it through memory mapping. The next step is to make this available as a predicate, combining one or more such columns (of equal length) and adding indexes to them.
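As a rough sketch of that step, again in Python and under my own assumptions: the `Table` class, its `add_index` and `lookup` methods, and the file name `id.col` are all hypothetical. One column is memory-mapped from a file, a second lives in memory, and a simple hash index maps key values to row numbers so that a lookup enumerates matching rows much like solutions of a predicate.

```python
import mmap
import os
import tempfile
from array import array
from collections import defaultdict


class Table:
    """Hypothetical relation built from equal-length columns."""

    def __init__(self, **columns):
        if len({len(c) for c in columns.values()}) != 1:
            raise ValueError("all columns must have the same length")
        self.columns = columns
        self.indexes = {}

    def add_index(self, name):
        # Hash index: column value -> list of row numbers.
        index = defaultdict(list)
        for row, key in enumerate(self.columns[name]):
            index[key].append(row)
        self.indexes[name] = index

    def lookup(self, name, key):
        # Use the index if there is one, otherwise scan the column.
        index = self.indexes.get(name)
        if index is not None:
            rows = index.get(key, [])
        else:
            rows = [i for i, v in enumerate(self.columns[name]) if v == key]
        for row in rows:
            yield tuple(col[row] for col in self.columns.values())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "id.col")
        with open(path, "wb") as f:
            array("q", [10, 20, 10]).tofile(f)      # raw native 64-bit ints
        with open(path, "rb") as f:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            base = memoryview(mm)
            ids = base.cast("q")                     # view the bytes as int64
            try:
                table = Table(id=ids, score=array("d", [1.5, 2.5, 3.5]))
                table.add_index("id")
                return list(table.lookup("id", 10))
            finally:
                ids.release()
                base.release()
                mm.close()


results = demo()
```

Here `results` holds the two rows whose `id` is 10. The file is written in native byte order, so this sketch is not portable between machines with different endianness; a real format would need a header for that.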

There are lots of decisions to take. Which of the above projects should we reuse? Can we reuse their indexing techniques? How about modifying the table? If we allow that, can we keep Prolog’s logical update view? How can we use this to efficiently share data with R, MonetDB, etc.?

This is a major project that is waiting for a good use case and involvement to get it realized (either as paid development or contributed code). A first version can simply be implemented as a normal C/C++ foreign plugin. Later versions could move part to the VM to avoid the foreign language call overhead.
