Tutorial on accessing external databases

As a new user of Prolog, I would appreciate comments and corrections on the tutorial I’m writing for the DBTechNet.org group. It introduces SWI-Prolog accessing the databases in our free Debian 10 based virtual laboratory. A draft of the tutorial can be found at SWI-Prolog and Databases on Debian10 VM

When is the last date and time we can respond such that it could be of use as an edit for the tutorial?

EricGT, thank you for asking. The tutorial is under development without a deadline.

While I know the focus is on using SWI-Prolog with ODBC databases, are you aware of library(persistency): Provide persistent dynamic predicates?

If you are starting with a clean project and do not need to use an existing database, this might be a better option. Using it with Quick load files is a synergistic option. AFAIK all of the data has to be loaded into memory, but I have used these with gigabytes of data. (ref)

EDIT

So as not to add lots of independent single posts, I will just let them pile up here as edits. And to make them easier for others to ignore, I will make use of Hide Details.

SWI-Prolog is also provided as a Docker container (ref) but many prefer to either start with SWISH, or just install on their machine.

library(aggregate): Aggregation operators on backtrackable predicates

library(record): Access named fields in a term

Since I need to learn more about how to set up Discourse with Docker, and this has some connection points with what you are doing, I will be using your document and these in combination.

Eric, thank you for your reply. We are not working on any applications, but building our Debian-based virtual lab as an educational sandbox with free DBMS editions, programming languages and tools. From that perspective, SWI-Prolog is a hands-on learning tool for logic programming. It could act as a client system or component that also loads data from external DBMS servers, and might perhaps write some results for other applications via the databases.

From a technology point of view, I would like to understand the role of the SWI-Prolog database on disk:

· Is it a capacity extension of the in-memory database, a replacement, or a read-only backup?

· Is the content written only by the SWI-Prolog system, or can the content be loaded from external data sources, perhaps Big Data applications?

Is there any “easy-reading introduction for dummies” on the subject?

On Fri, 8 May 2020 at 15:36, Eric Taucher via SWI-Prolog (swiprolog@discoursemail.com) wrote:

Maybe someone else might come up with a good reference.

Here are my personal impressions: if you want to benefit from the full range of tools that SWI-Prolog provides in terms of modelling and querying data, you need to use the in-memory database.

If you are willing to give up some freedoms (in terms of how you model the data and how you query it), there are many options. Here is a list that is not exhaustive:

  • you can connect to a relational database
    • using ODBC, to any database that supports it
    • with prosqlite you can connect to an SQLite database
    • with CQL (also uses ODBC)
    • notably, using Datalog with DES. This also works as a SWI-Prolog library.
  • you can use a key-value database
    • BerkeleyDB using bdb
  • you can access large, static, read-only files using external tables

Hopefully someone extends this list.

If I understand correctly, library(persistency) does not solve your problem if you have large amounts of data that would have to transparently go from the memory to disk.

So actually, I need to highlight DES. For educational purposes, it is all you can wish for. For professional use, it is not ideal, but the fault does not lie with DES or with Datalog or with Prolog. The reason why I think that it isn’t truly useful is that relational databases differ from each other enough that you cannot easily make a one-size-fits-all layer on top of them.

The databases target different markets and have different extensions, very useful advanced features, different procedural language extensions, and so on. Maybe a version of DES that is aimed only at Postgres would solve some of the problems? I lack the depth to be able to tell.

Martti, thanks for asking.

I will take “SWI-Prolog database on disk” to mean library(persistency).

It is not a capacity extension of the in-memory database.
It is not a replacement for the in-memory database, because it is not a means of querying facts or running predicates stored on disk.
It is not specifically a read-only backup, but could be used as such.

If you have new data coming into a running instance of SWI-Prolog, e.g. reading a Twitter stream, and you want that data to be available after the instance is shut down, then library(persistency) persists the new data to a file. You can then shut down the SWI-Prolog instance knowing the data will be available later. When an instance of SWI-Prolog is started and needs the persisted data, library(persistency) loads it, and more data can then be added as needed.

In other words, in SWI-Prolog without added features, all of the data must be loaded into memory. Once the SWI-Prolog instance is halted, all of the data in memory is lost. If the data exists in a *.pl file and was loaded into the instance, it can be loaded again and again, but that does not allow new data to be added without editing the *.pl files. library(persistency) lets you save new data while an instance of SWI-Prolog is running and have it available again after halting and starting a new instance.
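A minimal sketch of that save-and-reload cycle, assuming a hypothetical tweet/2 predicate and a file name of your choosing. The helper names open_store/1 and save_tweet/2 are made up for this illustration; only persistent/1, db_attach/2 and the generated assert_tweet/2 come from library(persistency).

```prolog
:- use_module(library(persistency)).

% Declare a persistent predicate. tweet/2 and its argument types are
% hypothetical, chosen only for this sketch.
:- persistent
       tweet(id:integer, text:atom).

% Attach the persistent predicates of this module to a file.
% The file is created if it does not exist; facts already stored
% in it are loaded into the in-memory database.
open_store(File) :-
    db_attach(File, []).

% assert_tweet/2 is generated by the persistent/1 declaration.
% It adds the fact to memory and also records it in the attached file.
save_tweet(Id, Text) :-
    assert_tweet(Id, Text).
```

After calling `open_store('tweets.db')` and `save_tweet(1, hello)`, a new SWI-Prolog session that loads the same code and calls `open_store('tweets.db')` again should be able to query `tweet(Id, Text)` as an ordinary in-memory fact. Queries never go to the file, which is why this is neither a capacity extension nor a replacement for the in-memory database.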

I don’t exactly understand the question about loading content from external data sources, so I will pass on even attempting an answer.

As for an easy-reading introduction: no, and I wish there was one. I have written two detailed posts here about using library(persistency), and while the second is much better than the first, neither one is general enough or covers enough of library(persistency) to be of use as an introduction. There are enough examples in them to get you going, but the examples are very specific to each post. Also, the first one is very confusing in itself, so skip it if you can.

Trying to understand library(persistency)
Solving two consecutive dependent goals from command line

Yedalog (at Google) isn’t Prolog; it’s a variant on Datalog that allows programs to be agnostic to the form of the data (flat files, bigtable, SQL, etc) and form of access (database queries, mapreduce, etc) and scaling (workstations, multiprocessor, cloud, etc). There might be some useful lessons from this work.

(A Google search for “yedalog” will get you the primary articles and some talks)
(I don’t know the status of this work; at least 2 of the original team members have left Google over the years)

TerminusDB is another kinda way to look at doing DBs if the data model fits your needs.

Is TerminusDB an in-memory database? I wasn’t able to quickly find this info from the website.

Not that I am aware of, sorry. That does limit things greatly.

Dear Martti,

It might be worth having a quick look at some of the papers in:
http://stoics.org.uk/~nicos/pbs/

  1. Advances in big data bio analytics
  2. Accessing biological data as Prolog facts
  3. ProSQLite: Prolog file based databases via an SQLite interface

Regards,

Nicos Angelopoulos

http://stoics.org.uk/~nicos

Dear Nicos,

Thank you for the lead to an impressive collection of papers. One of the supported DBMSs in our lab is SQLite3, and I have earlier also looked at your paper on db_facts.pl.
Among the languages in our educational virtual lab we also have Common Lisp, but I have not found working open source solutions for ODBC connections. Do you know of any?

Br Martti

On Sat, 9 May 2020 at 23:04, Nicos Angelopoulos via SWI-Prolog (swiprolog@discoursemail.com) wrote:

Dear Martti,

Apologies for the late reply.

Unfortunately I am not familiar with any implementation of Lisp.
My foray into interface programming with proSQLite and Real was specifically so I can spend more time programming in Prolog…

Thanks,

Nicos

TerminusDB is an “in-memory database” in the sense that it needs to be able to fit all queryable objects in memory and does not perform paging. It is, however, persistent and assures ACID transactions to disk. You can safely store, shut down Prolog and reload your work without difficulty.

There is also terminus_store_prolog, a Prolog pack that works in SWIPL and gives more direct access to the underlying graph representation from Prolog. It is less convenient for RDF and reasoning, and it does not have the branch/merge/push/pull logic that we are building into the full TerminusDB.

All of these are GPL by the way and should be possible to include in Debian without problems!