[Feasibility] Reading a custom record-based file format and querying it with Prolog

I have some C++ code which reads and writes a custom file format.
The format consists of a list of heterogeneous records.

Think of something like:

Player(player_id=10, player_name=LeBron James)
Team(team_id=1, name=Lakers)
PlaysFor(player_id=10, team_id=1)

We have a lot of infrastructure around this file format, so I was wondering whether the following is feasible.
I would like to write a Prolog predicate that reads this file format and generates facts, so that I can leverage SWI-Prolog's expressiveness. Ideally I would do this by writing a C++ extension. This seems possible, but I cannot find any example.

Can you point me to an example or some relevant documentation for this purpose?

If you already have C++ code reading this, adding a C++ predicate is probably the way to go.
See http://www.swi-prolog.org/pldoc/package/pl2cpp.html

I had already taken a quick look at that documentation.

What I do not understand is how I can add a new “Fact”.

Let’s say that I create a predicate, read_from_my_fileformat, with arity 1, taking the path to the file as input. How can I add a “new fact” as I read the file?

Can you point me to the part of that documentation which you think is very relevant for my use case?

What I do not understand is how to yield a new fact from within the PREDICATE macro, basically.

I apologise if my wording is off; I am probably not using the standard Prolog terminology!

You have roughly two options. One is to define a predicate in C++ that reads the next fact from the file. Then you create a loop in Prolog that calls this predicate to get a new fact and then calls assertz/1 to add the fact to the Prolog database.
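A minimal sketch of the Prolog side of this first option. The name my_format_record/2 is hypothetical: it stands in for the C++ foreign predicate, assumed here to yield one record per solution on backtracking. It is stubbed in plain Prolog below only so that the loop can be tried without the extension.

```prolog
:- dynamic player/2, team/2, plays_for/2.

% Hypothetical stand-in for the C++ foreign predicate.
% Assumption: the real one would read Path and return one
% record per solution; this stub ignores Path entirely.
my_format_record(_Path, Fact) :-
    member(Fact, [ player(10, "LeBron James"),
                   team(1, "Lakers"),
                   plays_for(10, 1)
                 ]).

% load_records(+Path): assert every record as a fact.
load_records(Path) :-
    forall(my_format_record(Path, Fact),
           assertz(Fact)).
```

After calling load_records/1 the records can be queried like any other facts, for example with `?- plays_for(P, T), player(P, Name), team(T, Team).`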

Another option is to do more or less the same from C++: you loop through the file and, for every fact you find, you call Prolog's assertz/1 through the C++ interface. It all depends a bit on your expertise, what you already have in C++ and what you want in Prolog.

I think that assertz/1 was the bit I was looking for!

I will now take my time to go through the documentation. I think your answers gave me all the material I need to at least come up with a prototype.

Thanks

Is there anything else I need to consider if I go ahead with the assertz/1 call from C++ land?

Also, I am keeping an open mind about my use case: does the above make sense from a Prolog perspective?

My idea is to leverage Prolog to add querying capability to some of my external data.

Here is the elevator pitch.

I have done something similar, but all on the Prolog side.

Reading a file is done with predicates like read_stream_to_codes/2.

The parsing would be done with a DCG.

The creation of facts can be done with library(persistency)

The reading of large fact files quickly can be done with Quick load files.
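To give an idea of the DCG step, here is a rough sketch for the record syntax shown in the question. It uses library(dcg/basics); the predicate names, and the choice to drop the keys and lowercase the record name, are my own assumptions rather than anything required.

```prolog
:- use_module(library(dcg/basics)).

% record(-Fact)//: parse one record such as
% "Team(team_id=1, name=Lakers)" into team(1, "Lakers").
record(Fact) -->
    string_without(`(`, NameCodes), "(",
    fields(Values), ")",
    { atom_codes(Name0, NameCodes),
      downcase_atom(Name0, Name),
      Fact =.. [Name|Values] }.

% One or more key=value fields separated by commas.
fields([V|Vs]) -->
    field(V),
    (   "," -> blanks, fields(Vs)
    ;   { Vs = [] }
    ).

% The key is parsed but discarded; only the value is kept.
field(Value) -->
    string_without(`=`, _Key), "=",
    string_without(`,)`, Codes),
    { value(Codes, Value) }.

% Values that look like numbers become numbers, the rest strings.
value(Codes, Value) :-
    string_codes(String, Codes),
    (   number_string(Number, String)
    ->  Value = Number
    ;   Value = String
    ).

% parse_record(+Line, -Fact): Line is a string holding one record.
parse_record(Line, Fact) :-
    string_codes(Line, Codes),
    phrase(record(Fact), Codes).
```

For example, `?- parse_record("PlaysFor(player_id=10, team_id=1)", F).` gives `F = playsfor(10, 1)`, which can then be passed to assertz/1.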

Appreciate that, thanks to the both of you for your help in getting me started.

Just a quick note, @AntonioL, about your source format:

Player(player_id=10, player_name=LeBron James)
Team(team_id=1, name=Lakers)
PlaysFor(player_id=10, team_id=1)

If you replaced “(”, “=” and “,” with SPACE, then you could probably manage without a DCG as:

Player player_id 10 player_name LeBron James
Team team_id 1  name Lakers
PlaysFor player_id 10  team_id 1

Then the line format is:
token 1: the “topic/category/record-type”, whatever you want to call it
tokens 2 and 3 (and every following pair n, n+1) are just key and value

Don’t know if that helps much!

?- split_string("Player(player_id=10, player_name=LeBron James)", "(=,", ")", X), [Class|KV]=X.
X = ["Player", "player_id", "10", " player_name", "LeBron James"],
Class = "Player",
KV = ["player_id", "10", " player_name", "LeBron James"].
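Building on that query, a small sketch of how the split result could be turned into a term. Adding a space to the pad argument also strips the leading blank from " player_name". The helper names line_fact/2 and values/2 are made up for illustration, and dropping the keys is just one possible choice.

```prolog
% line_fact(+Line, -Fact): split one record and build a term from it.
line_fact(Line, Fact) :-
    split_string(Line, "(=,", " )", [Class|KV]),
    values(KV, Values),
    string_lower(Class, Lower),
    atom_string(Name, Lower),
    Fact =.. [Name|Values].

% values(+KeyValueList, -Values): keep every second element,
% converting it to a number when it reads as one.
values([], []).
values([_Key, V0|Rest], [V|Vs]) :-
    (   number_string(N, V0)
    ->  V = N
    ;   V = V0
    ),
    values(Rest, Vs).
```

With this, `?- line_fact("Player(player_id=10, player_name=LeBron James)", F).` gives `F = player(10, "LeBron James")`.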

Gotta love SWI Prolog!

I have a similar need, except that the input is in JSON. I output the facts using write_canonical and then consult them in a separate program (a detail: I use load_files/2).

One little trick is to create both a .pl and an empty .qlf file; the first time the file is consulted, the .qlf file is flagged as invalid and updated, so the next time you consult the file, it’s much faster (and if you update the .pl file, the .qlf file will also be automatically updated the next time the file is consulted).
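Roughly, the write-then-load workflow looks like the sketch below. The names save_facts/2 and load_facts/1 are made up for this example, and the fact list would come from whatever converts your input.

```prolog
% save_facts(+File, +Facts): write each fact as a clause that
% can be read back by the Prolog loader.
save_facts(File, Facts) :-
    setup_call_cleanup(
        open(File, write, Out),
        forall(member(Fact, Facts),
               ( write_canonical(Out, Fact),
                 write(Out, '.\n')
               )),
        close(Out)).

% load_facts(+File): load the generated file in the querying program.
load_facts(File) :-
    load_files(File, []).
```

For example, `?- save_facts('facts.pl', [team(1, 'Lakers'), plays_for(10, 1)]).` followed by `?- load_facts('facts.pl').` in the other program.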
