Making a large Prolog-based knowledge base for bioinformatics/epidemiology

@jan
Okay, so I have made some progress with this: I am running SWISH and Rserve in the Docker containers.

A couple of things that are not working:

  1. Loading large files in the Docker container from config-enabled.
    If I have a large .pl file that is approx 8GB in the config-enabled dir, then the SWISH container fails to load. I have also tried this with a .qlf file, which was actually larger than the .pl file, and it also did not load. I wondered, does it matter if the .qlf file is made outside of the Docker container? Loading the .pl file in a normal Prolog session takes over 30 mins.

  2. Error when downloading query results:
    I have gone into the Rserve container and installed some libraries. Then, in SWISH, I define:

    :- use_rendering(table).

    searchterm_datatable(Search, Table) :-
        Table <- {|r(Search)||{
            library(alspac)
            setDataDir("/data/")
            vars <- findVars(Search)
            results <- extractVars(vars)
            results[1:10, 1:10]
        }|}.

If I then query searchterm_datatable("frog", Table). I get a table of results. But if I click the download CSV button I get an error:

    ERROR: R: Error in parse(text = Rserve2.cmd) : <text>:1:18: unexpected input
    R: 1: { library(alspac)
    R:

Any ideas on how to resolve these?

Any clue why? Except for resource limits I see no reason. The default SWISH container does limit the memory size of the SWISH process inside of it. That could be the problem. I think you can change that with command line options.
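A quick way to check what limit a query actually runs under is to ask for the flag from a SWISH window (assuming the sandbox lets this flag through):

    ?- current_prolog_flag(stack_limit, Limit).

If the reported Limit is far below what the data needs, that would support the resource-limit theory.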

An externally produced .qlf is fine, provided it is the same SWI-Prolog version. With 8.1.x it doesn’t need to be the same OS/endian/wordsize, but it needs a compatible set of VM instructions. If not, it will fail to load.
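For example, on the machine where the data lives (a sketch; kb.pl stands in for your actual file):

    % must be the same SWI-Prolog version as inside the container;
    % compare with ?- current_prolog_flag(version, V).
    ?- qcompile('kb.pl').   % writes kb.qlf next to the source

Copy the resulting kb.qlf into the mounted data directory and load that instead of the .pl file.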

I’d not use a table rendering. Just produce a predicate that yields one row per answer and use table results. You can then download the results using the download as CSV button. If you use a notebook you can set the query to table the results and specify the number of initial rows to display, so you get the entire table, or as much as you want to show, directly.
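Something along these lines (a sketch: searchterm_row/2 is a made-up name, and whether the R data frame arrives as a list of rows or a list of columns depends on the translation, so you may need to transpose):

    searchterm_row(Search, Row) :-
        Rows <- {|r(Search)||{
            library(alspac)
            setDataDir("/data/")
            vars <- findVars(Search)
            extractVars(vars)
        }|},
        % assumption: one list element per row
        member(Row, Rows).

Each answer is then a single Row, which the CSV download serializes one answer per line.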

Thanks for the reply, Jan.

I think the R error happens when I have a multi-line R query.
For example:
https://swish.swi-prolog.org/p/R_error.pl
The first query errors when attempting to download the CSV, but the second does not.
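If the download path is collapsing the newlines in the quasiquotation (my guess, based on the parse error above), separating the R statements with semicolons might keep the code parseable on a single line. An untested sketch:

    searchterm_datatable(Search, Table) :-
        Table <- {|r(Search)||{
            library(alspac);
            setDataDir("/data/");
            vars <- findVars(Search);
            results <- extractVars(vars);
            results[1:10, 1:10]
        }|}.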

What is the option to increase the memory?
I start the Docker container with:

    docker run -it -v /Volumes/ALSPAC-DataBuddy/:/data/my_mount -d -p 3050:3050 --volumes-from rserve -v $(pwd):/data swipl/swish

I have tried:

    docker run -it -v /Volumes/ALSPAC-DataBuddy/:/data/my_mount -d -p 3050:3050 --volumes-from rserve -v $(pwd):/data swipl/swish --stack_limit=12g

but the container does not start.

Another idea I had was to go into the running SWISH Prolog session on the container and load files. Is this possible? I know I can start a new Prolog session in the container using something like docker exec -it my_container swipl, but this session is not running the SWISH instance. Is it possible to go into that one, so I can run consult etc. and have those files visible in SWISH in the browser?

[sorry, missed this]

You change the stack limits of SWISH Pengines using config-available/dim.pl. Install this config by copying it into config-enabled in the Docker data directory and edit it to suit your needs.
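As an illustration only (the real setting names are documented in the shipped config-available/dim.pl; the one below is a placeholder, and 12Gb is just the figure from the docker attempt above):

    % config-enabled/dim.pl -- copied from config-available/dim.pl and edited
    :- use_module(library(settings)).

    % placeholder setting name: check the shipped dim.pl for the real one
    :- set_setting_default(pengines:stack_limit, 12_000_000_000).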

Add a file to config-enabled containing:

    :- use_module(library(prolog_server)).
    :- initialization prolog_server(4040, []).

That creates a server that listens on port 4040 on the local host. I think you should then go into the container, install nc (netcat) and run nc localhost 4040 to get a Prolog prompt.
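Concretely, something like this (assuming the container is named swish and the image is Debian-based, hence apt-get):

    # on the host; "swish" is assumed to be the container name
    docker exec -it swish /bin/bash
    # inside the container
    apt-get update && apt-get install -y netcat
    nc localhost 4040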

But you can also add auth_http.pl from config-available and set up a password. Then you can work anonymously if you do not log in, or you can log in and have access without a sandbox, and thus you can load files and do all the usual stuff from within SWISH. Be aware of the security implications (for both alternatives): a non-sandboxed Prolog toplevel is as powerful as a shell, so make sure access isn't too easy and that the damage from possible access isn't too bad. The Docker container should be a good start for the second part.