Module issue in pengines

I’m using pengines from Python, and getting syntax errors that suggest a malformed query, when the query is not malformed. It only happens when I call something that tries to use sCASP to prove a goal defined in the source text parameter. If I call the same predicate and have it use sCASP to prove a predicate defined in the modules that are loaded by the pengines server, the problem doesn’t happen. I suspect that it is a whitelisting issue, and to test that theory I would like to temporarily disable the sandbox restrictions and see if it goes away. The documentation says that it’s possible, but I can’t see where it indicates how to do it. How is that done?

Thanks in advance.

I doubt this is the problem. It looks more like a module issue. See the SWISH sources for adding sCASP to the system (in config-available). To disable the sandbox, see pengines.pl. There are two hooks: authentication_hook/3, which connects an HTTP request to a user, and not_sandboxed/2, which disables the sandbox for a particular application and user. You can make that as (un)safe as you like, ranging from demanding proper HTTPS authentication to simply claiming that any request comes from a particular user without any verification and allowing that user to do anything.

Thanks, I’ll try that, but honestly I feel like I don’t understand pengines well enough at this point to imagine how to use those predicates. The only thing I’m doing right now is starting the server, and calling that goal when I run SWIPL. Is there an example somewhere?

Also what do you mean by module problem? Do I need to call user-specified predicates in “source_txt” with a module name in the “ask” parameter, or something?

I have not been able to figure out how to disable the sandbox, but I suspect you were right that there is a module problem. After fighting with it for another couple of hours, here’s the problem I’m having in detail:

This is the code that is being loaded when swipl runs, with the goal of server(8080).

% scasp_server.pl
:- use_module(library(http/http_server)).
:- use_module(library(pengines)).
:- use_module(library(sandbox)).
:- use_module(library(scasp)).
:- use_module(aggregates).
:- use_module(dates).
:- use_module(events).
:- use_module(ldap).
:- use_module(passthrough).
:- use_module(library(scasp/output)).

blawxrun(Query, Tree, Model, Attributes) :-
    scasp(Query, [tree(Tree), model(Model), source(false)]),
    ovar_analyze_term(t(Query, Tree, Model), [name_constraints(true), name_prefix('Var_')]),
    (   term_attvars(t(Query, Tree, Model), [])
    ->  Attributes = []
    ;   copy_term(t(Query, Tree, Model), _, Attributes)
    ).

server(Port) :- 
    http_server([port(Port)]).

I then run swipl, run use_module(library(pengines)). and run the following three queries:

?- pengine_rpc('http://localhost:8080', winner(X,Y),[src_text("winner(test,jason).")]).
X = test,
Y = jason.

?- pengine_rpc('http://localhost:8080', blawxrun(true,A,B,C),[src_text("winner(test,jason).")]).
A = query-[],
B = C, C = [].

?- pengine_rpc('http://localhost:8080', blawxrun(winner(X,Y),A,B,C),[src_text("winner(test,jason).")]).
ERROR: scasp_predicate `'user:winner'/2' does not exist
ERROR: In:
ERROR:   [18] throw(error(existence_error(scasp_predicate,...),_121072))
ERROR:   [13] setup_call_catcher_cleanup(pengines:pengine_create(...),pengines:wait_event(...,...,...),_121106,pengines:pengine_destroy_and_wait(...,'fb2e42ea-55d2-4390-9b33-d32645b9b22d',_121146)) at /usr/lib/swipl/boot/init.pl:678
ERROR:   [11] toplevel_call(user:user: ...) at /usr/lib/swipl/boot/toplevel.pl:1318
ERROR: 
ERROR: Note: some frames are missing due to last-call optimization.
ERROR: Re-run your program in debug mode (:- debug.) to get more detail.

I have no idea why, if both blawxrun and winner are available in the default module, blawxrun can’t find winner. I wonder if this line from the error message gives a clue: ERROR: [11] toplevel_call(user:user: ...) at /usr/lib/swipl/boot/toplevel.pl:1318. I don’t think user:user is what we want, there, is it?

The mistake (I think) you are making is that Pengines do not inherit from user, but from the “Pengine application” module, pengine_sandbox (by default). So, to load sCASP you need to do

:- use_module(pengine_sandbox:library(scasp)).

All the other code you want to be available to Pengines must be loaded into the pengine_sandbox module. Typically you do that by implementing a module and loading it as above. You can also add stuff to this module directly using this syntax:

  pengine_sandbox:mypred(...) :- ... 

As for using the hooks, you use

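:- multifile pengines:not_sandboxed/2.
% The clause head must be module-qualified; an unqualified clause would
% define user:not_sandboxed/2, which the pengines library never calls.
pengines:not_sandboxed(_User, _Application).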

and something similar for the authentication hook. But again, s(CASP) does not need to lift the sandbox AFAIK, and for security reasons you had better not :slight_smile:

That’s extremely helpful, @jan, thank you. I’ll give it another shot later tonight, but that makes a lot of sense.

With your help I was able to confirm that it’s not a sandboxing issue, it’s a module problem.

But following the instructions you gave, I still have the same module issue.

I’m loading the following file:

% scasp_server.pl
:- use_module(library(http/http_server)).
:- use_module(library(pengines)).
:- use_module(library(sandbox)).
:- use_module(pengine_sandbox:blawx).

server(Port) :- 
    http_server([port(Port)]).

And blawx.pl is defined as follows:

:- module(blawx,[blawxrun/4]).

:- use_module(library(scasp)).
:- use_module(library(scasp/human)).
:- use_module(library(scasp/output)).
:- use_module(aggregates).
:- use_module(dates).
:- use_module(events).
:- use_module(ldap).
:- use_module(passthrough).

blawxrun(Query, Tree, Model, Attributes) :-
    scasp(Query, [tree(Tree), model(Model), source(false)]),
    ovar_analyze_term(t(Query, Tree, Model), [name_constraints(true), name_prefix('Var_')]),
    (   term_attvars(t(Query, Tree, Model), [])
    ->  Attributes = []
    ;   copy_term(t(Query, Tree, Model), _, Attributes)
    ).

Running the commands described above, I’m still getting the same error message, except that it is looking for winner/2 inside blawx instead of inside user.

If I eliminate everything in blawxrun/4 except for the scasp line, I’m still getting the same error message.

You’re moving in the right direction. However, your scasp program will be loaded into the pengine_sandbox module, while you try to run everything from blawx. So, you must at least load scasp in the pengine_sandbox module. Next, you can make the scasp call from blawx to the pengine_sandbox module or, probably simpler, load all code into the pengine_sandbox module.

Simplest way to get there is probably to remove the module header from blawx and use

:- ensure_loaded(pengine_sandbox:blawx).

If that doesn’t work, I’ll give it a try … Note that the scasp package also contains an HTTP server that works without Pengines. That is a way simpler way to use scasp over HTTP.

I’m sorry, I don’t know if you’re saying that I need to do :- ensure_loaded(pengine_sandbox:blawx). in the main scasp_server.pl file, and remove the module entry from blawx.pl, or if you are saying I need to do those two things AND add pengine_sandbox: before the scasp use_module command in blawx.pl? But I have tried both, and I have tried adding pengine_sandbox: in front of everything loaded inside blawx.pl. Nothing changes the result, it just changes the module in which it complains it cannot find winner/2. Now pengine_sandbox:winner/2 doesn’t exist.

I am very much in the dark, despite my best efforts. It doesn’t make any sense to me that calling a query inside the blawxrun predicate in the blawx module is “running it from blawx”, but that doing a use_module in the same file is somehow going to result in it being loaded into pengine_sandbox.

I’m happy to try something simpler if it’s actually simpler, but none of this makes any sense to me, and so if this is close to working, I’d rather not change bucking horses mid tsunami. If you could give this a quick try and let me know what you had to do to make the third query work, that would be GREATLY appreciated.

Can you make the whole thing (i.e., enough to reproduce) available, including something (e.g., a curl request) to test it? I think it is minor to fix it, but I don’t like reproducing the whole setup from this discussion.

Of course. I’ll get back to you.

With my thanks for your patience, here’s a little github repo with as clean an example as I can generate.

Seems all that is needed is

:- meta_predicate blawxrun(0, -, -, -).

above the definition of the predicate. After all, this is a meta predicate.

Thanks for the clear reproducible case. That makes helping people a lot easier :slight_smile:

Thank you. Could you give me some intuition as to why it is required, and what it is doing? In the hopes I might avoid or at least recognize similar problems in future?

Roughly, in good old plain Prolog, a term f(…) refers to the predicate (procedure) f/n. Given a module system, each module potentially defines its own f/n, denoted m:f/n. Now, if we pass winner(X,Y) to a predicate that must find a predicate from this term, such as sCASP, it needs to know into which module it must resolve winner/2. The meta_predicate/1 declaration tells the system which arguments must be annotated with the current module. These are the arguments annotated as 0…9, : or //. They all do the same: you pass winner(X,Y) and the called predicate is passed pengine_sandbox:winner(X,Y). The difference only affects the cross-referencing tools: 0 means it is a simple call. 1…9 means 1…9 arguments will be added before it is translated into a call. : means the target wants to know the module, but it will not call the result. Finally, // means the argument is a DCG goal (this is a SWI-Prolog extension).
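For readers who want to see the effect in isolation, here is a minimal sketch that needs neither Pengines nor sCASP. The module name client and the predicates run_plain/1, run_meta/1 and qualified/2 are hypothetical names made up for this illustration; client merely stands in for the module that holds the user-supplied program.

```prolog
% A fact stored in the (hypothetical) module client. The qualified head
% puts the clause in client rather than in the module of this file.
client:winner(test, jason).

% No declaration: Goal arrives unqualified, so call/1 resolves it in the
% module where run_plain/1 itself is defined.
run_plain(Goal) :-
    call(Goal).

% Argument 1 is declared as a goal (0), so the caller's context module is
% attached to it before run_meta/1 is entered.
:- meta_predicate run_meta(0).
run_meta(Goal) :-
    call(Goal).

% Same declaration, but the goal is returned instead of called, which
% makes the attached module visible.
:- meta_predicate qualified(0, -).
qualified(Goal, Goal).
```

After loading this file, the query client:run_meta(winner(X, Y)) should find the fact, while client:run_plain(winner(X, Y)) should raise an existence error because it looks for winner/2 in the module that defines run_plain/1. That is the same symptom as the blawx:winner/2 error earlier in the thread. The query client:qualified(winner(X, Y), G) binds G to client:winner(X, Y), and strip_module/3 can be used to split such a term into its module and plain goal.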

Gotcha, thanks!