Web Prolog ISOBASE Test Cluster - Hack Away, Find Security Bugs

Thanks for trying it out! No, what we refer to as the ISOBASE profile doesn’t support code injection. Here’s the diagram showing how Web Prolog, very tentatively, has been split up into profiles:

As you can see, unless we are querying an ISOTOPE or ACTOR node, source code injection isn’t supported.

The ISOBASE profile provides two things: a stateless HTTP API (or “RESTful” if you want) which can be used from e.g. JavaScript, and the rpc/2-3 predicate which can be used from Prolog.

The shell is written in JavaScript and talks to the node over the stateless HTTP API. The interaction is of course restricted (no I/O, etc.), but that’s what you should expect from an ISOBASE node.

Suppose you make the following query:

?- movie(A,B).
A = american_beauty, 
B = 1999 <blinking cursor>

Now, if Damon decides to restart his nodes tonight, you would still be able to continue exploring the answers to your query tomorrow, and get the correct second solution, just as if the node had not been restarted:

?- movie(A,B).
A = american_beauty, 
B = 1999 ;
A = anna,
B = 1987 <blinking cursor>

That’s a true sign of a stateless API. Only the client keeps track of the state of the interaction, and it represents that state as a pair consisting of the current query and an integer (the offset of the next solution to fetch). That’s all. The node doesn’t need to remember anything from the previous interaction.
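
To make the idea of client-side state concrete, here is a minimal sketch of how a client could turn such a pair into the next request. The term state(Query, Offset) and the predicates request_url/3 and advance/2 are hypothetical names chosen for illustration; the actual shell is written in JavaScript and may do this differently.

```prolog
:- use_module(library(uri)).

% request_url(+BaseURI, +State, -URL)
% State is the hypothetical term state(Query, Offset): everything the
% client has to remember between two requests.
request_url(BaseURI, state(Query, Offset), URL) :-
    format(atom(QueryAtom), '~q', [Query]),
    uri_encoded(query_value, QueryAtom, Encoded),
    format(atom(URL), '~w/ask?query=~w&offset=~d&limit=1',
           [BaseURI, Encoded, Offset]).

% advance(+State0, -State)
% After an answer has been shown, only the offset changes.
advance(state(Query, Offset0), state(Query, Offset)) :-
    Offset is Offset0 + 1.
```

Since the node is never asked to remember anything, a restart between two calls to request_url/3 makes no difference to the client.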

The statelessness of the API makes it necessary to handle slow or non-terminating queries in a special way, as we cannot abort a query once it has been submitted. Here’s an example:

?- repeat, fail.
Error: Time limit (1s) exceeded.
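
How the node enforces its limit is not shown here. As a rough sketch of one way to get this behaviour in SWI-Prolog, assuming call_with_time_limit/2 from library(time) is acceptable, a wrapper could look like this. The predicate run_with_limit/3 and its result terms are hypothetical names.

```prolog
:- use_module(library(time)).

% run_with_limit(+Goal, +Seconds, -Result)
% Result is true, false, or error(Message). Goal is run as once/1,
% which is what call_with_time_limit/2 does.
run_with_limit(Goal, Seconds, Result) :-
    catch(
        (   call_with_time_limit(Seconds, Goal)
        ->  Result = true
        ;   Result = false
        ),
        time_limit_exceeded,
        (   format(atom(Message), 'Time limit (~ws) exceeded.', [Seconds]),
            Result = error(Message)
        )).
```

With this sketch, run_with_limit((repeat, fail), 1, R) gives up after about a second and binds R to an error term instead of looping forever.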

To get an idea of how the stateless HTTP API works, you may want to follow the links below. (How the responses are shown depends on your browser.) First, let’s ask for the first five solutions:

http://one.prolog.computer:3010/ask?query=movie(A,B)&limit=5

Then let’s ask for the next five solutions:

http://one.prolog.computer:3010/ask?query=movie(A,B)&offset=5&limit=5

Adding a request parameter format=prolog gives us Prolog text rather than JSON:

http://one.prolog.computer:3010/ask?query=movie(A,B)&limit=5&format=prolog

The rpc/2-3 predicate is a meta-predicate for making non-deterministic remote procedure calls. It allows a process running in a node A to call and try to solve a query in the Prolog context of another node B, taking advantage of the data and programs being offered by B, just as if they were local to A.

That is why, as Damon showed, the following works from the shell at http://one.prolog.computer:3010:

?- actress(M, scarlett_johansson, _), 
   rpc('http://two.prolog.computer:3010',director(M, D)).
D = peter_webber, 
M = girl_with_a_pearl_earring ;
D = sofia_coppola,  
M = lost_in_translation ;
... 

(It’s a silly example, of course, since director/2 is available at http://one.prolog.computer:3010 as well. We need to come up with better examples here.)

Here’s how rpc/2-3 is implemented on top of the stateless HTTP API:

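% Libraries this code relies on (SWI-Prolog).
:- use_module(library(url)).             % parse_url/2
:- use_module(library(http/http_open)).  % http_open/3
:- use_module(library(option)).          % option/3
:- use_module(library(lists)).           % member/2

% rpc(+URI, +Query): same as rpc/3 with default options.
rpc(URI, Query) :-
    rpc(URI, Query, []).

% rpc(+URI, +Query, +Options): the option limit(N) sets how many
% solutions are fetched per HTTP request (default 1).
rpc(URI, Query, Options) :-
    parse_url(URI, Parts),
    option(limit(Limit), Options, 1),
    format(atom(QueryAtom), '(~q)', [Query]),
    rpc(Query, 0, Limit, QueryAtom, Parts, Options).

% Fetch the slice starting at Offset and hand the answer to wait_answer/7.
rpc(Query, Offset, Limit, QueryAtom, Parts, Options) :-
    parse_url(ExpandedURI, [
        path('/ask'),
        search([query=QueryAtom, offset=Offset, limit=Limit, format=prolog])
      | Parts
    ]),
    setup_call_cleanup(
        http_open(ExpandedURI, Stream, Options),
        read(Stream, Answer),
        close(Stream)),
    wait_answer(Answer, Query, Offset, Limit, QueryAtom, Parts, Options).

% Remote errors are rethrown locally.
wait_answer(error(Error), _, _, _, _, _, _) :-
    throw(Error).
% No (more) solutions.
wait_answer(failure, _, _, _, _, _, _) :-
    fail.
% Last slice: enumerate its solutions and stop.
wait_answer(success(Solutions, false), Query, _, _, _, _, _) :- !,
    member(Query, Solutions).
% More slices exist: enumerate this one, then fetch the next on backtracking.
wait_answer(success(Solutions, true),
            Query, Offset0, Limit, QueryAtom, Parts, Options) :-
    (   member(Query, Solutions)
    ;   Offset is Offset0 + Limit,
        rpc(Query, Offset, Limit, QueryAtom, Parts, Options)
    ).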
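
To try the paging logic without a network, the HTTP round trip can be replaced by a local stand-in. The sketch below does that: fetch_slice/4 and paged_call/2-3 are hypothetical names, and fetch_slice/4 only imitates the three answer terms handled by wait_answer/7 (it never produces error(_)).

```prolog
:- use_module(library(solution_sequences)).  % limit/2, offset/2
:- use_module(library(lists)).               % member/2, append/3

% fetch_slice(+Goal, +Offset, +Limit, -Answer)
% Hypothetical local stand-in for one request to /ask. It asks for one
% extra solution to find out whether more slices exist.
fetch_slice(Goal, Offset, Limit, Answer) :-
    Max is Limit + 1,
    findall(Goal, limit(Max, offset(Offset, Goal)), Found),
    length(Found, Count),
    (   Count =:= 0
    ->  Answer = failure
    ;   Count > Limit
    ->  length(Solutions, Limit),
        append(Solutions, _, Found),
        Answer = success(Solutions, true)
    ;   Answer = success(Found, false)
    ).

% paged_call(+Goal, +Limit): solve Goal slice by slice, Limit at a time.
paged_call(Goal, Limit) :-
    paged_call(Goal, 0, Limit).

paged_call(Goal, Offset, Limit) :-
    fetch_slice(Goal, Offset, Limit, Answer),
    answer(Answer, Goal, Offset, Limit).

answer(failure, _, _, _) :-
    fail.
answer(success(Solutions, false), Goal, _, _) :- !,
    member(Goal, Solutions).
answer(success(Solutions, true), Goal, Offset0, Limit) :-
    (   member(Goal, Solutions)
    ;   Offset is Offset0 + Limit,
        paged_call(Goal, Offset, Limit)
    ).
```

A call such as paged_call(member(X, [a,b,c,d,e]), 2) enumerates the same solutions as the plain goal, but fetches them in three slices. A larger limit means fewer round trips at the cost of computing solutions that may never be asked for.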

The most important thing here is that the HTTP API is stateless. This is what makes it different from the HTTP API that’s available in library(pengines), and the pengine_rpc/2-3 that this library offers.

A naive implementation of the stateless HTTP query API is easy to build. For a request of the form <BaseURI>/ask?query=<Q>&offset=<N>&limit=<M>, the node computes the first N+M solutions, drops the first N, and responds with the remaining (at most M) solutions. Even a CGI script could do this, but it would be slow, not only because CGI is slow by nature, but also because such a naive approach may involve a lot of recomputation. Computing the first slice (i.e. the one starting at offset 0) is as fast as it can be, but computing the second slice involves recomputing the first and, more generally, computing the nth slice involves recomputing all preceding slices, the results of which are then just thrown away. This wastes resources and puts an unnecessary burden on the node.
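
The naive approach can be sketched in a few lines of Prolog. This is only an illustration of the recomputation problem, not the node’s actual code; naive_slice/4 and drop/3 are hypothetical names.

```prolog
:- use_module(library(solution_sequences)).  % limit/2
:- use_module(library(lists)).               % append/3

% naive_slice(+Query, +Offset, +Limit, -Slice)
% Computes the first Offset+Limit solutions from scratch and throws the
% first Offset of them away, so every request repeats all earlier work.
naive_slice(Query, Offset, Limit, Slice) :-
    UpTo is Offset + Limit,
    findall(Query, limit(UpTo, Query), Prefix),
    drop(Offset, Prefix, Slice).

% drop(+N, +List, -Rest): Rest is List without its first N elements,
% or [] if List has N elements or fewer.
drop(N, List, Rest) :-
    length(List, Length),
    (   Length =< N
    ->  Rest = []
    ;   length(Skipped, N),
        append(Skipped, Rest, List)
    ).
```

For a request with offset=5 and limit=5, this computes up to ten solutions and returns at most five of them.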

To implement the HTTP API in a way that makes it both stateless and scalable we use a novel kind of caching scheme that Jan W and I experimented with some years ago, and which has now been refined and reimplemented. It’s somewhat complicated, and I won’t describe it here unless you ask me to, but it seems to work just fine. One way to see that recomputation is avoided is to run the following query at http://one.prolog.computer:3010:

?- (sleep(0.9), X=a ; X=b).

You will then find that because of the call to sleep/1, the answer X=a will take around a second to appear, but when you ask for the second answer, it will appear immediately. This shows that the first disjunct isn’t recomputed.
