Sandboxing modules

This is a continuation of an offline discussion with Jan on approaches to sandboxing modules so they can be run on pengines (and SWISH). A common approach is declare any “safe” exported predicates, e.g., using sandbox:safe_primitive/1 . This works fine but relies on the programmer to ensure that such predicates are indeed safe. A finer grained approach focuses on identifying the used external (and internal) predicates which are needed to ensure the callable predicates in the module are “safe”. In doing this, the sources of potential unsafe behaviour are limited and more easily identified.

The sandbox library defines a whitelist of system level predicates which are safe, but it’s not a complete list so it’s necessary to add to the list as the need is identified. In using the above approach, I’ve identified a few additions (see below) to the whitelist which are desirable for safeing pack(clpBNR). These can be added to the pack itself so they’re not showstoppers, but as a general principle, I think it’s better not to be declaring safe predicates in other (system) modules.

Suggested whitelist additions:

  1. create_prolog_flag/3 - Note that set_prolog_flag/2 is safe if the flag is declared safe, so I don’t think there’s any reason not to be able to create one. (There appears to be some confusion that this predicate is already declared safe, but I can find no evidence of that.)

  2. debug/1 and nodebug/1 - these enable and disable debug topics, the net effect is to “Add/remove a topic from being printed.” They have global effect so some may consider them unsafe. I don’t see there’s much difference between debug topic enable/disable and prolog flags (in reality they seem to be more limited than flags in their effect), so I think they should be whitelisted.

  3. prolog_current_frame/1 - this is an item from hacker’s corner which is recognized as safe but not particularly useful without prolog_frame_attribute/3 which is unsafe. But if it’s a safe system predicate, why shouldn’t it be in the sandbox whitelist and limit the scope of any safe_primitive declarations in the module, in this case, just to those predicates using prolog_frame_attribute. (Note that there are other mechanisms for getting a prolog frame, so these two predicates are not always used together.)

I’d have to check, but I think the flag remains part of the thread, making it safe. As it can also be used to modify existing flags though, this gets tricky. The sandbox rule would have to check the flag is indeed undefined. But, what is the point of a safe-declared create_prolog_flag/3 besides using it as a directive in the file. In SWISH, we most libraries at startup as most do not pass the safety tests when loaded from a SWISH query.

Prolog flags are thread-specific (and inherited from the thread that created the new thread). Enabled debug topics are shared by all threads. So, if someone does

?- debug(_).

in a SWISH query, the system will get noisy for everybody :frowning: Simply using a thread_local/1 predicate rather than a shared dynamic/1 is not an option as debug/1 is quite often used to get messages from other threads. Another issue is that debug/1 allows for debug(Topic > File), but we can block that more easily.

I think that the debug/1 should probably get a thread-local alternative that can be defined safe, after which we can map debug/1 calls in SWISH to this alternative. An alternative could be to have a Prolog flag that says whether debug/1 should operate thread-local or shared. Probably worth investigating as being able to use these debug facilities in SWISH could be worthwhile.

I don’t think I understand. And, AFAIK, the only ways to get a reference to a frame is this, the prolog_trace_interception/4 hook (which you cannot use as a SWISH user) and indirectly from a prolog_frame_attribute/3 call. All of this is unsafe when the user gets control of how these are called. What is safe is to have a local predicate as -for example- has_parent_goal(mypred/1) that you can build using these primitives. Next, you declare this to be safe. Make sure this predicate is either completely safe or is not exported. Note that the stack introspection predicates allow users to do really weird things :slight_smile:

I have no issues with your comments so:

  1. Move any create_prolog_flag calls to a directive. (They were in a local predicate called at module initialization time.)

  2. Accept that debug(Topic) is deemed unsafe which will mean some clpBNR features (e.g., tracing and debug message control) will not be available when running on pengines. (Subject to future mods to make debug topics thread-local, which I think is a good idea.)

  3. prolog_current_frame was only called when the debug topic was enabled. Since that can’t be done on pengines, I can declare the internal predicate that used it as safe.

I’ve corrected my notion of a safe module as one whose “safe” exported predicates pass the safe_goal test and whose “unsafe” exported predicates generate an error. (Sounds obvious but it took me a while to get here.) So a simple test on a loaded module to generate the safe and unsafe exported predicates:

test_module_safety(M,Safe,Others) :-
	findall(M:H,predicate_property(M:H,exported),Exported),
	safe_goals_(Exported,Safe,Others).

safe_goals_([],[],[]).
safe_goals_([Goal|Goals],[Goal|Safe],Others) :-
	catch(safe_goal(Goal),_Err,fail), !,
	safe_goals_(Goals,Safe,Others).
safe_goals_([Goal|Goals],Safe,[Goal|Others]) :-  % failed `safe_goal`
	safe_goals_(Goals,Safe,Others).

test_module_safety produces two lists of goals for module M; Safe is a list of safe goals for all values of their arguments and Others contains goals which do not pass this test. Therefore, goals whose safety depends on particular instantiations of their arguments will be in the Others list as well as any strictly unsafe predicates, so the safety check for calls to these goals must be done dynamically.

So if I run this on a development version of clpBNR, I get an Others list of [clpBNR:trace_clpBNR(_), clpBNR:print_interval(_, _)]. trace_clpBNR/1 is unsafe because it depends on debug topics and the underlying implementation uses debugger predicates.

print_interval/2 outputs to a stream (via format/3) so that’s potentially unsafe depending on the value of its stream argument: print_interval(current_output,Term) is safe but print_interval(OpenStream,Term) is not, so that requires a runtime check.

Next topic: Using safe_meta/2 for meta predicates: clpBNR supports a hook for users to define their own custom interval arithmetic primitives. They do this by defining a goal in the clpBNR namespace (analogous to the way other hooks are defined, e.g., exception/3 in user, or safe_global_variable/1 in sandbox). So it’ll look something like:

clpBNR:my_primitive(`$op`, InputArgs, OutputArgs, Persistent) :-
    % implementation of 'my_primitive/4' ... .

When a constraint is “compiled” into the constraint network, the existence of this predicate is checked (via a “safe” application of predicate_property/2). When the constraint network is subsequently “evaluated” for consistency the user primitive is invoked, but this requires the use of call to invoke the custom primitive. A safe_meta declaration is required - what should it be? Perhaps:

call_user_primitive(Prim, P, InArgs, OutArgs) :-  % wraps unsafe meta call/N
	call(clpBNR:Prim, '$op', InArgs, OutArgs, P).

sandbox:safe_meta(clpBNR:call_user_primitive(Prim, P, InArgs, OutArgs), []).

This works fine but doesn’t really check if the user defined primitive is itself safe. I think the safety of hooks like this is generally covered by (from the sandbox source code):

%!  safe_assert(+Term) is semidet.
%
%   True if assert(Term) is safe,  which   means  it  asserts in the
%   current module. Cross-module asserts are   considered unsafe. We
%   only allow for adding facts. In theory,  we could also allow for
%   rules if we prove the safety of the body.

So there’s no way to inject unsafe code (as a user defined primitive) into the clpBNR namespace from outside (e.g., a SWISH user). So I think this is fine

My second question about meta-predicates is just to validate the correct usage of safe_meta in the following (from the clpBNR_toolkit module):

%
% General purpose iterator: execute Goal a maximum of N times or until Test succeeds
%
iterate_until(N,Test,Goal) :- N>0, !,
	Goal,
	N1 is N-1,
	(Test
	 -> true
	  ; iterate_until(N1,Test,Goal)
	).
iterate_until(_N,_,_).  % non-positive N --> exit

sandbox:safe_meta(clpBNR_toolkit:iterate_until(_N,Test,Goal), [Test, Goal]).

so any external calls to iterate_until/3 will only be allowed if Test and Goal are safe, again not my responsibility. (test_module_safety puts iterate_until/3 in the Others list, so calls using safe (runtime) Test and Goal will be permitted.)

Final topic: I have some examples of finding optima of linear systems that use library(simplex). The Others list for this module is [simplex:maximize(_, _, _), simplex:minimize(_, _, _)] due to an unnecessary use of meta calls in the internal predicate pivot_row

pivot_row([], _, Bounded, _, _, Row, Row) :- Bounded.  %%% here
pivot_row([Row|Rows], PCol, Bounded0, Min0, Index0, PRow0, PRow) :-
        Row = row(_Var, Left, B),
        nth0(PCol, Left, Ae),
        (   Ae > 0 ->
            Bounded1 = true,
            Bound is B rdiv Ae,
            (   Bounded0 ->                            %%% and here
                (   Bound < Min0 -> Min1 = Bound, PRow1 = Index0
                ;   Min1 = Min0, PRow1 = PRow0
                )
            ;   Min1 = Bound, PRow1 = Index0
            )
        ;   Bounded1 = Bounded0, Min1 = Min0, PRow1 = PRow0
        ),
        Index1 is Index0 + 1,
        pivot_row(Rows, PCol, Bounded1, Min1, Index1, PRow1, PRow).

Since the only usage of this predicate has an initial value of false, the meta call can be eliminated with two simple modifications:

pivot_row([], _, true, _, _, Row, Row).               %%% change to this
pivot_row([Row|Rows], PCol, Bounded0, Min0, Index0, PRow0, PRow) :-
        Row = row(_Var, Left, B),
        nth0(PCol, Left, Ae),
        (   Ae > 0 ->
            Bounded1 = true,
            Bound is B rdiv Ae,
            (   Bounded0 == true ->                   %%% change to this
                (   Bound < Min0 -> Min1 = Bound, PRow1 = Index0
                ;   Min1 = Min0, PRow1 = PRow0
                )
            ;   Min1 = Bound, PRow1 = Index0
            )
        ;   Bounded1 = Bounded0, Min1 = Min0, PRow1 = PRow0
        ),
        Index1 is Index0 + 1,
        pivot_row(Rows, PCol, Bounded1, Min1, Index1, PRow1, PRow).

Now test_module_safety returns an empty list for Others (i.e.,library(simplex) is totally safe and no runtime checking is required).

Think I’m finally ready to test all this on a pengine.

1 Like