Using monotonic tabling for RDFS entailment reasoning

SWI-Prolog got a new form of tabling some time ago: monotonic tabling. That remained in a rather experimental state for quite a while. It is now picked up again to deal with reasoning about C++ binary code together with CMU in the Pharos project.

To show the potential of this type of reasoning I wrote an RDFS entailment reasoner. We first translate the RDFS entailment rules to Prolog in the most straight forward way:

rdfs(P, rdf:type, rdf:'Property') :-                            % rdf1
    rdfs(_, P, _).
rdfs(X, rdf:type, C) :-                                         % rdfs2
    rdfs(P, rdfs:domain, C),
    rdfs(X, P, _).
rdfs(Y, rdf:type, C) :-                                         % rdfs3
    rdfs(P, rdfs:range, C),
    rdfs(_, P, Y).
rdfs(X, rdf:type, rdfs:'Resource') :-                           % rdfs4a
    rdfs(X, _, _).
rdfs(Y, rdf:type, rdfs:'Resource') :-                           % rdfs4b
    rdfs(_, _, Y).
rdfs(P, rdfs:subPropertyOf, R) :-                               % rdfs5
    rdfs(P, rdfs:subPropertyOf, Q),
    rdfs(Q, rdfs:subPropertyOf, R).
rdfs(P, rdfs:subPropertyOf, P) :-                               % rdfs6
    rdfs(P, rdf:type, rdf:'Property').
rdfs(X, Q, Y) :-                                                % rdfs7
    rdfs(P, rdfs:subPropertyOf, Q),
    rdfs(X, P, Y).
rdfs(C, rdfs:subClassOf, rdfs:'Resource') :-                    % rdfs8
    rdfs(C, rdf:type, rdfs:'Class').
rdfs(X, rdf:type, D) :-                                         % rdfs9
    rdfs(C, rdfs:subClassOf, D),
    rdfs(X, rdf:type, C).
rdfs(C, rdfs:subClassOf, C) :-                                  % rdfs10
    rdfs(C, rdf:type, rdfs:'Class').
rdfs(C, rdfs:subClassOf, E) :-                                  % rdfs11
    rdfs(C, rdfs:subClassOf, D),
    rdfs(D, rdfs:subClassOf, E).
rdfs(P, rdfs:subPropertyOf, rdfs:member) :-                     % rdfs12
    rdfs(P, rdf:type, rdfs:'ContainerMembershipProperty').
rdfs(X, rdfs:subClassOf, rdfs:'Literal') :-                     % rdfs13
    rdfs(X, rdf:type, rdfs:'Datatype').

% Link to the rdf triples:

rdfs(S, P, O) :-
    rdf(S, P, O).

Now, this won’t do much good in classical Prolog. The best you can do is create a fixed point algorithm that tries these rules and add all entailed new triples to the database. Repeat this process until no rule produces any new triple. If you add a new triple you must repeat this process to keep the entailed graph consistent. You can improve on that by ordering the rules to reduce the number of required iterations. That requires additional brain cycles from you though :slight_smile: .

Instead, we add these two declarations to the database (at the top of the file):

:- table rdfs/3 as monotonic.
:- dynamic rdf/3 as monotonic.

If we now add some rdf/3 triples to the database and ask for the entailed graph by querying rdfs/3 everything works by magic. This is because tabled evaluation watches for goals that are being evaluated that are a variant (equivalent goal after variable renaming) of the current goal. In that scenario it effectively reverts to bottom up evaluation while it only propagates new answers. This provides termination warants similar to Datalog: any program with a finite Herbrand universe terminates.

SWI-Prolog’s innovation is in the option “as monotonic”. Given this option the initial tabled execution constructs a dependency graph linking the dynamic data to the derived tables using continuations. As new facts are added to the dynamic predicates (rdf/3 here), these continuations are triggered and propagate the consequences (and possibly expand the dependency network if new rules can fire due to the new data). Effectively, our clean declarative rules have been transformed into an efficient forward chaining engine with no effort from us!

Prefixes

We borrow some code from SWI-Prolog’s Semantic Web (semweb) package to deal with RDF prefixes, making e.g., rdf:'Property' a plain atom at compile time (again, add above the rules):

:- use_module(library(semweb/rdf_prefixes)).

:- rdf_meta
    rdfs(r,r,o).

Experiment

I’ve carried out a small experiment using the AAT as RDF data (286,227 triples). First, we run rdfs/3, initializing the dependencies. Note that this has no solutions as there are no triples in our database.

?- rdfs(S,P,O).
 false.

Now we add our triples after reading them from RDF/XML:

?- load_rdf('aat.rdf', Triples), time(maplist(assertz, Triples)).
% 37,971,687 inferences, 6.301 CPU in 6.333 seconds (99% CPU, 6026215 Lips)

This results in 502,117 rdfs/3 answers. Now we can load the AAT schema:

?- load_rdf('aat.rdfs', Triples), time(maplist(assertz, Triples)).
% 12,198,444 inferences, 1.490 CPU in 1.491 seconds (100% CPU, 8185847 Lips)

Raising the number of rdfs/3 edges to 664,950. The memory used by Prolog after these steps is 766Mbytes.

If we reverse the order, first loading all rdf/3 data and then calling rdfs/3 to compute the entailment, the loading time is 0.7 seconds and the inference time is 4.9 seconds, resulting in the same 664,950 edges.
Batch computing is only 40% faster than incremental maintenance!.

Further steps

Transactions as implemented by transaction/1 provide isolation and atomic changes to the database. Consistent cooperation with tabling is experimental, but already works for the above scenario. This allows for efficient “what if” reasoning: add some rdf/3 facts inside the transaction, examine the entailed graph for certain properties and decide to commit or roll back.

Monotonic tables may be watched for new answers. The low-level mechanism using prolog_listen/3 is present. A more robust and simpler to use interface is under consideration.

Conclusions

Tabling greatly enhances the expressive power of Prolog for rule based reasoning. Monotonic tabling provides efficient maintenance of consistent results in a changing world given a monotonic rule set. SWI-Prolog also allows for “what if” reasoning and acting on newly inferred results.

Future work

Being limited to monotonic rules is a strong restriction and too strong for the Pharos project mentioned at the top. We are currently working on program transformation to transform a rule set with negation into two monotonic rule sets. One of these computes an upper bound while the other contains the negated answers.

Note

To work as advertised you need the next development release (8.3.14) or compile SWI-Prolog from source.

6 Likes

That is a good question. I think the efficiency comes from two places. The dependency graphs is between tables and (thus) based on call variants. If a new fact is added, either to the dynamic predicate or some dependent table, it is only forwarded over the matching variant dependency edges. Second, whether an answer is new or not is determined by the tables (tries). Unlike just running the native Prolog rules and see whether they produce any new answer, the propagation stops early if some intermediate result was already known.

And of course, clause indexing does its usual work for dealing with new calls to rdfs/3.

Note that I’m not claiming this is the most efficient (forward) reasoner ever. It is mostly flexible, easy to manage and fairly efficient. Maybe it is more. Consider the OpenRuleBench results comparing Prolog systems to various more dedicated rule engines. SWI-Prolog’s tabling is still relatively slow when compared to XSB, YAP, B-Prolog (Picat).

Should all be in today’s daily.