Dynamic subsumptive table/1 predicate

The docs say:

Although table/1 is normally used as a directive, SWI-Prolog allows calling it as a runtime predicate to prepare an existing predicate for tabled execution.

What they don’t say is how to do so and make it subsumptive - adding `as subsumptive`, as you would with the static directive, doesn’t work.

I’m afraid that is as yet not supported …

Ah, not just me then :slight_smile:

What I’ve done as a workaround is to set the global table_subsumptive flag before making the dynamic table/1 calls, then to set it back to the original value afterwards. Will that have the same effect?
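A sketch of that flag-based workaround, assuming the `table_subsumptive` Prolog flag controls the mode used by runtime table/1 calls:

```prolog
%!  table_subsumptively(+Spec) is det.
%
%   Call table/1 at runtime with the table_subsumptive flag
%   temporarily set, restoring the original value afterwards.
table_subsumptively(Spec) :-
    current_prolog_flag(table_subsumptive, Old),
    setup_call_cleanup(
        set_prolog_flag(table_subsumptive, true),
        table(Spec),
        set_prolog_flag(table_subsumptive, Old)).
```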

I could also make the table/1 calls into directives, but my app can sometimes need to delete & reload the data used to generate the tabling. I’m using abolish_module_tables/1 for that but I’m not clear if that deletes the table contents, or both the contents and the tables, hence the dynamic calls to table/1 after the abolish_module_tables/1. If there was a way to clear just the contents of a table so they are regenerated after the new data is loaded, I could use that, but I don’t know if such a thing exists?

Also, I’m assuming that subsumptive tabling is correct in the first place for my use case. The app loads data from an external file and asserts fact from it. A second level of predicate then derives from them, e.g. “If there’s a fact A and fact B with these attributes then C is true”. I have subsumptive tabling for all the Cs, which are invariant unless the source data is reloaded, in which case I want to clear and regenerate the tables.

I think I might be able to answer this part of my question myself :wink:

From looking more closely at the tabling docs, untable/1 entirely deletes the table definition and its contents whereas the various abolish_* predicates just delete the table contents. To me “abolish” is an unfortunate verb choice as it implies complete removal, “clear” might have been better, but whatever…
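In code terms (using p/2 as a stand-in for one of my tabled predicates):

```prolog
?- untable(p/2).                     % removes the tabling declaration *and*
                                     % any tables; p/2 now runs untabled
?- abolish_table_subgoals(p(_, _)).  % clears tables for matching subgoals;
                                     % they are rebuilt on the next call
?- abolish_all_tables.               % clears the contents of all tables
```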

I see. The abolish_* predicates abolish the tables. That is something different from the tabling declaration for the affected predicates, though. The tables are recreated when an appropriate call is made. Tabling adds a wrapper around a predicate that (simplified) does this:

  • If there is a table for the current variant (or we do subsumptive tabling and the variant is subsumed by some table)
    • If the table is complete, use it.
    • Else delay, waiting for the table to be complete
  • Else create a table, fill it (complete it) and generate answers from the (completed) table.
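The steps above as Prolog-ish pseudocode - the helper names here are made up for illustration; the real wrapper is internal:

```prolog
tabled_call(Goal) :-
    (   existing_table(Goal, Table)      % variant table, or a subsuming
                                         % table under subsumptive tabling
    ->  (   table_complete(Table)
        ->  table_answer(Table, Goal)    % enumerate answers from the table
        ;   delay(Table, Goal)           % wait for the table to be completed
        )
    ;   create_table(Goal, Table),
        complete_table(Table, Goal),     % evaluate Goal, filling the table
        table_answer(Table, Goal)
    ).
```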

If tables depend on dynamic data you normally define them as incremental. The system hooks into assert/retract and invalidates affected tables and incremental tables that depend on affected tables. If a call comes in it will reevaluate the invalidated tables.
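A minimal incremental setup, for illustration:

```prolog
:- dynamic fact/1 as incremental.
:- table derived/1 as incremental.

derived(Y) :-
    fact(X),
    Y is X*2.
```

After an assert(fact(3)) or retract(fact(_)), tables for derived/1 that depend on the changed clauses are invalidated and reevaluated on the next call.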

If you do not want to depend on this, simply abolish all affected tables. Note that you can abolish any table at any time without semantic implications: if it is needed it will be rebuilt.

If you know everything needs to be reevaluated abolishing all tables is probably ok. Incremental tabling comes with a lot of admin to track the dependencies. In many cases an assert/retract will only affect some tables and incremental tabling minimizes what needs to be reevaluated.

Except for interactively evaluating the effect of (not)tabling some predicate(s), there is never a need to call table/1 at runtime in your program. Most systems do not allow calling table/1 as a predicate.

Note that subsumptive tabling is more expensive. It is profitable in some scenarios, notably to reduce the number of tables. I’d always start with normal variant tabling and look only into alternatives if this uses too much memory or is too slow.


I load the data and assert it all up-front, then it’s static (no more asserts) until it’s thrown away and re-loaded (which isn’t always the case). I don’t need tabling to be active during loading, I can wait until after the data has all been asserted, and from what you say it sounds like that will be faster than incremental tabling.

So it sounds like what I should be doing is static table/1 definitions and then abolish_* if/when the tabled data is reloaded - great, that makes things nice and simple :slight_smile:

The system has a REPL that allows the loaded facts (~50k of them) and their dependents (~25k) to be queried interactively. Once the data is loaded I preload the tables by “touching” all the relevant predicates, e.g. with findall/3. I hide the tabling cost in system startup and I do it in parallel.

Subsumptive tabling makes otherwise annoyingly slow queries (10 sec or so) instantaneous as far as the user is concerned - it’s astonishingly good.
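A sketch of the parallel warming using concurrent_maplist/2 from library(thread) - warm_goals/1 and the goals it returns are placeholders for my real predicates:

```prolog
:- use_module(library(thread)).

% Goals whose first call forces the corresponding tables to be completed.
warm_goals([ ignore(symptom(_, _)),
             ignore(resource_symptoms(_, _))   % hypothetical grouping predicate
           ]).

% Run all warming goals concurrently at startup.
warm_tables :-
    warm_goals(Goals),
    concurrent_maplist(call, Goals).
```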

If I wanted to identify the difference in cost between normal and subsumptive tabling, what would be the best way?

Thanks for the advice, it’s much appreciated :slight_smile:

I see. Then subsumptive tabling makes sense. It may be slow though. If there are a large number of answers, selecting from the general table is a trie lookup. That works fine if the additional instantiated variables are first, but poorly if not. Consider this, which is I think what you are doing, no?

:- table p/2 as subsumptive.

p(X,Y) :-
    between(1, 1000000, X),
    Y is X+1.

w :-
    findall(_, p(_,_), _).

First of all, rather than findall/3 you can use forall(p(_,_),true). The way tabling works though, you can just as well call ignore(p(_,_)) because calling p/2 causes a completed table to be established before p/2 succeeds (or fails).
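That is, either of these warms the table; the second works because the first call to p/2 forces the whole table to be completed before p/2 succeeds:

```prolog
warm_forall :- forall(p(_, _), true).
warm_ignore :- ignore(p(_, _)).
```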

But now …

?- time(p(10000, X)).
% 10 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 517170 Lips)
X = 10001.

Good. But …

?- time(p(X, 10000)).
% 9 inferences, 0.030 CPU in 0.031 seconds (99% CPU, 296 Lips)
X = 9999 ;
% 7 inferences, 0.102 CPU in 0.102 seconds (99% CPU, 69 Lips)
false.

is slow and non-deterministic :frowning:

Try? It all depends on the use case …

If you want some more insight, you can install my-prolog-lib (JanWielemaker/my-prolog-lib on GitHub), which provides the libraries library(dump) and library(tstat) and a number of other personal goodies.

My first attempt would be no warming and normal tabling. If that is too slow, find the culprits and use subsumptive tabling and/or warming for some selective predicates.

The facts I’m asserting during data load look like this:

db:resource(Id:atom, Type:atom, Age:int, Profile:atom, Path:atom, Attrs:dict)

Where (Id, Age) is always unique and Age is either 1 or 2, representing the state of the resource identified by Id at different points in time. Type is drawn from an enumeration of < 100. Profile & Path are used for filtering results if required. Attrs is a dict containing the attributes of the resource. These facts are all created during data load, around 50k of them.
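For concreteness, an illustrative fact matching that signature (all values invented):

```prolog
db:resource(vm42, virtual_machine, 2, prod, '/eu/west/app1',
            _{cpus:4, mem_gb:16}).
```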

There’s another fact type that’s also created during data load which records relationships between two resources, but just for the most recent resource records:

db:link(+SrcId:atom, +SrcType:atom, +DstId:atom, +DstType:atom)

Finally, there is a second-level tabled predicate with multiple clauses, with an overall form similar to this (most are significantly more complicated, referring to attributes of multiple resources):

symptom(Id, 'Unused resource') :-
  db:resource(Id, Type, 1, _, _, _),
  \+ db:link(_, _, Id, Type).

It’s these symptom predicates that are tabled subsumptively. Symptoms are then grouped by resource so you can get a list of everything that’s wrong with a resource, that’s also tabled.
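The grouping could be a tabled predicate along these lines - a sketch; resource_symptoms/2 is a name I’m inventing here:

```prolog
:- table resource_symptoms/2 as subsumptive.

% Collect everything that is wrong with a given resource.
resource_symptoms(Id, Symptoms) :-
    db:resource(Id, _, _, _, _, _),
    findall(S, symptom(Id, S), Symptoms).
```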

Grouped symptoms are then filtered for reporting, e.g. “List all the symptoms of this sort where the resource type is A, the Profile is B and the Path contains the string ‘foo’.” Typically there will be ~25k symptoms; listing a subset of interest is fast when tabling is used.

Thanks for the masterclass & tips :slight_smile: I’ll use ignore/1, I’ll compare tabling without subsumption and I’ll take a look at your goodies library as well.