Is there an easy way to view "stack traces" with the interpreter?

That’s really neat. I’ll try something like this and let you know how it goes.

I’d be curious how to table such a thing, though. For a regular list, I know how to use answer subsumption, but not for a difference list.
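For context, answer subsumption over a regular list can look like the minimal sketch below (assumed names, not the actual predicates; SWI-Prolog's lattice/1 mode merges answers with a user-supplied join). I don't know of a way to do the same for a difference list, since its open tail would be a variable in the table.

```prolog
% Mode-directed tabling: the third argument is an answer-subsumption
% slot; lattice(shortest/3) merges competing answers with a join,
% here keeping the shorter of two derivation lists.
:- table path(_, _, lattice(shortest/3)).

shortest(P1, P2, P) :-          % join two candidate answers
    length(P1, N1),
    length(P2, N2),
    ( N1 =< N2 -> P = P1 ; P = P2 ).

path(X, Y, [X-Y]) :-
    edge(X, Y).
path(X, Z, [X-Y|P]) :-
    edge(X, Y),
    path(Y, Z, P).
```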

Also, if anyone likes topology and wants to see what this actually is, check out https://topology.pi-base.org. Contributing is a lot of fun (usually in the form of adding new properties/spaces or proofs for existing spaces to have some property). I am not a lead maintainer of the project, though, and so this is just an avenue of exploration and fun.

This is running great right now! Thank you to everyone who encouraged tabling. It made a gigantic difference.

I still wasn’t able to use a difference list; I used a regular old list that I built with append/3 (is that significantly less efficient? I bet tabling lessens the impact anyway). But even with that inefficiency, it’s still extremely fast, especially after the first run, once the tables are generated.

I noticed that I was running out of private table space, and I’m going to look for a flag to increase the table space limit. For my use in this project, though, I also found a workaround. I ran out of table space specifically when I was trying to find every object with a given property (i.e. trait(X, p000001, V)), which makes sense, considering that all of these theorems are quantified over the space X. So I have a space_declared predicate that tells me all the spaces in the database, and any findall query first calls space_declared to force Prolog to only consider those 200-ish spaces. I said I wouldn’t do it, but with tabling it’s not so bad.
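A minimal sketch of that workaround, with hypothetical names (the space_declared/1 facts stand in for the ~200 spaces):

```prolog
% Hypothetical facts: one per space in the database.
space_declared(s000001).
space_declared(s000002).

% Enumerate the declared spaces first, so trait/3 is only
% ever called with X bound and the tables stay small.
spaces_with_property(Property, Value, Spaces) :-
    findall(X,
            ( space_declared(X),
              trait(X, Property, Value)
            ),
            Spaces).
```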

Also, I may have found a SWISH bug? It doesn’t happen on the local interpreter: I get no errors there.

neg and many other predicates are reported as nonexistent, and yet can be called without a problem: neg(A, B). works just fine. The term expansions work too. Everything works, but there’s an error nonetheless.

Tabling does use memoization. Even with subsumption and modes, you are kind of enumerating and storing a larger number of Self-Avoiding Walks (SAWs).

@kuniaki.mukai was recently using hours and gigabytes of memory to compute something similar, i.e. Grid Graph Path Counting A007764; this series grows roughly like 2^(n^2).

Since you use triples trait(X,P,V), your work is also related to the Resource Description Framework (RDF).

I already suggested to @kuniaki.mukai to use the hard disk, in particular streaming algorithms. One could elaborate this idea again here, in case the use case is still statically cataloging something.

Thank you! Yeah, it would make sense that memoizing data would cause a lot of space to get used up.

In the long term, statically cataloging everything is, I think, the best idea; then tabling works flawlessly. It’s not possible right now, and I can’t do much about that, but it remains the goal.

Until then, I was able to reduce the memory usage by having two versions of trait/N: one that has a Derivation parameter and one that does not. The one without a Derivation parameter is called much more frequently in practice, and is used for basically all queries involving variables. The one with a Derivation is basically only called on atoms, and so tabling doesn’t memoize as much at any one time.
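A sketch of that split, with made-up theorem and property IDs; the derivation list built with append/3 mirrors the approach described earlier in the thread. The point is that trait/4 only ever sees ground first arguments, so its table holds far fewer variants at once:

```prolog
:- table trait/3.
:- table trait/4.

% trait/3: no Derivation; used for all queries with variables.
trait(X, p000145, true) :-
    trait(X, p000011, true),
    trait(X, p000018, true).

% trait/4: carries a Derivation list; only called on atoms.
trait(X, p000145, true, [t000030|Deriv]) :-
    trait(X, p000011, true, D1),
    trait(X, p000018, true, D2),
    append(D1, D2, Deriv).
```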

Thanks for all the time you took to discuss this :slight_smile:

It depends on the number of elements that have to be iterated through. As an example: if the tail of the list is known, you can append (to the start of a list) efficiently, e.g.:

?- T = [4, 5, 6], L = [1, 2, 3|T].
T = [4, 5, 6],
L = [1, 2, 3, 4, 5, 6].

Tabling is intriguing and not always intuitive if you come from “normal” Prolog. Unfortunately there is no good public text that explains how, and under which conditions, to use it. Some hints:

  • You must table predicates that are subject to “left recursion” (call a variant of the original goal). If it concerns mutual recursion (a → b → c → a), tabling one of these is enough to make the program behave.
  • You must table predicates for which you want sound (Well Formed Semantics) negation using tnot/1.

Those two are hard requirements. Tabling other predicates is optional, as tabling does not modify their semantics. It is only memoization, so you trade space for time. SWI-Prolog’s tabling overhead is (still) fairly large, so limit yourself to predicates that are slow to recompute. Use profile/1 to figure out your bottlenecks.

There are some funny things in tabling. If we have a recursive predicate, we normally first have the base clause(s) and then the recursive clause(s). Using tabling you can swap these around as it won’t loop anyway. In fact, it often leads to fewer variants stored in the table and thus less table space.
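For example, the classic left-recursive reachability predicate works under tabling even with the recursive clause first, and over a cyclic edge/2 relation that would send plain Prolog into an infinite loop (a minimal illustration, not code from the thread):

```prolog
:- table path/2.

path(X, Z) :- path(X, Y), edge(Y, Z).   % recursive clause first
path(X, Y) :- edge(X, Y).               % base clause last

edge(a, b).
edge(b, c).
edge(c, a).                             % a cycle
```

A query such as ?- path(a, X). terminates and enumerates the reachable nodes (b, c, and a, in some table-dependent order).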

You may wish to install JanWielemaker/my-prolog-lib (a personal SWI-Prolog utilities library on GitHub), which provides library(tdump) and library(tstat): toplevel utilities that show the space, time, and status of tabled predicates.

Thanks for sharing; now the problem domain becomes clearer. OK, you can reduce the explosion by not adopting what I did here, namely expanding and instantiating all the facts:

You could just add one single fact only:

trait('space', _, _, asserted).

And then if you derive a new theorem, replace 'space' by X.
This way the catalogue would first generate new Prolog rules
only, without instantiating them with Prolog facts.

Instantiating the many Prolog facts with the Prolog rules is probably responsible for a lot of the memory usage. All theorems only talk about one single subject, and so the derived theorems will also only talk about one single subject. This could satisfy a web site that shows derived theorems. You could still instantiate the derived theorems, where possible, in a second processing step, if applied theorems are required.

There’s a note about tabling and DCGs in the XSB documentation, concerning potential quadratic behaviour (this would also apply to anything that uses difference lists and possibly also to anything that uses regular lists and append/3):

I don’t know if this also applies to SWI-Prolog.

I think yes. XSB has a work-around for some of this called interning. That is not (yet) implemented in SWI-Prolog.

Proof tracking, or certificate generation, can be done much more easily with a tree. You would just replace a rule:

trait(X, p000145, true, t000030) :-
	trait(X, p000011, true),
	trait(X, p000018, true).

by a rule that generates the proof tree. But I don’t know how tabling-friendly this is. You would need to calculate the tree size to get the shortest proofs:

trait(X, p000145, true, t000030(P,Q)) :-
	trait(X, p000011, true, P),
	trait(X, p000018, true, Q).

Meaning you don’t need a DCG, or the I and O arguments, at all. The result will be a Curry–Howard formula, with types justified by custom compounds such as t000030(_,_). That is, one would have:

G |- T : trait(X, P, V) <=> G |- trait(X, P, V, T)

where G |- T : A is the typed-derivation notation. I guess that through forms of interning you can avoid copying trees during memoization and increase sharing.

All theorems talk about one single subject.

I should clarify that I am deliberately trying to avoid using this assumption. The main point of this implementation is to make it possible for this to not be true, which is something that I certainly wasn’t clear about. My bad!

I’m going to support at least relation(X, Y, P, V) so that you can express theorems like:

trait(X, lpc, true) :- relation(X, Y, open_subset, true), trait(Y, lpc, true).

If this feature were added, it would be used quite often, because theorems like this are a common scenario, as are proofs that use them. Having two variables might dash the hopes of not using lots of space, because now the search space itself is “quadratic” in a way, in addition to being more “entangled” (a proof of something for one space can cause you to generate a proof of something for another space), but that’s okay.

I think I get what you’re saying about this overall, though: have Prolog derive the theorems generally rather than immediately trying to instantiate them.

I don’t see how that helps, though, even in the case where we assume all theorems only have one variable in them—when you derive a theorem, you’d need to find the assumptions of the theorem, and that would lead to the same explosion (it’s necessary and common in pi-Base to have “A and B implies C”, and sometimes we even have a conjunction of three or four assumptions). But maybe it’s possible to derive, say, just theorems with a single assumption trait(X, p1, v1) :- trait(X, p2, v2), and that would still allow for a decent amount of sharing between theorems?

I also don’t quite understand what you meant when you said have only a single fact trait('space', _, _, asserted). This looks like you created a space which has every single property and every single value. I think that is what you intended to have, so that way this “fake space” could be used to derive any theorem. Does that sound at least somewhat right?

meant to reply to @peter.ludemann

Yes, it’s that. When you prove a goal, you prove a subgoal, which copies the subgoal’s list into a table (or at least copies the list after each append). So requesting a proof, and getting one, of length n, would seemingly cause the entire proof tree to be copied several times and result in n^2 complexity.

I need to spend some time thinking about XSB’s
solution.

But ignoring that for now, I think storing the proofs as a tree as @Johnny_Rotten suggested is a great idea. If Prolog lists are implemented as something like a linked list, then append is avoided, and the space complexity would become linear (that is, assuming that Prolog does not decide to make a deep copy of everything before putting it in the table).

Now the suggestion of not instantiating variables makes more sense—we basically want to store the implication graph, and then, for fixed spaces, the property will either be asserted (represented by a leaf of the tree), or we query the graph to find out what to try next.

I think you use a neat trick, i.e. encoding positive literals P and negative literals ~P by a value V=true or V=false. From a logical viewpoint your knowledge is not strictly Horn clauses. Then your contraposition step expands a rule:

A :- B1, .., Bn

into multiple rules, known as Bernard Meltzer’s trick, by adding:

~B1 :- ~A, .., Bn
...
~Bn :- B1, .., ~A
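As a concrete, made-up pi-base-style instance of this expansion, using the V encoding (the clause ~p000019(X) ∨ p000016(X), i.e. "p000019 implies p000016"):

```prolog
% Direct rule from the clause:
trait(X, p000016, true)  :- trait(X, p000019, true).

% Meltzer-style contrapositive added by the expansion:
trait(X, p000019, false) :- trait(X, p000016, false).
```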

So that ultimately your tabled backward chaining can deal with clausal knowledge and not only Horn-clause knowledge. Clausal knowledge means you deal with a knowledge base that is in a kind of Conjunctive Normal Form (CNF). An alternative approach to tabling would be to use a saturation-based resolution theorem prover, or even ASP, so that one would work directly with clauses:

%%%%% THEOREMS
kb([trait(_0, p000019, false), trait(_0, p000016, true)]).

I have no clue which approach is the better approach, quite
an interesting battle between titans.

It will. The table space is disjoint from the Prolog stacks. It must be as the data needs to survive backtracking. Tables are stored as tries. Actually, as a trie of tries, i.e., the outer trie stores goal variants and the inner tries store answers.

Possibly you can find some way to not store the entire proof, but individual nodes from which you can restore the proof? Not sure how, but that would be my line of thinking.

Yeah; I think that the current implementation does something entirely different from the approach I’m taking!

Got it. Hmm.

The problem is that the partial derivations do need to be tabled completely: otherwise the issue of cycle detection comes back.

Here would be a very beautiful world that probably has problems. I’m not going to implement it but I think the fact it should work is really cool.

Partial derivations would be stored like you’d expect. So implies(trait(X, p, v), trait(X, p2, v2), D) would have some value for D in the table: say

[theorem(trait(Y, p, v), trait(Y, intermediate, v)), theorem(trait(Y, intermediate, v), trait(Y, p2, v2))]

Here I’m using Y to represent a variable which is internal to the table.

Upon querying, we make a copy of this list with a fresh variable, and never unify the table variable Y with our input. Makes sense.

But in the table, if I look for implies(trait(X, intermediate, v), trait(X, p2, v2), D), then the table would have stored
[theorem(trait(Y, intermediate, v), trait(Y, p2, v2))]
with the same internal Y. That would allow for linear space complexity.

As far as I’m aware, this is sound, because every time we get a value from the table we deep-copy it anyway. And if we finish proving subgoal/n but goal/n later backtracks, oh well: no goal/n entry for the table, but subgoal/n’s entry is doing just fine. After each subgoal finishes, we copy its value out for our use, then use it, and then, if our main goal succeeds, (somehow) copy the main goal’s value, keeping track of the subgoal’s value within it, and re-unify that subgoal’s value with the value from the table. Then we put the main goal’s value (which has now been copied, so the only thing it shares with is the subgoal’s value from the table) into the table. And we’re still disjoint from the stacks.

I don’t know the most general terms in which this optimization can be phrased, but it’s interesting. Of course it costs an extra unification.

Of course, that isn’t the real world. So I’ll think about other options later.

Edit: Well when I put it like that, it seems pretty clear that nodes totally would work. Oops!

In fact, all a node would be is just the clause that was used (and perhaps the variable instantiations). So if I query some trait and the query succeeds, it provides me the clause that was used to derive the trait. From there, I call each node in the body, which calls each node in its body, etc., to reconstruct a tree. If the proof fails, no node is returned.
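A sketch of such a reconstruction loop, with invented names (derived/3 as the expanded theorem clause, asserted/1 for ground facts; not the actual implementation):

```prolog
% derived(Goal, ClauseId, Subgoals): one per theorem after
% expansion. Hypothetical example:
derived(trait(X, p000145, true), t000030,
        [trait(X, p000011, true), trait(X, p000018, true)]).

% Rebuild the full proof tree by chasing each node's subgoals.
proof_tree(Goal, leaf(Goal)) :-
    asserted(Goal).
proof_tree(Goal, node(Id, Goal, Subtrees)) :-
    derived(Goal, Id, Subgoals),
    maplist(proof_tree, Subgoals, Subtrees).
```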

Actually I was mistaken: optionally the %%%%% ASSERTED TRAITS can be used together with the %%%%% THEOREMS. One can use natural deduction:

    G |- A(c)
----------------- (∀ Intro)   c not in G
   G |- ∀X A(X)

    G, B |- A
----------------- (:- Intro)
   G |- A :- B

So to deduce a new rule ∀X (A(X) :- B1(X), .., Bn(X)), you would use an internal Y, which is indicated as c in the natural deduction rule above, and which I called ‘space’, and then proceed as follows:

/* Milkfoam Algorithm */
1) Add the facts B1(c),..,Bn(c) to the KB this gives KB’
2) Prove the goal A(c) from KB’
3) Now you can conclude ∀X (A(X) :- B1(X), .., Bn(X))

(∀ Intro) and (:- Intro) are typically found in λ-Prologs. The procedure doesn’t assure that B1(c),..,Bn(c) are consistent with the KB, i.e. that ~B1(c),..,~Bn(c) are each not derivable from the KB; one would need to test that separately to make the rule not vacuously true. And my idea trait('space', _, _) had mostly exactly this defect: it might happen that it somehow assumes V=true and V=false for the same property simultaneously.

Did it. Probably not the best job, but I did implement nodes. Not quite sure if it saves space for the average user.

The trick’s clever, though. Theorems are (after expansion) derived(trait(X, property, value), id, CallerInfo) :- stuff. ID is a (mostly) unique ID for the clause; CallerInfo is an out parameter containing the IDs of the next clauses you need to call to get a successful derivation. You can then write a loop that queries a theorem and follows the IDs in CallerInfo to reconstruct the derivation.

Because tables are stored based on call variants, though, I think you need to make sure the ID parameter is ground. I’ve certainly gotten incorrect results when the ID isn’t ground, so I just ground it.

I’ll let y’all know what the current implementation uses, since @Johnny_Rotten mentioned a lot of other algorithms. Probably it’ll be efficient to implement whatever’s being used now.
