Hi all,
We have started working on applying SWI-Prolog and Liftcover to knowledge graph completion; you can find a summary of our efforts at
In summary, we have not yet managed to beat AnyBURL in terms of the quality of the learned solutions.
But what surprised me was that SWI-Prolog was way slower than PyClause in computing the quality metrics, which require finding all solutions to queries over simple rule bases. PyClause is a Python interface to AnyBURL and AMIE and uses a C++ library for the metrics computation.
Can you give some details? How do you represent the KG in Prolog and compute the metrics? Also, what does PyClause use? Note that if you use HDTs you can produce many statistics quite fast.
I created a minimal working example for one task: computing the confidence of a set of rules.
You can find the files at GitHub - friguzzi/lp4kg: Logic Programming for Knowledge Graphs
In the folder wn18rr you can find the WordNet dataset in Prolog and PyClause formats.
You can associate a confidence with each rule by running
% swipl confidence.pl
and
% python confidence.py
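To make the task concrete without opening the repository, here is a minimal sketch of the kind of computation involved on the Prolog side. This is not the actual confidence.pl; the predicate names (triple/3, rule_confidence/5) and the example relations are placeholders I chose for illustration:

```prolog
:- use_module(library(aggregate)).           % aggregate_all/3
:- use_module(library(solution_sequences)).  % distinct/2

% Assumed KG representation: triple(Subject, Predicate, Object).
:- meta_predicate rule_confidence(+, ?, ?, 0, -).

% rule_confidence(+HeadPred, ?S, ?O, :BodyGoal, -Conf)
%   S and O are the variables shared between BodyGoal and the head triple.
rule_confidence(HeadPred, S, O, BodyGoal, Conf) :-
    % body support: distinct (S,O) pairs satisfying the body
    aggregate_all(count, distinct(S-O, BodyGoal), Body),
    Body > 0,
    % rule support: body groundings whose head fact is also in the KG
    aggregate_all(count,
                  distinct(S-O, (BodyGoal, triple(S, HeadPred, O))),
                  Support),
    Conf is Support / Body.

% Example query (relation names are just for illustration):
% ?- rule_confidence(hypernym, X, Y,
%        ( triple(X, derivationally_related_form, Z),
%          triple(Z, hypernym, Y) ),
%        Conf).
```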
This is what I get on my machine:
% time python confidence.py
python confidence.py 23,65s user 0,37s system 99% cpu 24,238 total
% time swipl confidence.pl
swipl confidence.pl 116,61s user 1,52s system 99% cpu 1:58,86 total
Your machine seems considerably faster than mine. The original takes 3m32s. After the above change and loading train.pl as a .qlf file I get 20 seconds, a little more than a 10-fold speedup (AMD 3950X: 16 cores, 32 threads). That is a little disappointing; I would have expected a speedup closer to 20.
Needs more analysis to see how fair this comparison is and whether it can be improved significantly.
The funny thing is that this is slower, but it gives a much better speedup under concurrent execution, so in total it is about the same on my hardware. Both could be improved: distinct/2 by using a low-level implementation of add_nb_set/3, and aggregate_all/4 by figuring out what causes the poor speedup.
Yes. Using your distinct2 with aggregate_all/3 is about the same as using aggregate_all/4. As I also found, this is faster, but scales less well when executed concurrently. The speed difference is partly due to the efficiency of sort/2, whereas distinct/2 uses a non-backtrackable hash table implemented in pure Prolog. Actually, I'm surprised it is as fast as it is. Using distinct/2 gives near-perfect speedup under concurrency, while the aggregate_all/4 version does not (getting a factor of 10 from 16 cores/32 threads). That means there is some interaction between the threads. I wonder where that comes from; findall/3 has been decoupled quite extensively as far as I know.
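For concreteness, the two counting styles being compared look roughly like this (a sketch; Witness and Goal are placeholders, and the comments reflect my reading of the respective implementations):

```prolog
:- use_module(library(aggregate)).           % aggregate_all/3,4
:- use_module(library(solution_sequences)).  % distinct/2

% (a) aggregate_all/4: collect the witnesses of all solutions,
%     deduplicate them with sort/2, then count.
count_sorted(Witness, Goal, N) :-
    aggregate_all(count, Witness, Goal, N).

% (b) distinct/2: filter duplicate witnesses on the fly using a
%     non-backtrackable hash table, then count.
count_distinct(Witness, Goal, N) :-
    aggregate_all(count, distinct(Witness, Goal), N).
```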
Note that instead of using :- include(train)., you had better use
:- ensure_loaded(train).
and first run
swipl qlf compile train.pl
That saves nearly a second. Do the same trick for the other larger include files.
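So, assuming the scripts currently include the data textually, the top of the file would become something like this (a sketch; file names other than train.pl are guesses):

```prolog
% Load the data files instead of textually including them.
% ensure_loaded/1 picks up train.qlf when it exists and is up to date.
:- ensure_loaded(train).
% Do the same for the other large data files, e.g.:
% :- ensure_loaded(valid).
% :- ensure_loaded(test).
```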
In general I'm not really surprised. PyClause is fast, but it is dedicated machinery for a single task, written in C++, as opposed to a generic machine. As we have seen, you can also speed things up by a couple of factors by writing the Prolog code optimally, and by some more by exploiting concurrency.
PyClause describes its algorithm for "standard confidence" as follows:
But this is just a conditional probability, expressed in frequentist terms:
$$
\frac{\#\{\, H\sigma \in KG \mid B_1\sigma,\ldots,B_n\sigma \in KG \,\}}
     {\#\{\, H\sigma \mid B_1\sigma,\ldots,B_n\sigma \in KG \,\}}
\;=\;
\frac{\#\left( KG \cap \{\, H\sigma \mid B_1\sigma,\ldots,B_n\sigma \in KG \,\} \right)}
     {\#\{\, H\sigma \mid B_1\sigma,\ldots,B_n\sigma \in KG \,\}}
$$
But this vague specification leaves a lot of room for how the solutions are deduplicated and counted, i.e. for distinct/2 & friends.
What does "approximation" mean for PyClause? I don't know about this, nor how it could explain a discrepancy. In the worst case PyClause really has a bug, maybe a C++ glitch in multi-threading?
Prolog is a little safer for multi-threading: you don't have shared-buffer communication, which is sometimes a bad idea when data doesn't get fetched immediately and is overwritten by new data, screwing everything up.
Thanks @jan .
Sorry, I forgot to use concurrent_maplist in finding instantiations.
I also precompiled the input files to .qlf.
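For reference, the change amounts to something like the following sketch, where find_instantiations/2 and the Rules list are placeholders for what computeinsts.pl actually does per rule:

```prolog
:- use_module(library(thread)).  % concurrent_maplist/3

% The change is essentially maplist/3 -> concurrent_maplist/3,
% so each rule's instantiations are computed on a separate core.
compute_all_instantiations(Rules, InstLists) :-
    concurrent_maplist(find_instantiations, Rules, InstLists).
```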
Now I have
% /usr/bin/time swipl computeinsts.pl > inst.pl
2,95 real 9,92 user 0,56 sys
which is much better, but still far from the 1,2 s for the whole process.
What do you think is the culprit? Do you think it is unification? In this case we only have constants, so maybe the full unification algorithm is overkill?
@friguzzi, when you talked about approximation in PyClause, did you mean that PyClause uses some sampling technique? This could also explain why it is much faster, not only why the results are different. Would it be possible to implement such sampling in Prolog as well? Didn't PRISM by Sato (I don't know which version) already offer some sampling in the past?
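Something along these lines would be a naive approximation in Prolog, reusing the placeholder triple/3 representation from the earlier sketch; I don't know what PyClause actually does, and note that limit/2 caps the groundings rather than sampling them uniformly:

```prolog
:- use_module(library(solution_sequences)).  % distinct/2, limit/2
:- use_module(library(apply)).               % include/3

:- meta_predicate approx_confidence(+, ?, ?, 0, +, -).

% approx_confidence(+HeadPred, ?S, ?O, :BodyGoal, +N, -Conf)
% Estimate the confidence of a rule from at most N distinct body
% groundings instead of enumerating all of them.
approx_confidence(HeadPred, S, O, BodyGoal, N, Conf) :-
    findall(S-O, limit(N, distinct(S-O, BodyGoal)), Pairs),
    length(Pairs, Body),
    Body > 0,
    include(head_in_kg(HeadPred), Pairs, Hits),
    length(Hits, Support),
    Conf is Support / Body.

% A sampled body grounding counts as a hit if its head fact is in the KG.
head_in_kg(HeadPred, S-O) :-
    triple(S, HeadPred, O).
```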