Parallelising or fact lookup -- does it make sense

grossdan · December 23, 2020, 3:36am

Hello,

I am defining a node and a link and a “superclass” NodeOrLink. To lookup if a node or link exists i am using three predicates:

node_exists(Node) :- node(Node).
link_exists(Link) :- link(Link, _, _).

and the generic one:

node_or_link_exists(X):-  node(X); link(X, _,_).

The latter is in the worst case two failed hash lookups in series. I am now wondering, what if I created a thread for each as so:

node_or_link_exists(X) :-
   must_be(atom, X),
   thread_create(node(X), Thread_1),
   thread_create(link(X, _,_), Thread_2),
   thread_join(Thread_1, Status1),
   thread_join(Thread_2, Status2),
  (Status1; Status2).

For a large kb with lots of nodes and links – would this be making sense to do to reduce hash lookup time?

Btw, if it does makes sense, then I am thinking that this could be improved – the thread_join wait for both to complete – but, if one completes with true, there is no need to wait for the other … not sure how to improve the code for that with the given thread API …

Edit:

Btw, another question is if there is anything gained - speed wise – if i unify node and link, into a new asserted fact – node_or_link/1 – creating a third lookup hash table – would the joined hash table lookups perform about the same as two run in parallel – i.e. anything significant to gain for the cost of another indexed fact.

Dan

jan · December 23, 2020, 8:31am

Creating and joining a thread is fairly expensive. Well, compared to looking up a fact. Depending on the OS and hardware you may get numbers like this (Ubuntu 20.04, AMD 3950X):

?- time(forall(between(1, 10 000, _), (thread_create(true, TID), thread_join(TID)))).
% 50,001 inferences, 0.220 CPU in 0.517 seconds (43% CPU, 226996 Lips)

Besides that, this approach turns the lookup into the equivalent of once/1, loosing logical properties.

That is possibly more interesting. It depends though. First of all on link/3. If the 2nd and 3rd arguments are large terms the call link(X,_,_) can be expensive. If they are merely atoms or integers it is not a big issue. Second, you get rid of the disjunction, making the call immediately fail or succeed deterministically. Depending on how it is used (does the choice point live long?), this can make a significant difference.

P.s. It would be interesting to be able to compile clause such as below in such a way that the 2nd and 3rd argument are not materialized.

link_exists(Link) :- link(Link, _, _).

grossdan · December 23, 2020, 9:15am

Indeed, that is what i am seeking to achieve … to remove the choice point that exists merely due to the generalization semantics. And, if there is already a choice point, then perhaps it can be optimized through parallism.

Dan

Topic		Replies	Views
Speeding up "ajdacent" fact lookup Help!	4	298	December 13, 2021
Improving performance for fact lookup of graph data; through a simple extension Request For Comments	3	432	January 29, 2020
Clause lookup with Reference vs term lookup w/ first argument Help!	5	347	November 25, 2019
Hybrid dynamic asserted and backtrackable predicate arguments Request For Comments	0	353	June 28, 2020
Insight -- meta programming Help!	3	488	March 6, 2019

Parallelising or fact lookup -- does it make sense

Related topics