Improving performance for fact lookup of graph data; through a simple extension

Hello,

I have a simple idea to improve performance for fact lookup, when facts describe links in a graph; and am wondering if this has merit.

What if a specialized atom and related data structure would be introduced that extends the current atom data structure to include a built-in list of pointers to other atoms; and a build-in predicate pointer/2 that retrieves (non-determinstically) atoms pointed to, such as so:

pointer(N, N_Next).

There would also be an assert_pointer/2 built-in predicate, such as: assert_pointer(n1, n2), that establishes a(nother) pointer from n1 to n2.

The idea is that a predicate lookup such as:

get_next(N, N_Next) :-
     pointer(N, N_Next).

With N bound to the atom n1, swi prolog would not use the usual hash lookup to get at pointer(n1, n2) (with an index pointing to functor, and arg1: n1; but would (non-deterministically) use the pointer list already stored in atom n1 to get to n2.

At its simplest, if N is not bound, then its illegal; and if N_Next is bound, then the predicate becomes a test whether n1 pointes to n2.

Alternatively, If N is not bound, then pointer/2 reverts back to a usual indexed lookup for N and behaves like a usual fact lookup. It fails if the related pointer fact was not asserted w/ assert_pointer.

If N_Next is bound, and N is not bound then then this could either revert to usual indexed lookup, or be illegal – to get to a doubly linked item one would need to assert another pointer(n2, n1).

pointer/2, would be a special purpose predicate to optimize basic lookup in linked atoms such as in graphs.

I think, this can’t be a simple external predicate since it would require extending / specialize the built-in data structure of an atom, to include a pointer list – to avoid the (costly) hash lookup within the external predicate to get at an externally defined pointer list from atom to atom.

Does this make sense?

thanks,

Dan

Hi,

I am now thinking that this can be simplified further.

Suppose, that its possible to add to an atom arbitrary data, say, a dedicated slot for a pointer, that can point to some memory that is allocated by, say, an external predicate – key is that the life span of the arbitrary data is bound to the at least as the life span of the atom (but can be shorter than that of the atom).

With such a facility, it seems that assert_pointer/2 and pointer/2 predicates could be developed as external predicates.

Dan

Looking up the source, it seems that this is the key data structure here:

Looks like an atom is “only” a pointer to an unsigned integer, and a term is a structure. So, i guess, this is highly optimized making it costly to add a pointer to an atom.

typedef uintptr_t atom_t; /* Prolog atom */

typedef union
{ int64_t i; /* PL_INTEGER /
double f; /
PL_FLOAT /
char * s; /
PL_STRING /
atom_t a; /
PL_ATOM /
struct /
PL_TERM */
{ atom_t name;
size_t arity;
} t;
} term_value_t;
https://github.com/SWI-Prolog/swipl-devel/blob/b932efe73a65e64cd03db35fa4c759484c425c71/src/SWI-Prolog.h#L185

As the comment says, this is for PL_get_term_value(). The atom struct is defined in pl-incl.h. But, you can create something atom-like with pointers using the blob interface. Alternatively you can look into the ffi pack, which provides memory management based on blobs in addition to calling arbitrary C functions in a dll/shared object.