New hashtable library

You shouldn’t need to do that. SWI-Prolog has garbage collection.

If you really insist, you could probably do something like:

setarg(1, HT, 0),
setarg(2, HT, 0),
setarg(3, HT, [](_)).

Thanks, that clears it up for me; although I didn’t expect it given that HTs are mutable structures.

It is just a Prolog term like all others and thus, when using threads, any thread gets a copy and any subsequent modification in the thread only applies to this specific copy.

Mutable, but backtrackable. This is the difference between setarg/3 and nb_setarg/3 (“nb” stands for “non-backtrackable”). You should probably read the whole section, “Non-logical operations on terms”.
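
A quick toplevel sketch of the difference (my own toy example; T bindings omitted):

?- T = f(a), ( setarg(1, T, b), fail ; true ), arg(1, T, X).
X = a.    % the backtrackable assignment was undone

?- T = f(a), ( nb_setarg(1, T, b), fail ; true ), arg(1, T, X).
X = b.    % the non-backtrackable assignment survived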

I had myself tried to read it carefully more than once and didn’t quite “get it” for a long time.

It would be nice if there were a consistent set of predicates for
hashtable
dict
assoc
rbtree

For example, rbtree lookup has the key as the 2nd arg; the others have it as the 1st. (I prefer it as the 2nd because that plays nicer with predicates like maplist/3.) Also, each of these libraries has a slightly different way of iterating or backtracking over all values. And there are little details, such as dict_create/3 probably being redundant. Etc.
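
To illustrate with ht_get/3 from library(hashtable), where the container comes first: that order allows partial application with maplist/3 directly, whereas get_assoc(Key, Assoc, Value) needs a wrapper or a lambda. A toy example (output abbreviated):

?- ht_new(HT), ht_put(HT, a, 1), ht_put(HT, b, 2),
   maplist(ht_get(HT), [a, b], Values).
Values = [1, 2].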

Also: any benchmarks on hashtable versus assoc/rbtree?

Database micro-benchmark

Ahh I see, this is what threw me off. I had a mental model that the hashtable was a blob object produced in C and it had to be “released” somehow.

Indeed we get inconsistent results with the following code:

ht_put1(HT,I) :-
   ht_put(HT,I,I).

ht_get1(HT,I) :-
   ht_get(HT,I,I).

ht_del1(HT,I) :-
   ht_del(HT,I,I).

loop(_,0) :- !.
loop(Goal,N) :-
   N > 0,
   call(Goal,N),
   N1 is N - 1,
   loop(Goal,N1).

Query:

15 ?- ht_new(HT), loop(ht_put1(HT),10), aggregate_all(count,ht_gen(HT,K,V),Cnt).
HT = ht(10, 32, [](_12110, _12112, _12114, _12116, _12118, _12120, _12122, _12124, _12126, _12128, _12130, _12132, _12134, _12136, _12138, _12140, _12142, _12144, _12146, _12148, _12150, _12152, _12154, _12156, 7, 7, 1, 1, 9, 9, _12170, _12172, 2, 2, _12178, _12180, 8, 8, _12186, _12188, 3, 3, _12194, _12196, _12198, _12200, _12202, _12204, 10, 10, 4, 4, _12214, _12216, 6, 6, _12222, _12224, _12226, _12228, 5, 5, _12234, _12236, _12238)),
Cnt = 10.

15 ?- ht_new(HT), loop([I]>>ht_put(HT,I,I),10), aggregate_all(count,ht_gen(HT,K,V),Cnt).
HT = ht(0, 0, [](_15340)),
Cnt = 0.

So the new hashtable library doesn’t work with library(yall).

EDIT: IMHO, this makes it less useful, and it would cause hard-to-debug problems if one is not familiar with the intricacies of the new hashtable library.
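
To make the failure mode concrete, this is (as I understand it) what effectively happens: the lambda is copied before it is called, so the destructive update lands on a copy of HT rather than on HT itself:

?- ht_new(HT),
   copy_term([I]>>ht_put(HT, I, I), Copy),   % roughly what yall does internally
   call(Copy, 1),
   ht_get(HT, 1, V).
false.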

Dear @peter.ludemann,

I have updated the micro-benchmark code to support hashtables and these are the results (for version 8.3.4-11-g1db629e24):

          Database raw and crude micro benchmark           
                     1,000,000 entries                     
Database  Operation                              Wall time
--------  ---------                              ---------
nb        1st lookup  .......................... 0.137 secs.
trie      1st lookup  .......................... 0.203 secs.
rec       1st lookup  .......................... 0.262 secs.
consult   1st lookup  .......................... 0.316 secs.
asrt      1st lookup  .......................... 0.801 secs.
hashtable 1st lookup  .......................... 2.317 secs.

nb        2nd lookup  .......................... 0.135 secs.
trie      2nd lookup  .......................... 0.201 secs.
rec       2nd lookup  .......................... 0.259 secs.
asrt      2nd lookup  .......................... 0.301 secs.
consult   2nd lookup  .......................... 0.309 secs.
hashtable 2nd lookup  .......................... 2.361 secs.

nb        insert      .......................... 0.236 secs.
trie      insert      .......................... 0.267 secs.
rec       insert      .......................... 0.294 secs.
asrt      insert      .......................... 0.432 secs.
hashtable insert      .......................... 7.498 secs.
consult   insert      .......................... 23.479 secs.

However, the hashtable numbers are skewed because I couldn’t use forall/2 and had to use a manual loop due to the backtracking nature of the hashtable library. The real numbers are probably better.
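
For reference, this is why forall/2 cannot be used here: it is essentially \+ (Cond, \+ Action), so the backtrackable ht_put/3 updates are rolled back when it backtracks. A small sketch:

?- ht_new(HT),
   forall(between(1, 10, I), ht_put(HT, I, I)),
   aggregate_all(count, ht_gen(HT, _, _), Cnt).
Cnt = 0.    % all insertions undone; the manual loop/2 keeps them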

The full benchmark code is here:

Micro-Benchmark code
:- dynamic keydb/2,
           trie/1,
           consult_db/2,
           consult_file/1,
           hashtable/1,
           db_bench/2.

:- use_module(library(assoc)).
:- use_module(library(apply_macros)).

go :-
   writeln('Database raw and crude micro benchmark'),
   N = 1 000 000,
   format('   -> Testing ~:d entries~n',[N]),
   retractall(db_bench(_,_)),
   test_db(rec,N),
   test_db(nb,N),
   test_db(asrt,N),
   test_db(consult,N),
   test_db(trie,N),
   test_db(hashtable,N),
   print_header(N),
   print_results.


test_db(Name,N) :-
   atomic_list_concat([Name,'_',gen],Gen),
   atomic_list_concat([Name,'_',lookup],Lookup),
   atomic_list_concat([Name,'_',cleanup],Cleanup),

   setup_call_cleanup(
      (  writeln(Gen),
         benchmark_goal( [Name,insert], call(Gen,N))
      ),
      (  writeln(Lookup),
         benchmark_goal( [Name,'1st lookup'], call(Lookup,N)),
         write(Lookup), writeln(' (2nd time)'),
         benchmark_goal( [Name,'2nd lookup'], call(Lookup,N))
      ),
      (  catch( call(Cleanup,N),
                error(existence_error(procedure,_),_),
	        true)
      )
   ) .




                         /**********************
                          *    Db operations   *
                          **********************/
rec_gen(N) :-
   forall(  between(1,N,I),
            (  recordz(I, I)
	    )
   ).

rec_lookup(N) :-
   forall(  between(1,N,I),
            (  recorded(I, I)
	    )
   ).

rec_cleanup(N) :-
   forall(  between(1,N,I),
            (  recorded(I, I, Ref), erase(Ref)
	    )
   ).



nb_gen(N) :-
   forall(  between(1,N,I),
            %(  atom_number(A,I), nb_setval(A, I)
            (  nb_setval('500', I)
	    )
   ).

nb_lookup(N) :-
   forall(  between(1,N,_),
            (  nb_getval('500', _)
	    )
   ).

nb_cleanup(_) :-
   nb_delete('500').



asrt_gen(N) :-
   %retractall(keydb(_,_)),
   forall(  between(1,N,I),
            (  assertz(keydb(I,I))
	    )
   ).

asrt_lookup(N) :-
   forall(  between(1,N,I),
            (  keydb(I,I)
	    )
   ).

asrt_cleanup(_) :-
   retractall(keydb(_,_)).



trie_gen(N) :-
   trie_new(Trie),
   assertz(trie(Trie)),
   forall(  between(1,N,I),
            (  trie_insert(Trie,I,I)
	    )
   ).

trie_lookup(N) :-
   trie(Trie),
   forall(  between(1,N,I),
            (  trie_lookup(Trie,I,I)
	    )
   ).

trie_cleanup(_) :-
   trie(Trie),
   trie_destroy(Trie),
   retractall(trie(_)).



consult_gen(N) :-
   File = '/tmp/consult_db.db_bench',
   assertz(consult_file(File)),
   setup_call_cleanup(
      open(File,write,Stream),
      consult_write_terms(Stream,N),
      close(Stream)
   ),
   load_files([File]).

consult_write_terms(Stream,N) :-
   forall(  between(1,N,I),
            (  format(Stream,'~w.~n',[consult_db(I,I)])
	    )
   ).

consult_lookup(N) :-
   forall(  between(1,N,I),
            (  consult_db(I,I)
	    )
   ).

consult_cleanup(_) :-
   consult_file(F),
   retractall(consult_file(_)),
   delete_file(F).


hashtable_gen(N) :-
   ht_new(HT),
   loop( ht_put1(HT), N),
   assertz(hashtable(HT)).

hashtable_lookup(N) :-
   hashtable(HT),
   loop( ht_get1(HT), N).

hashtable_cleanup(N) :-
   hashtable(HT),
   loop( ht_del1(HT), N),
   retractall(hashtable(_)).

ht_put1(HT,I) :-
   ht_put(HT,I,I).

ht_get1(HT,I) :-
   ht_get(HT,I,I).

ht_del1(HT,I) :-
   ht_del(HT,I,I).

loop(_,0) :- !.
loop(Goal,N) :-
   N > 0,
   call(Goal,N),
   N1 is N - 1,
   loop(Goal,N1).


                         /**********************
                          *    Nice printout   *
                          **********************/
print_header(N) :-
   nl,
   format('~t~w~t~59|~n',['Database raw and crude micro benchmark']),
   format('~t~:d entries~t~59|~n',[N]),
   format('~w ~t~10|~w ~t~12+~t~48| ~w~n',['Database','Operation','Wall time']),
   format('~w ~t~10|~w ~t~12+~t~48| ~w',  ['--------','---------','---------']).

print_results :-
   bagof( b(Db,W),
          S^(db_bench([Db,Type|_], S), get_dict(wall_time,S,W)),
	  Res),
   sort(2,'@<',Res,Sorted),
   nl,
   maplist( {Type}/[b(Db,Wall)]>>
            format('~w ~t~10|~w ~t~12+~`.t~48| ~3f secs.~n',[Db,Type,Wall]),
            Sorted),

   fail.

print_results.



                            /****************
                             *    Helpers   *
                             ****************/

% Measure, print and store wall and cpu time
% used by Goal.
benchmark_goal(BenchMark,Goal) :-
   get_time(OldWall),
   statistics(cputime, OldCpu),
   call(Goal),
   get_time(NewWall),
   statistics(cputime, NewCpu),
   UsedWall is NewWall - OldWall,
   UsedCpu  is NewCpu  - OldCpu,
   assertz( db_bench(BenchMark, stat{  cpu_time: UsedCpu,
                                       wall_time: UsedWall })),
   print_message(information, bench(UsedCpu, UsedWall)).

prolog:message(bench(Cpu, Wall)) -->
   { Perc is round(Cpu*100/Wall) },
   [ '~3f CPU in ~3f seconds (~w% CPU)' - [Cpu, Wall, Perc] ].


Interesting. I only compared the performance to the CHR hash tables. Here the result was close time-wise and much better memory-wise. Hash tables should also compare favorably to trees memory-wise. Note that using -O (optimize) may help, as hash tables require arithmetic while the others do not.
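
For completeness: besides passing -O on the command line, setting the optimise flag at the top of the benchmark file (before the benchmarked predicates are compiled) should have the same effect:

:- set_prolog_flag(optimise, true).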

The other potential advantage is that most of the code can fairly easily be moved to C.

Yes. Destructive assignment kills some of the beauty of Prolog. On the other hand, library(yall) (and foreach/2) performing a copy_term/2 on potentially large argument terms is not good either. I wonder whether that is avoidable, i.e., whether it could work without copying these terms.

Here is the result with -O:

          Database raw and crude micro benchmark           
                     1,000,000 entries                     
Database  Operation                              Wall time
--------  ---------                              ---------
nb        1st lookup  .......................... 0.143 secs.
trie      1st lookup  .......................... 0.192 secs.
rec       1st lookup  .......................... 0.258 secs.
asrt      1st lookup  .......................... 0.804 secs.
consult   1st lookup  .......................... 0.805 secs.
hashtable 1st lookup  .......................... 1.629 secs.

nb        2nd lookup  .......................... 0.151 secs.
trie      2nd lookup  .......................... 0.192 secs.
rec       2nd lookup  .......................... 0.262 secs.
consult   2nd lookup  .......................... 0.295 secs.
asrt      2nd lookup  .......................... 0.297 secs.
hashtable 2nd lookup  .......................... 1.608 secs.

nb        insert      .......................... 0.243 secs.
trie      insert      .......................... 0.284 secs.
rec       insert      .......................... 0.340 secs.
asrt      insert      .......................... 0.443 secs.
hashtable insert      .......................... 5.033 secs.
consult   insert      .......................... 20.782 secs.

It improves things, but the hashtable is way slower than any of the other options.

If the code is moved to C, and the hashtable becomes an SWI-Prolog blob, wouldn’t this solve the problem of term copying and, in addition, speed things up considerably?

EDIT: You already have a foreign hashtable in C, but this probably doesn’t backtrack, which I guess was your main point with library(hashtable).


Eliminating the need to copy the terms would be a great improvement, both semantically and also, probably, in terms of speed. I wonder, however, how this could be done.

Food for thought.

Perfect hash function

Using a perfect hash function is best in situations where there is a frequently queried large set, S, which is seldom updated.

Perhaps an option can be added to hashtable when the set is static or seldom updated. Obviously this would impact insert times negatively, but as we know the lookup times should improve and be consistent.


EDIT

In reading “Succinct Data Structures and Delta Encoding for Modern Databases” by Matthijs van Otterdijk, Gavin Mendel-Gleason, and Kevin Feeney:

Succinct data structures (Jacobson, 1988) are a family of data structures which are close in size to the information theoretic minimum representation.


EDIT

Source in C for minimal perfect hashing

Keeping the current data structure but doing most of the quite simple imperative work in C was what I was referring to. If you turn it into a blob you get a completely different beast that always needs to copy the keys and values and doesn’t backtrack. As is, neither keys nor values are copied. This may in fact change the figures considerably when keys and values are not just integers but large(r) terms.

Ideally you have the same semantics as the compiled version. This implies copying as well, but less. Whether that is possible, I do not know. Never really thought about that.

I pushed a branch foreach-no-copy that does this for foreach/2. This now happily works for hash tables. It reduces the overhead of foreach/2 significantly, although notably the pain of the findall/3 to collect all generator solutions still makes it slow. It also fixes a lot of the semantic issues with foreach/2. I have not merged this as I need some more time to be sure this really works. (ref) Test cases are welcome.
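
For example, a minimal plunit sketch (untested against the branch) based on the hash table example earlier in this thread:

:- use_module(library(plunit)).
:- use_module(library(hashtable)).
:- use_module(library(aggregate)).

:- begin_tests(foreach_no_copy).

% With the no-copy foreach/2 the destructive ht_put/3 updates should remain
% visible in the original table; the copying version yields a count of 0.
test(ht_foreach, [true(Cnt == 10)]) :-
    ht_new(HT),
    foreach(between(1, 10, I), ht_put(HT, I, I)),
    aggregate_all(count, ht_gen(HT, _, _), Cnt).

:- end_tests(foreach_no_copy).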


Reran the benchmarks using foreach/2 instead of forall/2.
With the old regular foreach/2:


          Database raw and crude micro benchmark           
                     1,000,000 entries                     
Database  Operation                              Wall time
--------  ---------                              ---------
trie      1st lookup  .......................... 0.252 secs.
rec       1st lookup  .......................... 0.305 secs.
nb        1st lookup  .......................... 0.772 secs.
asrt      1st lookup  .......................... 0.864 secs.
consult   1st lookup  .......................... 0.872 secs.
hashtable 1st lookup  .......................... 1.643 secs.

trie      2nd lookup  .......................... 0.251 secs.
rec       2nd lookup  .......................... 0.306 secs.
asrt      2nd lookup  .......................... 0.356 secs.
consult   2nd lookup  .......................... 0.359 secs.
nb        2nd lookup  .......................... 0.778 secs.
hashtable 2nd lookup  .......................... 1.616 secs.

nb        insert      .......................... 0.305 secs.
trie      insert      .......................... 0.339 secs.
rec       insert      .......................... 0.410 secs.
asrt      insert      .......................... 0.582 secs.
hashtable insert      .......................... 4.852 secs.
consult   insert      .......................... 21.153 secs.

With the new foreach/2:

          Database raw and crude micro benchmark           
                     1,000,000 entries                     
Database  Operation                              Wall time
--------  ---------                              ---------
nb        1st lookup  .......................... 0.406 secs.
trie      1st lookup  .......................... 0.497 secs.
rec       1st lookup  .......................... 0.614 secs.
consult   1st lookup  .......................... 1.084 secs.
asrt      1st lookup  .......................... 1.087 secs.
hashtable 1st lookup  .......................... 1.612 secs.

nb        2nd lookup  .......................... 0.412 secs.
trie      2nd lookup  .......................... 0.503 secs.
consult   2nd lookup  .......................... 0.581 secs.
asrt      2nd lookup  .......................... 0.589 secs.
rec       2nd lookup  .......................... 0.671 secs.
hashtable 2nd lookup  .......................... 1.585 secs.

nb        insert      .......................... 0.586 secs.
trie      insert      .......................... 0.610 secs.
asrt      insert      .......................... 0.836 secs.
rec       insert      .......................... 1.059 secs.
hashtable insert      .......................... 4.585 secs.
consult   insert      .......................... 21.302 secs.

So overall it doesn’t seem to be an improvement speed-wise.
EDIT: both runs were using the -O switch.


I guess the important question is, why was it initially implemented using copy_term/2?

Not entirely sure what you did, but the hash table test doesn’t work with the old foreach/2. It just inserts each element into an empty hash table. The new foreach/2 does get the correct result and is slower for it, as inserting into a growing hash table deals with collisions and needs resizing. Finally, forall/2 is a lot faster. It always will be 🙂

Because you need to call the different instantiations produced by the generator without backtracking (that is the whole point of foreach/2). The simple way to do that is like this, which obviously copies Goal using findall/3.

        foreach(Generator, Goal) :-
            findall(Goal, Generator, Goals),
            maplist(call, Goals).

The actual implementation was a bit smarter, using a findall/3 template based on the shared variables between Goal and Generator. Now we need to instantiate Goal with each instance of this template, but in Prolog we can instantiate a variable only once. So, a very dirty trick was needed that only reverts the instantiation of the template, preserving bindings created by Goal. I still need to think whether this is really correct. If anyone can shoot a hole in this idea, please do.
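
To make the template idea concrete: for foreach(between(1,4,X), dif(X,Y)) the only variable shared between Generator and Goal is X, so the generator solutions are collected as instances of a small template rather than as copies of the whole goal (the functor name below is just for illustration):

?- findall(v(X), between(1,4,X), Instances).
Instances = [v(1), v(2), v(3), v(4)].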


Merged into master. Docs:

foreach(:Generator, :Goal)
    True when the conjunction of instances of Goal created from solutions
    for Generator is true. Except for term copying, this could be
    implemented as below.

        foreach(Generator, Goal) :-
            findall(Goal, Generator, Goals),
            maplist(call, Goals).

    The actual implementation uses findall/3 on a template created from the
    variables shared between Generator and Goal. Subsequently, it uses every
    instance of this template to instantiate Goal, call Goal and undo only
    the instantiation of the template and not other instantiations created
    by running Goal. Here is an example:

        ?- foreach(between(1,4,X), dif(X,Y)), Y = 5.
        Y = 5.
        ?- foreach(between(1,4,X), dif(X,Y)), Y = 3.
        false.

    The predicate foreach/2 is mostly used if Goal performs backtrackable
    destructive assignment on terms. Attributed variables (underlying
    constraints) are an example. Another example of a backtrackable data
    structure is in library(hashtable). If we care only about the side
    effects (I/O, dynamic database, etc.) or the truth value of Goal,
    forall/2 is a faster and simpler alternative. If Goal instantiates its
    arguments it will often fail as the argument cannot be instantiated
    to multiple values. It is possible to incrementally grow an argument:

        ?- foreach(between(1,4,X), member(X, L)).
        L = [1,2,3,4|_].

    Note that SWI-Prolog up to version 8.3.4 created copies of Goal using
    copy_term/2 for each iteration.