New hashtable library

With the new library(hashtable) and the following code:

hashtable_gen(N) :-
    ht_new(HT),
    forall(between(1, N, I),
           ht_put(HT, I, I)),
    assertz(hashtable(HT)).

hashtable_lookup(N) :-
    hashtable(HT),
    forall(between(1, N, I),
           ht_get(HT, I, I)).

hashtable_cleanup(N) :-
    hashtable(HT),
    forall(between(1, N, I),
           ht_del(HT, I, I)),
    retractall(hashtable(_)).

I get the following error:

1 ?- hashtable_gen(100).
true.

2 ?- hashtable_lookup(100).
ERROR: Arithmetic: evaluation error: `zero_divisor'
ERROR: In:
ERROR:   [12] _62620 is 8085836 mod 0
ERROR:   [11] hashtable:ht_get(ht(0,0,[](_62674)),1,1) at /home/u/tmp/swipl-devel/build.release/home/library/hashtable.pl:264
ERROR:   [10] hashtable_lookup(100) at /home/u/bld/prolog/tmp/db_bench.pl:170
ERROR:    [9] <user>
3 ?- hashtable_cleanup(100).
false.

4 ?- 

Also, how do you delete the entire hashtable? There is ht_del/3 for deleting individual keys, but nothing obvious for dropping the whole table.

I was planning to look at this library more carefully, but I am in a hurry now:

I don’t think you should be using forall/2 for this. This is a backtrackable data structure?

So, if you use ht_pairs/2 instead:

:- use_module(library(hashtable)).

:- dynamic hashtable/1.

hashtable_gen(N) :-
    numlist(1, N, Xs),
    pairs_keys_values(Pairs, Xs, Xs),
    ht_pairs(HT, Pairs),
    assertz(hashtable(HT)).

hashtable_lookup(N) :-
    hashtable(HT),
    forall(between(1, N, X),
           ht_get(HT, X, X)).

and now:

?- hashtable_gen(100).
true.

?- hashtable_lookup(100).
true.

But I don’t like this code either, because I don’t see a good reason to “save” the hashtable like this.

Also, why “clean” the hashtable yourself? Just throwing it away and making a new one seems easy enough? (SWI-Prolog is garbage collected, right?)

If you have saved it like you did, I think retractall/1 should be good enough to get rid of it.

Disclaimer: this is now a bit in a hurry, I am almost certainly overlooking something.


OK, I think I was too hasty. If you actually want to test if you can add 100 values one by one, you’d have to write more “procedural” code, like this:

hashtable_gen(N) :-
    ht_new(HT),
    hashtable_add_n(N, HT),
    assertz(hashtable(HT)).

hashtable_add_n(0, _) :- !.
hashtable_add_n(N, HT) :-
    ht_put(HT, N, N),
    succ(N0, N),
    hashtable_add_n(N0, HT).
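As a quick sanity check of this procedural version (a sketch, assuming the definitions above plus library(hashtable) are loaded):

```prolog
?- hashtable_gen(100),
   hashtable(HT),
   ht_get(HT, 50, V).
V = 50.
```

Since hashtable_add_n/2 never backtracks over the puts, the entries survive, unlike in the forall/2 version.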

@jan or whoever else can help: I admit I do not quite understand the code snippet in the docs for ht_put/5.

This:

ht_put_list(HT, Key, Value) :-
    ht_put(HT, Key, [Value|Tail], [], Tail).

It says, “This can be used to bootstrap managing a list of values”. Do you mean a list of values associated with a key? So you add a new value to the front of the list already associated with this key?

Maybe SWI-Prolog should introduce SICStus do-loops. forall/2 is not declarative enough and foreach/2 isn’t extremely efficient.

SICStus do-loops would kill two birds with one stone, at least for those cases where the generator is supported by the construct.

Yes. I don’t even know how to use foreach/2 to emulate calling ht_put/3 N times in a row. How would you do it?

PS: the way I see it, forall/2 is the leakiest possible abstraction for \+ (Condition, \+ Action) (and it is not advertised as anything more than that!).

It also seems that there could be a helper predicate for adding a list of Key-Value pairs to an already existing hashtable (does it exist already? didn’t find it). Something like:

ht_put_list(HT, List) :-
    maplist(ht_put_pair(HT), List).

ht_put_pair(HT, Key-Value) :-
    ht_put(HT, Key, Value).

The semantics are up for discussion though… what should happen with duplicate keys? Replace? Fail? Add to a list? Maybe this is why it is “missing” :slight_smile:

I don’t think you want to assert hash tables in general.

This is a bug in ht_get/3. Pushed a fix. Thanks!

The docs say so :slight_smile: It is a term, just like library(assoc), etc.

Yes. More generally, you compose a new aggregated value from the already existing one, or from the IfNew argument if there is no existing one. You can also use this to count, as in

ht_put(HT, Key, New, 0, Old),
New is Old+1.
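For illustration, here is that counting idiom wrapped as a predicate (ht_count/2 is a hypothetical name, not part of the library):

```prolog
:- use_module(library(hashtable)).

% Increment a counter stored under Key. On the first put, Old
% unifies with the IfNew argument 0; New is bound afterwards and
% becomes the value stored in the table.
ht_count(HT, Key) :-
    ht_put(HT, Key, New, 0, Old),
    New is Old + 1.

% ?- ht_new(HT), ht_count(HT, a), ht_count(HT, a), ht_get(HT, a, N).
% N = 2.
```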

These loops are from Joachim Schimpf (ECLiPSe). I have an implementation based on the original paper (which is also used by SICStus). I played a bit and in general got quite confused by several corner cases. ECLiPSe seems to have improved on that later. The current code depends quite deeply on ECLiPSe though, so I left it for now. I’m not against logical loops, but mostly as a portability helper.

I thought it was as simple as this, but this does not work because foreach/2 copies the hash table.

?- ht_new(HT), foreach(between(1, 10, X), ht_put(HT, X, X)).

Not sure. As you state, there are several possible variations. These are dealt with using the ht_put variations, and we do not really want to duplicate that. Also, the much older library(assoc) has translation from pairs both ways, but no incremental adding of a list.


A do-loop wouldn’t copy. Unlike foreach/2, it does not copy at runtime but does a rewriting at compile time, using an auxiliary predicate. I think library(yall) also generates auxiliary predicates?

Another solution to implement do-loops would be to use engines for the generator. You could then realize the auxiliary predicate once, as an internal helper, and it would call ht_put/3 through an ordinary call, still not copying. But the engine might copy something from the generator, so be careful: if your generator were, for example, to enumerate hash tables, it would still not work. Back to do-loops. Picat has even more advanced loops.

Another option for the Prolog programmer could be to just write the loop manually, period. Or have a way to instruct foreach/2 to treat some variables as shared, as a kind of “global” outer loop variable instead of a “local” inner loop variable. This idea is still in its infancy, and the problem is again that at runtime foreach/2 might not see any variables anymore.

Edit 23.07.2020:
The engine thingy is an old idea; I never completed a realization of this potential use case. Maybe because of the copy step?

Yeah, sorry for the wrong attribution. Recently I am tempted to attribute things to SICStus Prolog when they seem to have originated in ECLiPSe Prolog.

But maybe they are just Prolog folklore; everybody uses the underlying loop pattern all the time.

Logical Loops, Joachim Schimpf, ICLP 2002
http://eclipseclp.org/reports/loops.pdf

library(yall) translates if the expression is found and translated by goal_expansion/2, and uses copy_term/2 otherwise. So, the semantics when dealing with destructive assignment in terms are rather poorly defined. Hash tables, being terms subject to destructive assignment, should be handled with care wrt. many high-level constructs, some of which make implicit copies.

Yes, I agree that for application code this is true. The code snippet I showed was simply for a micro-benchmark that I did for different data structures, and I wanted to add hash tables to it.

I thought it was a bug, thanks for the quick fix!

EDIT: Is there a predicate to completely remove the hashtable? i.e. the opposite of ht_new/1?

With the same code as above, now (after the fix) I am getting:

1 ?- hashtable_gen(100).
true.

2 ?- hashtable_lookup(100).
false.

3 ?- hashtable_cleanup(100).
false.

I didn’t expect queries 2 and 3 to fail. Am I missing something?

You cannot use forall/2 for this. You also cannot use foreach/2. You need to do the loop yourself, for example (quoting myself ;-):


Try the following, to see what happens:

?- ht_new(HT), ht_put(HT, a, 1).

and now try as if it was inside forall/2:

?- ht_new(HT), \+ ( \+ ht_put(HT, a, 1) ).

You shouldn’t need to do that. SWI-Prolog has garbage collection.

If you really insist, you could probably do something like:

setarg(1, HT, 0),
setarg(2, HT, 0),
setarg(3, HT, [](_)).

Can library(hashtable) be used together with concurrent_forall/2, like in read-only mode? Possibly yes, but I guess it would use n full copies, where n is the number of threads. Right?

Could a hash table be shared concurrently a little bit more? What about a concurrent_do/2; would such a beast exist?

Edit 23.07.2020:
I got a compromise in my system. There is not only balance/1, but also setup_balance/1. The difference in arguments is as follows:

balance/1: takes G, T, i.e. G=generator, T=test
setup_balance/1: takes S, G, T, i.e. S=setup, G=generator, T=test

The idea is that S does some setup, like a hashtable, for example the hashtable_gen/1 above. The main application scenario of setup_balance/1 is CLP(FD), where the setup can be a model setup. For a queens problem, you can set up n different queens sub-problems and then solve them in parallel, where n is the number of threads. The test would be the labeling, and it can use the generator value to tackle a sub-problem over the same local model, which is only generated once per thread.

Thanks, that clears it up for me, although I didn’t expect it given that HTs are mutable structures.

It is just a Prolog term like all others and thus, when using threads, any thread gets a copy and any subsequent modification in the thread only applies to this specific copy.

Mutable, but backtrackable. This is the difference between setarg/3 and nb_setarg/3 (“nb” stands for “non-backtrackable”). You should read the whole section probably, “Non-logical operations on terms”.
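A minimal illustration of the difference, assuming standard SWI-Prolog semantics:

```prolog
?- T = f(a), ( setarg(1, T, b), fail ; true ), arg(1, T, X).
X = a.    % the destructive assignment was undone on backtracking

?- T = f(a), ( nb_setarg(1, T, b), fail ; true ), arg(1, T, X).
X = b.    % nb_setarg/3 survives backtracking
```

This is exactly why forall/2 (which backtracks over its action) makes the ht_put/3 calls vanish.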

I had myself tried to read it carefully more than once and didn’t quite “get it” for a long time.

It would be nice if there were a consistent set of predicates for:

- hashtable
- dict
- assoc
- rbtree

For example, rbtree lookup has the key as the 2nd arg; the others have it as the 1st. (I prefer it as the 2nd because that plays nicer with predicates like maplist/3). Also, each of these libraries has a slightly different way of iterating or backtracking over all values. And little details, such as dict_create/3 is probably redundant. Etc.

Also: any benchmarks on hashtable versus assoc/rbtree?

Database micro-benchmark
