Load Factor in SWI-Prolog library(nb_set)

Just out of curiosity, I followed up on this lead:

  • Low Load Factor (< 0.5): Reduces collisions, so lookups and insertions are faster but at the cost of more memory overhead. Hash tables are less densely packed, making the average lookup time closer to O(1).

  • High Load Factor (> 0.7): Increases collisions, especially in hash tables with open addressing or linear probing, which can degrade lookup and insertion to O(n) in the worst case. It saves memory but may require more rehashing as the table fills up.
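In terms of library(nb_set), the load factor is simply the number of stored elements divided by the number of hash buckets. The helper below is not part of the library; it is only a small inspection sketch that assumes the internal representation also used by the experiment further down (bucket array in argument 1, element count in argument 2):

load_factor(Set, LF) :-
    arg(1, Set, Buckets),                       % argument 1: bucket array
    compound_name_arity(Buckets, _, BCount),    % number of buckets
    arg(2, Set, Size),                          % argument 2: element count
    LF is Size / BCount.

For example, 3 elements spread over 4 buckets give a load factor of 0.75.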

So I made a little test with SWI-Prolog library(nb_set):

/* Load Factor 1.0, Current Implementation */
?- time((between(1,1000,_), test, fail; true)).
% 34,150,149 inferences, 1.687 CPU in 1.744 seconds (97% CPU, 20237125 Lips)
true.

/* Load Factor 0.75, Experiment Below */
?- time((between(1,1000,_), test2, fail; true)).
% 30,188,706 inferences, 1.578 CPU in 1.641 seconds (96% CPU, 19129477 Lips)
true.

The test cases were random insertions:

test :-
   empty_nb_set(X),
   (   between(1,1000,_),
       random(Y), Z is floor(Y*1000),   % random integer key in 0..999
       add_nb_set(Z, X), fail
   ;   true
   ).

test2 :-
   empty_nb_set(X),
   (   between(1,1000,_),
       random(Y), Z is floor(Y*1000),   % random integer key in 0..999
       add_nb_set2(Z, X), fail
   ;   true
   ).
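As a back-of-the-envelope check: each run draws 1000 keys uniformly from 0..999, so the expected number of distinct keys per run is 1000 * (1 - (999/1000)^1000) ≈ 632, i.e. each set ends up holding roughly 630 elements.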

The experimental load factor of 0.75 was implemented with:

% Like add_nb_set/2,3 from library(nb_set), but rehashing at load factor 0.75.
add_nb_set2(Key, Set) :-
    add_nb_set2(Key, Set, _).
add_nb_set2(Key, Set, New) :-
    arg(1, Set, Buckets),
    compound_name_arity(Buckets, _, BCount),
    nb_set:hash_key(Key, BCount, Hash),   % internal hash helper of library(nb_set)
    arg(Hash, Buckets, Bucket),
    (   member(X, Bucket),
        Key =@= X
    ->  New = false
    ;   New = true,
        duplicate_term(Key, Copy),
        nb_linkarg(Hash, Buckets, [Copy|Bucket]),
        arg(2, Set, Size0),
        Size is Size0+1,
        nb_setarg(2, Set, Size),
        (   4*Size > 3*BCount              % rehash once Size/BCount exceeds 0.75
        ->  nb_set:rehash(Set)
        ;   true
        )
    ).

A little clue, at least. Thinking about it, I suspect that a closed table implementation would probably be both faster and use less memory.
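Assuming "closed table" here means closed hashing (open addressing), a minimal sketch of the idea could look like the code below. The names empty_oa_set/1 and add_oa_set/2 and the '$empty' free-slot marker are made up for illustration; the sketch uses linear probing over a flat compound and omits rehashing and deletion (it loops if the table fills up):

/* Sketch of a closed (open addressing) table with linear probing.
   Not a drop-in replacement for library(nb_set). */

empty_oa_set(oa_set(Slots, 0)) :-
    length(L, 32),
    maplist(=('$empty'), L),         % 32 free slots
    Slots =.. [slots|L].

add_oa_set(Key, Set) :-
    arg(1, Set, Slots),
    compound_name_arity(Slots, _, N),
    term_hash(Key, H),
    I0 is (H mod N) + 1,             % home slot
    probe(I0, N, Slots, Key, Set).

probe(I, N, Slots, Key, Set) :-
    arg(I, Slots, Slot),
    (   Slot == '$empty'             % free slot: store the key here
    ->  nb_setarg(I, Slots, Key),    % nb_setarg/3 copies Key for us
        arg(2, Set, Size0),
        Size is Size0 + 1,
        nb_setarg(2, Set, Size)
    ;   Slot =@= Key                 % already in the set
    ->  true
    ;   I1 is (I mod N) + 1,         % linear probing: try the next slot
        probe(I1, N, Slots, Key, Set)
    ).

Each element then occupies one slot of the flat compound instead of a list cell per bucket entry, which is presumably where the memory saving would come from.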