Unexpected garbage collections


#1

While doing some performance comparison in new 8.1.2 of tabling vs naive
caching, I’ve observed an unexpected behavior.

While in general naive caching is faster (I assume due to lack of
generality) this prolog source shows a very strange thing:

$ cat p.pl
:- table fib/2.

fib(0, 1) :- !.
fib(1, 1) :- !.
fib(N, F) :-
        N > 1,
        N1 is N-1,
        N2 is N-2,
        fib(N1, F1),
        fib(N2, F2),
        F is F1+F2.

:- dynamic fib_cache/2.

fibc(0, 1) :- !.
fibc(1, 1) :- !.
fibc(N, F) :-
        fib_cache(N, F),
        !.
fibc(N, F) :-
        N > 1,
        N1 is N-1,
        N2 is N-2,
        fibc(N1, F1),
        fibc(N2, F2),
        F is F1+F2,
        assertz(fib_cache(N, F)).

t1 :-
    abolish_all_tables,
    profile(fib(100000, _)).

t2 :-
    retractall(fib_cache(_,_)),
    profile(fibc(100000, _)).

$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.1.2)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit http://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

1 ?- [p].
true.

2 ?- t1.
=====================================================================
Total time: 1.100 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                       12 =       12+0        22.9%
$tbl_wkl_add_answer/2               100,001 =  100,001+0        16.8%
start_tabling/2                           1 =        1+0        12.9%
$tbl_variant_table/4                199,999 =  199,999+0        12.1%
is/2                                 99,999 =   99,999+0        11.5%
trie_gen/3                          199,999 =  199,999+0         5.8%
fib tabled/2                        100,001 =  100,001+0         2.8%
reset/3                             100,001 =  100,001+0         2.4%
$tbl_create_subcomponent/1          100,001 =  100,001+0         1.8%
$tbl_free_component/1                     1 =        1+0         1.5%
$tabling:work_and_add_answer/3      100,001 =  100,001+0         1.4%
$tabling:run_leader/5               100,001 =  100,001+0         0.9%
$tabling:delim/3                    100,001 =  100,001+0         0.8%
$tabling:done_leader/4              100,001 =  100,001+0         0.8%
$tbl_wkl_work/7                     100,001 =  100,001+0         0.8%
$tabling:completion/1               200,002 =  100,001+100,001   0.3%
$tbl_component_status/2             300,003 =  300,003+0         0.3%
$tbl_new_worklist/2                 100,001 =  100,001+0         0.3%
$tabling:add_answer_or_suspend/4    100,001 =  100,001+0         0.0%
$tabling:completion_step/1          200,002 =  100,001+100,001   0.0%
$tabling:early_completion/3         100,001 =  100,001+0         0.0%
$tabling:activate/4                 100,001 =  100,001+0         0.0%
$tabling:finished_leader/3          100,001 =  100,001+0         0.0%
fib/2                               199,999 =  199,999+0         0.0%
$sig_atomic/1                       100,001 =  100,001+0         0.0%
true.

3 ?- t1.
=====================================================================
Total time: 1.125 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                        9 =        9+0        22.8%
$tbl_wkl_add_answer/2               100,001 =  100,001+0        19.6%
$tbl_variant_table/4                199,999 =  199,999+0        15.7%
start_tabling/2                           1 =        1+0        12.2%
is/2                                 99,999 =   99,999+0         5.5%
trie_gen/3                          199,999 =  199,999+0         3.4%
$tbl_create_subcomponent/1          100,001 =  100,001+0         3.2%
fib tabled/2                        100,001 =  100,001+0         2.5%
$tabling:completion/1               200,002 =  100,001+100,001   2.3%
$tbl_table_complete_all/1           100,001 =  100,001+0         1.9%
$tabling:delim/3                    100,001 =  100,001+0         1.4%
$tbl_new_worklist/2                 100,001 =  100,001+0         1.3%
$tbl_pop_worklist/2                 200,002 =  200,002+0         1.1%
$tbl_free_component/1                     1 =        1+0         0.9%
$tabling:finished_leader/3          100,001 =  100,001+0         0.7%
$tabling:done_leader/4              100,001 =  100,001+0         0.6%
$tabling:work_and_add_answer/3      100,001 =  100,001+0         0.6%
$tabling:run_leader/5               100,001 =  100,001+0         0.4%
$tabling:activate/4                 100,001 =  100,001+0         0.3%
> /2                                  99,999 =   99,999+0         0.3%
$tbl_wkl_work/7                     100,001 =  100,001+0         0.3%
reset/3                             100,001 =  100,001+0         0.3%
$tabling:add_answer_or_suspend/4    100,001 =  100,001+0         0.0%
$tabling:completion_step/1          200,002 =  100,001+100,001   0.0%
$tabling:early_completion/3         100,001 =  100,001+0         0.0%
true.

3 ?- t1.
=====================================================================
Total time: 1.175 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                        9 =        9+0        21.6%
$tbl_variant_table/4                199,999 =  199,999+0        19.8%
$tbl_wkl_add_answer/2               100,001 =  100,001+0        16.2%
start_tabling/2                           1 =        1+0        10.2%
is/2                                 99,999 =   99,999+0         5.5%
trie_gen/3                          199,999 =  199,999+0         5.3%
$tbl_create_subcomponent/1          100,001 =  100,001+0         4.1%
$tbl_new_worklist/2                 100,001 =  100,001+0         2.7%
$tbl_free_component/1                     1 =        1+0         2.4%
$tbl_component_status/2             300,003 =  300,003+0         2.1%
$tbl_pop_worklist/2                 200,002 =  200,002+0         1.8%
$tabling:run_leader/5               100,001 =  100,001+0         1.5%
$tabling:completion/1               200,002 =  100,001+100,001   1.3%
$tabling:delim/3                    100,001 =  100,001+0         0.7%
$tabling:activate/4                 100,001 =  100,001+0         0.6%
$tbl_table_complete_all/1           100,001 =  100,001+0         0.6%
$tabling:completion_step/1          200,002 =  100,001+100,001   0.3%
$tabling:work_and_add_answer/3      100,001 =  100,001+0         0.3%
fib tabled/2                        100,001 =  100,001+0         0.3%
$sig_atomic/1                       100,001 =  100,001+0         0.3%
$tabling:add_answer_or_suspend/4    100,001 =  100,001+0         0.0%
$tabling:early_completion/3         100,001 =  100,001+0         0.0%
$tabling:done_leader/4              100,001 =  100,001+0         0.0%
$tabling:finished_leader/3          100,001 =  100,001+0         0.0%
fib/2                               199,999 =  199,999+0         0.0%
true.

3 ?- t2.
=====================================================================
Total time: 2.400 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                      850 =      850+0        85.1%
assertz/1                            99,999 =   99,999+0         7.3%
fib_cache/2                         199,996 =  199,996+0         5.3%
is/2                                 99,999 =   99,999+0         1.3%
fibc/2                              100,000 =        1+99,999    1.0%
> /2                                  99,999 =   99,999+0         0.0%
true.

4 ?- t2.
=====================================================================
Total time: 64.550 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                   22,357 =   22,357+0        99.0%
fib_cache/2                         199,996 =  199,996+0         0.6%
assertz/1                            99,999 =   99,999+0         0.3%
is/2                                 99,999 =   99,999+0         0.1%
fibc/2                              100,000 =        1+99,999    0.0%
> /2                                  99,999 =   99,999+0         0.0%
true.

4 ?- t2.
=====================================================================
Total time: 64.035 seconds
=====================================================================
Predicate                       Box Entries =    Calls+Redos     Time
=====================================================================
$garbage_collect/1                   22,357 =   22,357+0        99.1%
fib_cache/2                         199,996 =  199,996+0         0.5%
assertz/1                            99,999 =   99,999+0         0.3%
is/2                                 99,999 =   99,999+0         0.1%
fibc/2                              100,000 =        1+99,999    0.1%
> /2                                  99,999 =   99,999+0         0.0%
true.

4 ?-

While the first run of t2 runs in relative small time (although the GC
time is very dominant and run a lot of times), in next runs the GC seems
to go crazy and is triggered all the time.


#2

Interesting case. It clearly indicate GC scheduling needs to be revisited. In part this seems related to the change to combine the stack limits into a single limit. Fixing that only provides a modest fix though. Stay tuned.

Disabling GC (set_prolog_flag(gc, false)) gives on my internal version 0.9 sec for the assert and 1.1 sec for the tabled version. I’d expect the tabled version to win in most scenarios shortly. If, in due time, this will be combined with incremental tabling we should be able to use tabling as the primary mechanism to avoid recomputation.


#3

FYI. I’ve started adding some GC statistics collection and a hook that is called when GC takes more than (currently) 20% of the time. That gave some insight. What happens in the function is that there is a deep recursion and at the end all the arithmetic on quickly growing numbers is done. These big numbers fill the global stack rapidly, but except for the last couple, they can all be garbage collected. This makes the system happy to keep using GC. The problem is that each GC is expensive as it has to work itself through the (initially 100,000 levels) deep environment stack.

The new hook needs some policing. When done, it should allow is to define GC policy using Prolog rules and enforce policies to optimize for space, time or some balance. That was something on my agenda for a very long time but never reached the shortlist.

Stay tuned :slight_smile:


#4

The latest version on git now considers the sizes of the other stacks when deciding on a GC or stack expansion. That brings the GC overhead down to 30% on the t2 test.

There is also some new infrastructure to make controlling GC dynamically accessible from Prolog, but this is too immature to activate. If someone wants to explore this I’m happy to explain the state and a very tentative implementation of the hook.