Memory accumulation on backtracking

I’m searching a large graph in a depth-first fashion, and the memory usage accumulates until I’m out of memory. I don’t really understand why. Here is some simple code for a depth-first search to a finite depth:

walk(0, [Node|_], Node) :- !.          % depth exhausted: report the current node
walk(N0, [Node|T], Final) :-
    N is N0 - 1,                       % one less step of depth remaining
    connect(Node, Next),               % follow an edge
    \+ memberchk(Next, T),             % Next must not already be on the path tail
    walk(N, [Next,Node|T], Final).

which I invoke like this, with the aim of counting all the final nodes:

aggregate_all(count, Final, walk(64, [Start], Final), Count).

Why does it run out of memory? How can I write it so that it does not?

Oops, I feel responsible, since I was beating the drum for aggregates!

Because you use aggregate_all/4 and not aggregate_all/3.
aggregate_all/4 is still based on findall/3 and doesn’t optimize count;
it’s not implemented via a state and a failure-driven loop.
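
For comparison, here is a minimal sketch (not the library’s code; count_solutions/2 is a made-up name) of how counting can be done via a state term and a failure-driven loop, so that no list of solutions is ever materialized:

count_solutions(Goal, Count) :-
    State = state(0),
    (   call(Goal),               % enumerate solutions by backtracking
        arg(1, State, C0),
        C is C0 + 1,
        nb_setarg(1, State, C),   % destructively bump the counter
        fail                      % force the next solution
    ;   arg(1, State, Count)      % all solutions exhausted: read the counter
    ).

For example, ?- count_solutions(member(_, [a,b,c]), N). yields N = 3.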

SWI-Prolog is open source, so just inspect the implementation;
the Prolog code for aggregate_all/4 starts at line 265:

GitHub Repository - SWI Devel
https://github.com/SWI-Prolog/swipl-devel/blob/1cb3a63efb08819f453432d760c7c83cab017cbb/library/aggregate.pl

Edit 23.12.2023
One way around the fact that library(aggregate) is not always optimized would
be to use mode-directed tabling in SWI-Prolog. With mode-directed tabling
you can, for example, get an aggregate/3 without a findall/3 under the hood.

Another way out would be an improved library(aggregate).


Would you mind providing a short example of how I could achieve a count of the unique final nodes using mode-directed tabling?

An alternative is to just not use backtracking, by having cuts all over the place and managing the stack myself, but that seems to defeat the point of using Prolog…
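
To make that concrete, here is a rough sketch (made-up names, using library(assoc) to hold the set of final nodes; connect/2 as above) of what I mean by managing the stack myself with an explicit agenda instead of backtracking:

:- use_module(library(assoc)).

% Depth-Path pairs on an explicit agenda replace the backtracking;
% unique final nodes are collected in an assoc and counted at the end.
walk_count(Start, MaxDepth, Count) :-
    empty_assoc(Seen0),
    walk_agenda([MaxDepth-[Start]], Seen0, Seen),
    assoc_to_keys(Seen, Finals),
    length(Finals, Count).

walk_agenda([], Seen, Seen).
walk_agenda([0-[Node|_]|Rest], Seen0, Seen) :-
    !,
    put_assoc(Node, Seen0, true, Seen1),       % record a final node
    walk_agenda(Rest, Seen1, Seen).
walk_agenda([N0-[Node|T]|Rest], Seen0, Seen) :-
    N is N0 - 1,
    findall(N-[Next,Node|T],
            ( connect(Node, Next), \+ memberchk(Next, T) ),
            Children),
    append(Children, Rest, Agenda),            % depth-first: children first
    walk_agenda(Agenda, Seen0, Seen).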

I don’t know exactly. But you can use the “sum” mode with the value
1 to achieve counting. A declaration like this:

:- table p(_, sum).

This gives you a grouping on the first argument, summing over the second. It basically
makes p/2 behave like the following, except that the result is memoized as well:

p(X,Y) :- aggregate(sum(Z), q(X,Z), Y).

where q/2 holds what would otherwise be the definition of p/2.
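
For instance, with some made-up q/2 facts the free first argument acts as a grouper, and p/2 enumerates one sum per group on backtracking:

q(apple, 2).
q(apple, 3).
q(pear,  1).

?- p(X, Sum).
X = apple, Sum = 5 ;
X = pear, Sum = 1.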

Edit 23.12.2023
aggregate/3 is very much designed after SQL, although one can
also understand it in terms of bagof/3. aggregate/3 shares a property
with SQL, namely that the non-aggregated result variables work as “groupers”:

SELECT ProductName, SUM(Price)
FROM Sales
GROUP BY ProductName

Isn’t the GROUP BY clause in SQL optional? Wouldn’t a ProductName grouping
be the default if it appears in the SELECT? I don’t remember exactly.

OK, managing the stack myself looks neater to be honest, so I’ll just do that. I also strongly suspect that tabling will explode the memory too.

No, because it doesn’t table q/2:

Mode-directed tabling tables after the aggregate: it will table p/2,
not before the aggregate, so it will not table q/2. You need to add a
predicate with some new name, and not place the table/1 directive
before walk/3. If you use a table directive for q/2, it will
probably explode, but that is not what I was suggesting.

How would I do that with the walk predicate above to count unique final states? A final state is a function of the path that led to it. How would I avoid tabling paths, so as not to explode the memory?

What’s your start value?

:- table walk_grouped_and_counted(_, sum).

walk_grouped_and_counted(Final, 1) :- walk(64, [start], Final).

?- walk_grouped_and_counted(Final, Count).

Or give walk_grouped_and_counted a larger arity and pass some parameters?
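
For example (just a sketch with made-up argument names), the depth and the start node could be passed as extra non-moded arguments, so they become part of the grouping as well:

:- table walk_grouped_and_counted(_, _, _, sum).

walk_grouped_and_counted(Depth, Start, Final, 1) :-
    walk(Depth, [Start], Final).

?- walk_grouped_and_counted(64, start, Final, Count).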

Edit 23.12.2023
Works fine; here is a small test case:

walk(64, [start], stop1) :- between(1,1000000,_).
walk(64, [start], stop2) :- between(1,500000,_).
walk(64, [start], stop1) :- between(1,500000,_).

I then get:

?- walk_grouped_and_counted(Final, Count).
Final = stop2,
Count = 500000 ;
Final = stop1,
Count = 1500000.

It didn’t give me a Prolog stack overflow. But maybe I didn’t try hard enough?


Marvelous, thank you. I’ll try it.