Stack overflow problem

I’m running a very complex project that generates thousands of CHR objects in addition to occasionally having quite deep Prolog calls. I’m discovering that if I run a certain process repeatedly, identical each time, it will succeed, delete all information, succeed, delete all information, stack overflow - catch exception, delete all information, succeed.

error(resource_error(stack),stack_overflow{choicepoints:529,depth:924,environments:45,globalused:523916,localused:755,stack:[frame(924,system: $collect_findall_bag(_38758826,),),frame(922,system:setup_call_catcher_cleanup((:)/2,(:)/2,_38758850,(:)/2),),frame(918,cp:class_get_item(56,qmc_plate_assembly,_38758892),),frame(917,cp:crgi([3],56,[1],_38758916,_38758918,_38758920),),frame(915,$bags:findall_loop(step/3,(:)/2,_38758954,),)],stack_limit:1048576,trailused:178046})
ERROR: FAILED REQUEST (load_model(kb:project_extended))

Here’s the top-level loop, which is repeatedly receiving and executing the load_model request (and deleting the model). except/1 just converts exceptions to fail/0.

> % Receive one request, dispatch it, send the reply, then loop.
> % except/1 turns any exception raised by its goal into failure.
> lazy_service0 :-
>     (   except(receive_term(Term))
>     ->  (   except(dispatcher(Term, Response))
>         ->  (   except(send_term(Response))
>             ->  (   Term == stop
>                 ->  fail
>                 ;   lazy_service0
>                 )
>             ;   writeln(failed_to_send(Response))
>             )
>         ;   writeln('ERROR: FAILED REQUEST '(Term)),
>             send_term(fail),
>             lazy_service0
>         )
>     ;   true    % receive failed, e.g. client disconnect
>     ).
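
The definition of except/1 is not shown in the post. A minimal sketch consistent with the description (exceptions become failure) might look like the following; printing the caught term with writeln/1 is an assumption made only so the error stays visible.

```prolog
:- meta_predicate except(0).

% except(:Goal): run Goal; if it throws, print the error and fail.
% Hypothetical reconstruction, not the poster's actual code.
except(Goal) :-
    catch(Goal, Error, ( writeln(Error), fail )).
```

Note that catch/3 is transparent to choice points, so a wrapper like this keeps whatever choice points Goal leaves behind.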

I don’t know what to do about this. It’s not the sort of thing I can make a small test case for because reducing the size will remove the overflow for sure.

I don’t understand why the stack would grow following repeated execution, or why a caught exception on one loop would clear it…

Maybe I can increase some stack sizes in the process?

Other suggestions?

Could it be that you are leaving behind unnecessary choice points? How do you run this code, even for a very small input? The dangling choice points will still be there, so at least in theory it should be possible to troubleshoot by getting rid of the choice points.
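
One way to test for dangling choice points on a small input is the call_cleanup/2 idiom: the cleanup goal runs as soon as the wrapped goal exits deterministically, so an unbound flag after the call means a choice point was left. The helper name check_det/2 below is made up for this sketch.

```prolog
:- meta_predicate check_det(0, -).

% check_det(:Goal, -Det): run Goal once. Det is true if Goal exited
% without leaving a choice point, false otherwise.
% Fails if Goal fails; exceptions from Goal are passed through.
check_det(Goal, Det) :-
    call_cleanup(Goal, Exit = true),
    (   Exit == true
    ->  Det = true
    ;   Det = false
    ),
    !.   % discard any choice point Goal left behind
```

For the loop in the question, something like `check_det(dispatcher(Term, Response), Det)` with a small request would show whether the dispatcher itself is deterministic.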


Possibly you are simply close to the default limit. Depending on the history, the GC/shift sequence can differ, so the system may work fine on one run and run out of stack on the next. Try raising the limit first. Boris's comments on reducing memory usage are of course valid. Another useful tool is the graphical thread monitor, which shows live stack usage.
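
As a sketch of raising the limit: the `stack_limit` Prolog flag holds the combined stack limit in bytes for the calling thread, and the `stack_limit:1048576` in the error above looks like the 1 GB default expressed in kilobytes. The 4 GB value here is only an example and assumes a 64-bit build.

```prolog
% Raise the combined (local + global + trail) stack limit to 4 GB.
% The value is in bytes; pick whatever the machine can afford.
:- set_prolog_flag(stack_limit, 4_294_967_296).
```

The same limit can be set at startup with `swipl --stack-limit=4g`.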


Graphical thread monitor. Did not know there was such a thing.
Found it in the menu, but failing to find any documentation on it.
Do I understand rightly that the CPU line is NOT a memory line, but the other 3 are?

I added a call to garbage_collect in between runs. The memory is now stabilized at about 30M Global usage with no model loaded, and 200M Global usage with a model loaded. It would be interesting to know what exactly is consuming all that storage.
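
A minimal sketch of that kind of between-run check, using garbage_collect/0 and the `globalused`, `localused`, and `trailused` keys of statistics/2 (all reported in bytes). The predicate name report_stacks/1 is invented for this example.

```prolog
% report_stacks(+Label): force a GC, then print current stack usage.
report_stacks(Label) :-
    garbage_collect,
    statistics(globalused, Global),
    statistics(localused, Local),
    statistics(trailused, Trail),
    format("~w: global=~D local=~D trail=~D bytes~n",
           [Label, Global, Local, Trail]).
```

Calling it once after deleting the model and once after loading it, for example `report_stacks(after_load)`, gives numbers that can be compared across runs.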

As best I can tell, there are no major choice point leaks. I’ve long been aware that they are a performance issue and keep them on a short leash.

Didn’t know it either.

When it makes sense to you, could you post back here with what you learned, for the rest of us poor souls? :)

Thanks,
Eric

SWI windows. Debug menu. Thread Monitor. Run your program, watch the lines. As far as I know that’s all there is.

The right pane shows the running threads, and there is a right-click menu to perform various actions on the threads.

Sorry, late to the party, but

https://www.swi-prolog.org/pldoc/doc_for?object=prolog_ide/1

or from the menus in emacs/0.
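
For reference, the monitor can also be opened from code through prolog_ide/1 with the `thread_monitor` action. This needs a SWI-Prolog build with the graphical (xpce) tools; the wrapper name below is just for illustration.

```prolog
% show_thread_monitor: open the graphical thread monitor.
% Requires the xpce-based development tools and a display.
show_thread_monitor :-
    prolog_ide(thread_monitor).
```

At the toplevel, `?- prolog_ide(thread_monitor).` does the same thing directly.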