(I hope I didn’t misunderstand the conversation)
The aggregate predicate family is useful, but at the moment it is not generic enough. The solution to single-pass aggregation (very large files, streams) that seems obvious to me is as follows:
- Create a predicate that backtracks over the solutions;
- Use a non-backtrackable data structure for aggregation.
For the second one, I have used library(nb_rbtrees) (on a tree created with library(rbtrees)) with this tiny bit of code:
```prolog
%!  nbdict_apply(+Tree, +Key, :Pred, +Init)
%
%   Destructively update the value under Key in a non-backtrackable
%   red-black tree: apply Pred to the current value, or insert Init
%   if Key is not yet present.
nbdict_apply(X, Key, Pred, Init) :-
    (   nb_rb_get_node(X, Key, Node)
    ->  nb_rb_node_value(Node, Val0),
        call(Pred, Val0, Val1),
        nb_rb_set_node_value(Node, Val1)   % in-place, survives backtracking
    ;   nb_rb_insert(X, Key, Init)
    ).
```
This will either insert the default Init value or apply Pred to the existing value associated with Key. This post by Jan W discusses the computational complexity. It also suggests an obvious and better way to count word frequencies, one that stops working once your input is big enough. I guess your question refers in part to that post?
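I am reconstructing from memory, but the "obvious" way presumably looks something like the sketch below: materialize every word with findall/3, sort, and run-length encode. (word_freqs/2 is my own name for it, and clumped/2 needs a reasonably recent SWI-Prolog.) The problem is plain to see: findall/3 builds the complete list of words in memory before any counting happens.

```prolog
% Sketch of the findall-based approach (my reconstruction, not a
% quote from that post):
word_freqs(File, Freqs) :-
    findall(Word, file_word(File, Word), Words),  % all words, in memory
    msort(Words, Sorted),                         % sort, keep duplicates
    clumped(Sorted, Freqs).                       % Freqs = [Word-Count, ...]
```

The non-backtrackable tree avoids that intermediate list altogether.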
If you do it with the non-backtrackable tree, you can simply use forall/2. If we assume you have defined a file_word/2 predicate that succeeds once for every word in a file (a possible definition is sketched after the code), you would do:
```prolog
rb_empty(Freqs),
forall(file_word(File, Word),
       nbdict_apply(Freqs, Word, succ, 1)),   % insert 1 or increment
forall(rb_in(Key, Val, Freqs),
       format("~w ~w~n", [Key, Val]))
```
But this is still not optimal: you need to hand-roll both the backtracking predicate and the non-backtrackable accumulator for anything non-trivial. Do you have a better idea?