More efficient group_by/4 for SWI-Prolog?

Just noticed that group_by/4 calculates variables and then
delegates to bagof/3. But the later predicate calculates also
varables, so I suspect quite an overhead:

/* SWI-Prolog 9.3.19 */
group_by(By, Template, Goal, Bag) :-
    ordered_term_variables(Goal, GVars),
    ordered_term_variables(By+Template, UVars),
    ord_subtract(GVars, UVars, ExVars),
    bagof(Template, ExVars^Goal, Bag).

I went with another soluton. First I provided a variant of aggregate/3
by the name aggregate_by/4 where one can offload the internal
term_variables/2 calculation. Then use this bootstrapping:

/* Dogelog Player 1.3.0 */
group_by(Witness, Template, Goal, List) :-
   aggregate_by(Witness, bag(Template), Goal, List).

Here is some testing:

/* SWI-Prolog 9.3.19 */
?- length(_H,4000), time((between(1,2000,_), 
        group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L), 
        fail; true)).
% 1,153,998 inferences, 0.562 CPU in 0.568 seconds (99% CPU, 2051552 Lips)
true.
?- length(_H,8000), time((between(1,2000,_), 
        group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L), 
        fail; true)).
% 1,153,998 inferences, 1.047 CPU in 1.060 seconds (99% CPU, 1102326 Lips)
true.

/* Dogelog Player 1.3.0 */
?- length(_H,4000), time((between(1,2000,_), 
        group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L), 
        fail; true)).
% Zeit 399 ms, GC 0 ms, Lips 16987636, Uhr 10.02.2025 10:49
true.
?- length(_H,8000), time((between(1,2000,_), 
        group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L), 
        fail; true)).
% Zeit 400 ms, GC 1 ms, Lips 16945167, Uhr 10.02.2025 10:50
true.

The old version suffers from some term_variables/2 dependency
whereas the new version is totally immune on the size of the
given goal, since any internal term_variables/2 has been offloaded.

I couldn’t name aggregate_by/4 as aggregate/4, since the later
already exists in SWI-Prolog and SICStus Prolog and has a different
semantics, it is not the analog of distinct/2, where one can specify Witnesses.

I was exploring group_by/4 respectively a new aggregate_by/4 for
some machine learning statistics. But the slowdown is not that
aggravated if the extra parameter _H is ground. The cost is smaller

factor then. So I had a change of mind in favor of more declarative
aggregate/3. Too much new predicates isn’t healthy so I dismissed
the idea of supporting distinct/2, group_by/4 and a new aggregate_by/4.

But somehow I fell in love with the idea of a new firstof/2 predicate,
instead of distinct/2, it could be bootstrapped as follows:

firstof(X, Q) :-
   bagof(X, Q, L),
   L = [X|_].

Except it can be implemented like distinct/2 more eagerly
fitting to some other predicates from the library(sequence):

p(1,a).
p(1,b).
p(2,c).
p(2,d).
p(2,e).

?- firstof(Y,p(X,Y)).
Y = a, X = 1;
Y = c, X = 2;
fail.

You are slowly moving in the general direction of SQL window functions. The syntax in SQL is absurd but the features are useful.

SQL window functions are sometimes a short hand
for subexpressions that are SQL queries. For example
this here with SQL window functions:

SELECT depname, empno, salary, avg(salary) OVER 
(PARTITION BY depname) FROM empsalary;

Is the same as this here without SQL window functions:

SELECT depname, empno, salary, 
       (SELECT AVG(salary) FROM empsalary e2 WHERE 
e2.depname = e1.depname) AS avg_salary
FROM empsalary e1;

There is no GROUP BY in the SQL solution, and a SWI-Prolog
translation does neither need group_by/4, you can do it with the
de facto library(aggregate) in Prolog:

?- empsalary(Depname, Empno, Salary),
    aggregate_all((count,sum(_Salary)), 
        empsalary(Depname, _, _Salary),
                      (_Count, _Sum)),
    Avg_Salary is _Sum/_Count.

So I rolled back my group_by/4 introduction, and instead
I introduced something else in my Prolog system. Namely
marked variables in queries and in Prolog text consulting.

So that one can emphasize temporary variables such as
_Count and _Sum, and they don’t clutter answer substitutions.
Marked variables do not yet work in Scryer Prolog ,

but they work in Trealla Prolog as well.

Things are totally different in SQL than in Prolog systems, since
SQL has sorted indexes, already implied in CREATE INDEX, you
have to opt-out via USING hash . But back to Prolog systems,

you can try this test data:

empsalary(develop, 11, 5200).
empsalary(develop, 7, 4200).
empsalary(develop, 9, 4500).
empsalary(develop, 8, 6000).
empsalary(develop, 10, 5200).
empsalary(personnel, 5, 3500).
empsalary(personnel, 2, 3900).
empsalary(sales, 3, 4800).
empsalary(sales, 1, 5000).
empsalary(sales, 4, 4800).

You can check the query:

But you can also write another solution where you first do a
aggregate/3 and then empsalary/3, like for example. But a Prolog
system does usually not go into the pain of doing the aggregate/3

eagerly window style, like using some Depname first argument
indexing. Eagerly window style does already not work since by
definition aggregate/3 is bootstrapped from bagof/3 so it also

involves keysorting and will also sort Depname. But Prolog first
argument indexing is hardly sorted in most Prolog systems. So
aggregate/3 will have all Depnames in memory and not one by one:

?- aggregate((count,sum(_Salary)), 
        _Empno^empsalary(Depname, _Empno, _Salary),
                      (_Count, _Sum)),
    Avg_Salary is _Sum/_Count,
    empsalary(Depname, Empno, Salary).

I really start likeing marked variables, but I am little bit
overdoing it. If the top-level supresses unbound variables,
you don’t need to mark _Salary and _Empno. But _Empno

is quantified by (^)/2, which you cannot do with an annoynmous
variable, so marked variables are quite handy here. The
alternative solution with my suggested aggregate_by/4 can

be easily derived, but it doesn’t give something more readable,
except that the 1st argument aggregate_by/4 could be read off
from a corresponding PARTITION BY SQL clause.

But even with aggregate_by/4 the 2nd empsalary/3 call would
still be outside the aggregate_by/4 call, since the conceived
aggregate_by/4 DOES NOT give a full OVER SQL clause.