DCG optimisation and non-optimisation with multiple accumulators

peter.ludemann · October 8, 2023, 5:55pm

I modified @jan 's proposal for DCGs with multiple accumulators (Dealing with state), to use records (see library(record)) instead of dicts. Initially, this didn’t speed things up; but when I added some expansion rules to inline the record accessors, I got a nice speed-up. (My goal expansion code is at the end of this post).

However, there was one expansion that looked non-optimal:

dx -->
    value(Prev)<prv,  % Expanded inline
    { Prev \= d },
    [i]<ops.

expanded to this:

dx(#(A, B, Prev, C, D, E), #(F, G, H, I, J, K)) :-
    Prev \= d,
    #(F, [i|G], H, I, J, K)= #(A, B, Prev, C, D, E).

instead of this:

dx(#(A, [i|B], Prev, C, D, E), #(A, B, Prev, C, D, E)) :-
    Prev \= d.

This was easily fixed (I thought) by changing boot/dcg.pl line 125 from

dcg_body({T}, P0, Q, S, SR, (QT, SR = S), P) :-
    !,
    dcg_bt_pos(P0, P1),
    qualify(Q, T, P1, QT, P).

to

dcg_body({T}, P0, Q, S, SR, QT, P) :-
    !,
    dcg_bt_pos(P0, P1),
    SR = S,
    qualify(Q, T, P1, QT, P).

and it did what I expected for my code.

But it failed the unit tests:

ac --> {!}.
ac --> [_].

bx --> {\+ throw(executed)}.

which should have expanded to

ac(A, B) :-
    !,
    B=A.
ac([_|A], A).


bx(A, B) :-
    \+ throw(executed),
    B=A.

but my “optimisation” changed these to the non-steadfast:

ac(A, A) :-
    !.
ac([_|A], A).

bx(A, A) :-
    \+ throw(executed).

This lack of inline expansion of (=)/2 isn’t a problem with regular DCGs, because they generate a simple equality that the compiler can optimise away. But with the multi-accumulator expansion, the equality becomes something like

#(F, [i|G], H, I, J, K)= #(A, B, Prev, C, D, E).

The generated code constructs the two terms and unifies them, which is somewhat slower than the equivalent

F=A,
[i|G]=B,
H=Prev,
I=C,
J=D,
K=E

I have a work-around: replace value(Prev)<prv,{Prev\=d} with not_value(d)<prv (and an appropriate goal expansion) but that more-or-less requires writing a DCG version of every non-DCG goal.

So, I can see three solutions:

Make the compiler smarter in handling goals such as f(A,B)=f(C,D)
Change the DCG expansion to look inside the body for potentially dangerous items (cuts, calls, etc.) and inlining the (=)/2 only if those aren’t present
A new form of {...} (maybe {{...}}) that promises there’s nothing dangerous inside the {{...}.

Option #1 is probably the best because it would also benefit other situations. But I don’t know my way around the compiler – @jan please tell me where to look.

Goal expansion code

goal_expansion(<(On, Name,State0,State), Goal) :-
    expand_record(On, Name, State0, State, Goal).

% expand_record/2 uses record expansion instead of a dict, for faster
% performance. There must be a dcg_record_name/1 fact and an
% appropriate `:- record` directive using the same name.
expand_record(Literal,Name,State0,State, Goal), is_list(Literal) =>
    get_set_record(Name, State0, List, State, Tail),
    append(Literal, Tail, List),
    Goal = true.
expand_record(String,Name,State0,State, Goal), string(String) =>
    get_set_record(Name, State0, List, State, Tail),
    string_codes(String, Literal),
    append(Literal, Tail, List),
    Goal = true.
expand_record(value(Value),Name,State0,State, Goal) =>
    get_set_record(Name, State0, V0, State, V),
    Value = V0,
    V0 = V,
    Goal = true.
expand_record(set_value(Value),Name,State0,State, Goal) =>
    get_set_record(Name, State0, _V0, State, V),
    Goal = (V = Value).
expand_record(not_value(NotValue),Name,State0,State, Goal) =>
    get_set_record(Name, State0, V0, State, V),
    V0 = V,
    Goal = (NotValue \= V0).
expand_record(incr,Name,State0,State, Goal) =>
    get_set_record(Name, State0, V0, State, V),
    Goal = (V is V0 + 1).
expand_record(decr,Name,State0,State, Goal) =>
    get_set_record(Name, State0, V0, State, V),
    Goal = (V0 > 0, V is V0 - 1).
expand_record(Step,Name,State0,State, Goal), callable(Step) =>
    extend_goal(Step, [V0,V], StepEx),
    get_set_record(Name, State0, V0, State, V),
    Goal = StepEx.

dcg_record_name('#').

:- record '#'( % must match dcg_record_name/1
               acc,             % accumulator
               ops,             % list of opcodes
               prv,             % previous opcode
               out,             % result of running the opcodes,
               num, % the number to be output (for limiting search space)
               nsq  % see deadfish/3 (for limiting search space)
             ).

jan · October 9, 2023, 7:29am

compileBodyUnify() in pl-comp.c. As is, it only deals with Var = Something and notably avoids real unification if Var is fresh (first in the jargon of the code). You can of course also do this expansion at the Prolog side. I’m not convinced it always wins because the compiled unification is not very efficient due to the need for debugger and decompiler support.

(2) could be worth considering.

jan · October 9, 2023, 8:38am

As a more general remark, this touches on Prolog source → source transformation that should perform simple inlining, moving (notably) unifications around, etc.

On the other hand, possibly we can reach a bigger gain with the same effort with a careful look at the VM instruction set and its implementation. See results on arithmetic. There are more areas with low-hanging fruit.

peter.ludemann · October 10, 2023, 2:04am

Off the top of my head, I can think of these as potentially dangerous:

cut (!)
throw/1
call/1

So, if none of these are inside the {...}, then it would be safe to apply my optimisation?

On a related note, I took a look at compileBodyUnify() and decided that modifying it was likely more work than I want to do right now, partly because the code is in C and partly because I could just add a goal_expansion/2 that handles the particular situation … except when I tried adding a goal_expansion/2 rule, it didn’t work – it appears that the optimisation that results in the goal #(F, [i|G],H,I,J,K)=#(A,B,Prev,C,D,E) happens after all the expansion is done.

The Aquarius compiler was written in Prolog (that’s why Peter Van Roy invented the EDCG notation); I wonder if we could add a layer to SWI-Prolog that’s written in Prolog, to allow more extensive optimisations?

Topic		Replies	Views
Term_expansion, goal_expansion, and peers General discussion	3	235	January 12, 2024
Term_expansion; order of execution with DCG Help!	4	439	March 21, 2021
Multi-pass compilation? (for term expansion) Tools	14	960	April 23, 2020
How to get to a fixed point with goal expansion? Help!	18	1264	April 28, 2019
Prolog inline predicates Split Topic discussion	28	2325	February 28, 2023

DCG optimisation and non-optimisation with multiple accumulators

Related topics