Hello all,
I have finally managed to cleanup an implementation of E-Graph, this time using prolog variables for class ids.
For a bit of context about egraph, I have recently made another topic explaining that, although that approach was a dead-end.
Implementation
The idea behind this new implementation is to represent an E-Graph as an ordset of pairs, where each pair is a Node-ClassId.
A Node is either:
- a prolog non-compound term (atom, number, string, variable).
- or a compound term with its argument replaced with the class id of the arguments like this:
A+BwhereAwould be the class id of the first argument, etc
Usage
To use the e-graph, you would use it in three steps:
add_term(Term, Id): convert your prolog term into an e-graph (as an ord_list of pairs as explained above). Don’t loose thatId, it will become the rewritten term.saturate(Rules, Budget): saturate the e-graph given a set of rewrite rules. The budget is used to limit saturation iterations.extract: extract the lowest cost term. TheIdproduced inadd_termwill be instantiated with the rewritten term
These three predicates are actually DCG predicates, so that you can thread the egraph through easily.
Rewrite Rules
I have currently implemented the following rewrite rules for addition and multiplication:
- comm:
A+B => B+A,A*B => B*A - assoc:
A+(B+C) => (A+B)+C,A*(B*C) => (A*B)*C - reduce:
A+0 => A,A*1 => A,A*0 => 0 - factorize:
A+A => A*2,A+B*A => A*(B+1) - constant_folding:
1+2 => 3,1*2 => 2for any constants
Cost
I have implemented a naive cost system where:
- a non-compound term has a cost of 1
- compound term has a cost of the sum of its argument cost + 1
- cost of a factorized node is slightly lower (0.9) to always prefer it from its non-factorized form
Example
% with this, we can create an egraph from a prolog term, saturate and extract
% the lowest cost term
example3(N, Expr, R) :-
distinct(R, phrase((
add_term(Expr, R),
saturate([comm, assoc, factorize1, factorize2, constant_folding], N),
saturate([reduce], N), % reduce with comm+assoc+factorize explodes
extract
), [], _)).
% commutativity + associativity
?- egraph:example3(inf, a+b+c, R).
R = a+b+c ;
R = b+a+c ;
...
% Constant folding
?- egraph:example3(inf, 1+2, R).
R = 3 ;
R = 1+2 .
% factorization
?- egraph:example3(inf, a+a+b+2*a, R).
R = b+4*a .
% reduction
?- egraph:example3(inf, a+0, R).
R = a ;
R = 0+R ;
R = R+0.
Performances
I can do full equality saturation for 10 additions with commutativity + association rewrite rules in ~30 seconds.
The blog post on doing egraph with chr reports doing 6 additions in 40 seconds and can’t do 7 and says that a rust implementation of egraph, egglog, can do 10 in less than a second.
So, it is better than the chr implementation, but still far from the rust implementation.
Although this is still highly experimental and buggy, please try and test it and let me know what you think.
If anybody express some interests, I’ll make a pack out of it.
Notes
The design choice of using prolog variables for class ids makes congruence closure much simpler.
However, the implementation of rewrite rules is a real pain.
For this implementation to become practical, I believe we need a rewrite rule compiler to make writing rewrite rules much simpler…