This is a very long post.
I am implementing a Prolog program whose purpose is proving that a given algebraic expression hold. A problem input to the program is an algebraic expression together with a set of constraints between the variables in the algebraic expression. For example, the following – in non Prolog format – is a description of the problem:
Prove that: a+b=(1/2)*c knowing that: e=a+d=(1/2)c, b=d=f.
The proof could be:
a+b = e-d+b [because e=a+d]
a+b = e-d+f [because b=f]
a+b = (1/2)*c-d+f [because e=(1/2)*c]
a+b = (1/2)*c-d+d [because d=f]
a+b = (1/2)*c [because forall x in R: x-x=0]
Showing proof work is important, and therefore using a global technique such as constraint-logic programming over real numbers is not appropriate in this case.
I have worked out something that produces a proof similar to the above, but I am facing difficulties.
Mainly due to the lack of a solid Prolog background, I am using a rather cumbersome encoding for the terms in the algebraic expression. For example, the term
a is represented as the list
[Sign, Coefficient,EntityName]), the term
(1/2)c is represented as
[+,0.5,c], and the algebraic expression
a+b=(1/2)*c is represented as the list
[ [ [+,1,a], [+,1,b] ], [ [+,0.5,c] ] ] whose first element is a list of terms in the LHS of the algebraic expression, and second element is a list of terms in the RHS of the algebraic expression. So far I only consider linear algebraic expressions.
The constraints are represented as predicates
constraint([ [+,1,e] ], [ [+,1,a], [+,1,d] ], ['[Reason: Given]']). % for e=a+d
constraint([ [+,1,a] ], [ [+,1,e], [-,1,d] ], ['[Reason: Given]']). % for e=a+d
constraint([ [+,1,d] ], [ [+,1,e], [-,1,a] ], ['[Reason: Given]']). % for e=a+d
constraint([ [+,1,e] ], [ [+,0.5,c] ], ['Given']). % for e=(1/2)c
… and so on
where arguments 1 and 2 specify the LHS and RHS of the constraint expression, and argument 3 shows the reason for why the equality holds, in this case, it is given. Constraints are not always given, as they can be fed from a previous work performed by another Prolog program.
Then, I have
substitute_term_in_equation/4 predicate that manipulates the LHS to produce the RHS either by replacing the term in the expression by one from
constraint/3 or by performing simplifications e.g. the following call to
substitute_term_in_equation/4 modifies the expression
substitute_term_in_equation(LeftHandSide, RightHandSide, Output_In, Output_Out):- member(Term1, LeftHandSide), Term1 = [+,A,Entity], member(Term2, RightHandSide),Term2 = [-,B,Entity], NewFactor is A + B, ( ( NewFactor < 0, NewTerm = [-, NewFactor, Entity] ); ( NewFactor >= 0, NewTerm = [+, NewFactor, Entity] ) ), delete(RightHandSide, Term2, NewRightHandSide), delete(LeftHandSide, Term1, LeftHandSideTemp), append(LeftHandSideTemp, [NewTerm], NewLeftHandSide), StepOutput = [NewLeftHandSide, ' = ', NewRightHandSize, '\n\t\t[Reason: Adding ', Term1, ' to both sides of the equation]', nl], append(StepOutput, Output_In, Output_Interim), substitute_term_in_equation(NewLeftHandSide, NewRightHandSize, Output_Interim, Output_Out).
I have a limited subset of algebraic-expression manipulations that includes dividing all terms in LHS and RHS by a constant, removing terms that cancel each other, etc. However, I do not know if my set is sufficient to handle all algebraic expression proofs; and I have no idea how to prove that.
The output variables allow me to “backtrack” over output during proof. At the end of the program I will write the content of the final
Output_Out that contains the ordered proof steps.
One major issue I suffer from is that algebraic expressions can grow very large especially that I use multiple clauses for certain constraints. For example, the constraint
e=a+d is represented by:
constraint([ [+,1,e] ], [ [+,1,a], [+,1,d] ], ['Given']). % for e=a+d
constraint([ [+,1,a] ], [ [+,1,e], [-,1,d] ], ['Given']). % for e=a+d
constraint([ [+,1,d] ], [ [+,1,e], [-,1,a] ], ['Given']). % for e=a+d
which could lead to an infinite expansion:
e = a+d = (e-d)+d = ((a+d)-d)+d = ...
My hack so far is to only allow one clause of a given constraint to be used i.e. once one clause is used to substitute the term in an expression, the other clauses that represent the same constraint are never used.
This is my main topic: I don’t like this as it no longer feels like logic programming and also it probably will fail at some point if more than one clause that represent the constraint is needed in the proof. Is there a more natural way of doing this kind of algebra where proof steps are needed to be output to the user?