Improving Beckert's & Posegga's in Prolog

Joseph Vidal Rosset expressed some regret that his Prolog capabilities are limited. Maybe we can pool on others Prolog programming capabilities. So here is the callenge, provide Beckert & Posegga in Prolog.

Beckert & Posegga is usually pitched as being the shortest first order logic prover around. But there is a catch, namely it needs a preparation step, the formula needs to be braught in a special form before it can be tried by the prover.

So in the end the prover will not only be the epic few lines from here, but a little more:
Unbenannt
https://formal.iti.kit.edu/beckert/pub/LeanTAP.pdf

Anybody up to the challenge, including running the Pelletier’s problems.
And how about display found proofs?

Edit 10.09.2020:
Cross posted on Stackoverflow:
https://stackoverflow.com/q/63798877/502187

3 Likes

Hello,
At the moment, the best performances of a Prolog prover for Classical Propositional Logic are given by @j4n_bur53’s prover below:

:- op( 500, fy, ~).     % negation
:- op(1000, xfy, &).    % conjunction
:- op(1100, xfy, '|').  % disjunction
:- op(1110, xfy, =>).   % conditional
:- op(1120, xfy, <=>).  % biconditional
:- op( 500, fy, !).     % universal quantifier:  ![X]:
:- op( 500, fy, ?).     % existential quantifier:  ?[X]:
:- op( 500,xfy, :).



%--------------------------------------------------------------

prove0(F, P) :- time(prove([]>[F],P)).

provable(F,W):-
      time(prove([]>[F],P)),
      term_lower_upper(P,J),
       nl,
       write('Prolog output : '), write(P),
       nl,
       nl,
       latex(J,W),!.
% -----------------------------------------------------------------


prove(G>[(A=>B)|D], rcond(G>[(A=>B)|D],P)):- !,
        prove(G>[~A,B|D],P).
prove(G>[(A|B)|D], ror(G>[(A|B)|D], P)):- !,
        prove(G>[A,B|D],P).
/* double negation */
prove(G>[~ ~A|D], rneg(G>[~ ~A|D],P)):- !,
        prove(G>[A|D],P).

prove(G>[~ (A&B)|D], land(G>[~ (A&B)|D],P)):- !,
        prove(G>[~A,~B|D],P).

prove(G>[(A<=>B)|D], rbicond(G>[(A<=>B)|D],P1,P2)):- !,
        prove(G>[~A,B|D],P1),
        prove(G>[A,~B|D], P2).
prove(G>[~ (A<=>B)|D], lbicond(G>[~ (A<=>B)|D], P1,P2)):- !,
        prove(G>[~A,~B|D],P1),
        prove(G>[A,B|D],P2).
prove(G>[~ (A=>B)|D], lcond(G>[~ (A=>B)|D],P1,P2)):- !,
        prove(G>[A|D],P1),
        prove(G>[~B|D],P2).
prove(G>[(A&B)|D], rand(G>[(A&B)|D],P1,P2)):- !,
        prove(G>[A|D],P1),
        prove(G>[B|D],P2).
prove(G>[~ (A|B)|D], lor(G>[~ (A|B)|D], P1,P2)):- !,
        prove(G>[~A|D],P1),
        prove(G>[~B|D],P2).

prove(G>[A|D], ax(G>[A|D], A)):-
        member(B,G), A==B, !.

/* next */

prove(G>[~A|D], lneg(G>[~A|D], P)) :- !,
        prove([A|G]>D,P).

prove(G>[A|D], lneg(G>[A|D], P)) :- !,
        prove([~A|G]>D,P).

prove(G>[], asq(G>[], asq)):-
         write("Unprovable formula ! ").

member(E, [E|_]).
member(E, [_|Xs]) :-
        member(E, Xs).

term_lower_upper(Lower,Upper) :-
    (   compound(Lower)
    ->  mapargs(term_lower_upper,Lower,Upper)
    ;   is_list(Lower)
    ->  maplist(term_lower_upper,Lower,Upper)
    ;   var(Lower)
    ->  Upper = Lower
    ;   atomic(Lower) % this should be useless
    ->  upcase_atom(Lower,Upper)
    ;   throw(term_lower_upper(Lower,Upper))
    ).



/******************************************************************/
/* Pretty Printing                                                */
/******************************************************************/


atom_split(X, D, L) :-
    atomic_list_concat(L, D, X).

latex(H, J) :-
    latex_root(H, J, L, []),
    atom_split(Y, '', L),
    write(Y),nl.

latex_root(P, J) --> ['\\begin{prooftree}'], latex_proof(P, J), ['\\end{prooftree}'].

latex_proof(ax(S,U), J) --> ['\\RightLabel{$Ax.$} \\AxiomC{} \\UnaryInfC{'],
   latex_sequent(S, ax(S,U), J), ['}'].
latex_proof(rcond(S,P), J) --> latex_proof(P, J),
   ['\\RightLabel{$R\\to$} \\UnaryInfC{'],
   latex_sequent(S, rcond(S,P), J), ['}'].
latex_proof(land(S,P), J) --> latex_proof(P, J),
   ['\\RightLabel{$L\\land$} \\UnaryInfC{'],
   latex_sequent(S, land(S,P), J), ['}'].
latex_proof(ror(S,P), J) --> latex_proof(P, J),
   ['\\RightLabel{$R\\lor$} \\UnaryInfC{'],
   latex_sequent(S, ror(S,P), J), ['}'].
latex_proof(lneg(S,P), J) --> latex_proof(P, J),
   ['\\RightLabel{$L\\neg$} \\UnaryInfC{'],
   latex_sequent(S, lneg(S,P), J), ['}'].
latex_proof(rneg(S,P), J) --> latex_proof(P, J),
   ['\\RightLabel{$R\\neg$} \\UnaryInfC{'],
   latex_sequent(S, rneg(S,P), J), ['}'].
latex_proof(rand(S,P,Q), J) --> latex_proof(P, J), latex_proof(Q, J),
   ['\\RightLabel{$R\\land$} \\BinaryInfC{'],
   latex_sequent(S, rand(S,P,Q), J), ['}'].
latex_proof(lor(S,P,Q), J) --> latex_proof(P, J), latex_proof(Q, J),
   ['\\RightLabel{$L\\lor$} \\BinaryInfC{'],
   latex_sequent(S, lor(S,P,Q), J), ['}'].
latex_proof(lcond(S,P,Q), J) --> latex_proof(P, J), latex_proof(Q, J),
   ['\\RightLabel{$L\\to$} \\BinaryInfC{'],
   latex_sequent(S, lcond(S,P,Q), J), ['}'].
latex_proof(lbicond(S,P,Q), J) --> latex_proof(P, J), latex_proof(Q, J),
   ['\\RightLabel{$R\\leftrightarrow$} \\BinaryInfC{'],
   latex_sequent(S, lbicond(S,P,Q), J), ['}'].
latex_proof(rbicond(S,P,Q), J) --> latex_proof(P, J), latex_proof(Q, J),
   ['\\RightLabel{$R\\leftrightarrow$} \\BinaryInfC{'],
   latex_sequent(S, rbicond(S,P,Q), J), ['}'].
latex_proof(asq(S,U), J) --> ['\\RightLabel{$Asq.$} \\AxiomC{} \\UnaryInfC{'],
   latex_antisequent(S, asq(S,U), J), ['}'].




latex_sequent(G > D, P, J) --> ['$ '], latex_list(G, P, left, 0, J), [' \\vdash '], latex_list(D, P, right, 0, J), [' $'].

latex_antisequent(G > D, P, J) --> ['$ '], latex_list(G, P, left, 0, J), [' \\nvdash '], latex_list(D, P, right, 0, J), [' $'].

latex_list([], _, _, _, _) --> [].
% latex_list([X|L], P, F, N, J) --> {has_usage(P, F, N)}, !, latex_formula(X, J), {M is N+1}, latex_rest(L, P, F, M, J).
latex_list([X|L], P, F, N, J) --> latex_formula(X, J), {M is N+1}, latex_rest(L, P, F, M, J).
latex_list([_|L], P, F, N, J) --> {M is N+1}, latex_list(L, P, F, M, J).

latex_rest([], _, _, _, _) --> [].
% latex_rest([X|L], P, F, N, J) --> {has_usage(P, F, N)}, !, [', '], latex_formula(X, J), {M is N+1}, latex_rest(L, P, F, M, J).
latex_rest([X|L], P, F, N, J) --> [', '], latex_formula(X, J), {M is N+1}, latex_rest(L, P, F, M, J).
latex_rest([_|L], P, F, N, J) --> {M is N+1}, latex_rest(L, P, F, M, J).

latex_formula(~A, J) --> !, ['\\neg '], latex_term(A, J).
latex_formula((A&B), J) --> !, latex_term(A, J), [' \\land '], latex_term(B, J).
latex_formula((A|B), J) --> !, latex_term(A, J), [' \\lor '], latex_term(B, J).
latex_formula((A=>B), J) --> !, latex_term(A, J), [' \\to '], latex_term(B, J).
latex_formula((A<=>B), J) --> !, latex_term(A, J), [' \\leftrightarrow '], latex_term(B, J).
% latex_formula(![X], J) --> !, ['\\forall '], latex_term(X, J).
% latex_formula(?[Y], J) --> !, ['\\exists '], latex_term(Y, J).
latex_formula(X, J) --> latex_factor(X, J).


latex_term(~A, J) --> !, ['\\neg '], latex_term(A, J).
latex_term((A&B), J) --> !, ['('], latex_term(A, J), [' \\land '], latex_term(B, J), [')'].
latex_term((A|B), J) --> !, ['('], latex_term(A, J), [' \\lor '], latex_term(B, J), [')'].
latex_term((A=>B), J) --> !, ['('], latex_term(A, J), [' \\to '], latex_term(B, J), [')'].
latex_term((A<=>B), J) --> !, ['('], latex_term(A, J), [' \\leftrightarrow '], latex_term(B, J), [')'].
% latex_term(![X], J) --> !, ['\\forall '], latex_term(X, J).
% latex_term(?[Y], J) --> !, ['\\exists '], latex_term(Y, J).
latex_term(X, J) --> latex_factor(X, J).


latex_factor(X, J) --> {var(X), member(W = N, J), W == X}, !, [N].
latex_factor(X, _) --> {var(X)}, !, ['?'].
latex_factor(X, J) --> {X=..[F,Y|L]}, !, [F, '('], latex_factor(Y, J), latex_args(L, J), [')'].
latex_factor(X, _) --> [X].

latex_args([], _) --> [].
latex_args([X|L], J) --> [','], latex_factor(X, J), latex_args(L, J).

For Pelletier Problem 71, here is the result:

 % 8,814,591 inferences, 1.321 CPU in 1.321 seconds (100% CPU, 6671014 Lips)

Note that prove0 must be used and not the provable query which provides LaTeX output.
leanTAP does not succeed to prove this formula.

@j4n_bur53 : what is the name of this prover?

I am convinced. Thanks !

For Pelletier Problem 71, my zdd based resolution prover takes
30,772 inferences, 0.002 CPU in 0.002 seconds (100% CPU, 14528801 Lips).

I may be missing something as often I am.

valid_formula( (a * b) + (a * -b) + (-a * b) + (-a * -b)).
valid_formula((-(-a) -> a) * (a -> -(-a))).		% Double Negation
valid_formula( ((a * b) -> -((-a + -b))) * (-((-a + -b)) -> (a * b))). % De Morgan’s Law
valid_formula( (-b -> f) * ((b *f) -> -i) * ((i + -b) -> -f) -> b).
valid_formula(a + -a).
valid_formula(((a -> b) -> a) -> a).  % Peirce’s Law
valid_formula(-(-a)->a).
valid_formula(a -> a).
valid_formula(a -> (b -> a)).
valid_formula((a->b)->((b->c)->(a->c))).
valid_formula(a->((a->b)->b)).
valid_formula((a*(a->b))->b).
valid_formula((a->b)->(-b -> -a)).
valid_formula((a->b)->((b->c)->(a->c))).
valid_formula((((((((((((((((((((((((((((p13 <=> p12) <=> p11) <=> p10)
<=> p9) <=> p8) <=> p7) <=> p6) <=> p5) <=> p4) <=> p3)
<=> p2) <=> p1) <=> p0) <=> p13) <=> p12) <=> p11) <=> p10)
<=> p9) <=> p8) <=> p7) <=> p6) <=> p5) <=> p4) <=> p3)
<=> p2) <=> p1) <=> p0)).

% ?- time(forall(valid_formula(A), zdd((writeln(A), prop_neg_refute(A))))).

%@ a*b+a* -b+ -a*b+ -a* -b
%@ % 228 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 7354839 Lips)
%@ VALID
%@ (- -a->a)*(a-> - -a)
%@ % 22 inferences, 0.000 CPU in 0.000 seconds (76% CPU, 2200000 Lips)
%@ VALID
%@ (a*b-> - (-a+ -b))*(- (-a+ -b)->a*b)
%@ % 94 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 6266667 Lips)
%@ VALID
%@ (-b->f)*(b*f-> -i)*(i+ -b-> -f)->b
%@ % 116 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 6444444 Lips)
%@ VALID
%@ a+ -a
%@ % 22 inferences, 0.000 CPU in 0.000 seconds (74% CPU, 2444444 Lips)
%@ VALID
%@ ((a->b)->a)->a
%@ % 22 inferences, 0.000 CPU in 0.000 seconds (75% CPU, 2444444 Lips)
%@ VALID
%@ - -a->a
%@ % 22 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 956522 Lips)
%@ VALID
%@ a->a
%@ % 22 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 3142857 Lips)
%@ VALID
%@ a->b->a
%@ % 26 inferences, 0.000 CPU in 0.000 seconds (70% CPU, 3714286 Lips)
%@ VALID
%@ (a->b)->(b->c)->a->c
%@ % 172 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 10117647 Lips)
%@ VALID
%@ a->(a->b)->b
%@ % 124 inferences, 0.000 CPU in 0.000 seconds (84% CPU, 8266667 Lips)
%@ VALID
%@ a*(a->b)->b
%@ % 95 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 7307692 Lips)
%@ VALID
%@ (a->b)-> -b-> -a
%@ % 95 inferences, 0.000 CPU in 0.000 seconds (77% CPU, 6785714 Lips)
%@ VALID
%@ (a->b)->(b->c)->a->c
%@ % 172 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 9555556 Lips)
%@ VALID
%@ <=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(<=>(p13,p12),p11),p10),p9),p8),p7),p6),p5),p4),p3),p2),p1),p0),p13),p12),p11),p10),p9),p8),p7),p6),p5),p4),p3),p2),p1),p0)
%@ % 30,772 inferences, 0.002 CPU in 0.002 seconds (100% CPU, 14528801 Lips)
%@ VALID
%@ % 1,557,078 inferences, 0.112 CPU in 0.122 seconds (92% CPU, 13860283 Lips)
%@ true.

There are full FOL provers which do not use of clauses of literals but directly in a sense similarly LK keeping input formulas. Interesting. I heard such research trend but did not pay attention. Now it may be time for me to read papers. Thanks.

(Nothing of Relevance here)

[quote=“j4n_bur53, post:21, topic:2894”]
I doubt that your prover is “resolution” based. Also you didn’t publish any sources.
[/quote

I mean by resolution such an operation on clauses (without unification.)

[a, -b, c], [d, b, e] => [a, c, d, e]

An old source is in my package pac library(pac/zdd.pl) located at the end, which did not use yet term_hash for preprocessing input prop formula into clauses to be refuted. I wrote the resolution based prover just as a sample test codes on the ZDD pac basic operations of mine on sets of clauses of literals. Although I did not intend to invent the “wheel”, my impression was ZDD in general (not mine) is already efficient resolution prover in nature for the ground literals without any extra works. IMHO, the nature comes from its extreme structure sharing for a set of clauses of ground literals.

Sorry I am missing something, but I should say that
I had checked of course that by a simple induction on the number
of ground literals, the algorithm of mine is sound and complete with respect to the semantics of “standard” classical propositional logic, though my proof is only in my mind without pencil and paper.

I thought the word resolution is unique in meaning. I know in LK
it is called “cut.” But anyway, my use of resolution may be too narrow? I should be careful hence after. Thanks.

As ZDD is a general system which handles commands on families of sets, it depends on given kind families of sets and operations. So ZDD itself is not a prover at all. That said, from my experiences, I can not stop shouting that ZDD is might be more useful than one might think for applications which need to handle sets of clauses of literals efficiently.

I don’t remember well but vaguely via proofs on cut elimination theorems in the literature on logics.

Maybe, but I am simply saying such a following situation:

M: a computer algorithm in Prolog.
L: a proof theory ( LK or “clausal sytem”) for proposition.

It is provable on induction that M has a property that for all formula F, 1 and 2 are equivalent:

  1. M halts and output “Yes”
  2. F is derivable in L

I must say no thanks, at least, for now. But I will come back if I will read and find something interesting in leanTAP for me, thanks for the reference. More worse, in fact, it is enough for me to enjoy Prolog programming, and I like to see useful codes of others.

As sat(F(x,y)) is for finding x,y satisfying F(x,y), for me it sounds different problem from prove(forall x, y F(x,y)). i.e.,
sat(F(x,y)) seems prove(exists x,y(F(x,y))). Perhaps I am missing something here also, instead it must read
sat(not (forall x, y F(x,y)) ?