Thank you for sharing links to your work.
I am skeptical that it’s possible to do Resolution with free-form text, such as that generated by a language model (large or small!), so I had a look at your paper ([2306.14077] Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles) to see how you did it. I have to say that I remain skeptical, in particular of the claim that your approach is more than superficially similar to, or perhaps inspired by, SLD-Resolution. For example, your paper claims that:
“Our approach will follow closely the SLD-resolution algorithm for pure Horn Clause logic [13,12].”
I hope that you will welcome my criticism. If so, here are the reasons for my skepticism.
Perhaps the most important reason for my skepticism is that Resolution, including SLD-Resolution, is known to be sound and refutation-complete (or complete with subsumption). We know of no other inference system with these properties that consists of a single rule and is therefore “machine-oriented” (i.e. computable by a simple algorithm). I cannot see where your approach follows “closely the SLD-resolution algorithm”, nor where it provides such guarantees.
To begin by finding some common ground, let’s clarify that “SLD” Resolution means Linear Resolution, restricted to Definite clauses and with a Selection rule. Linear means that each resolvent is derived from the goal produced in the previous resolution step and from a definite program clause or unit clause in the background logic program. A “Selection rule” selects the next literal to resolve upon. In SLD-Resolution as implemented in Prolog, the selection rule is that literals are resolved from left to right, in the order they appear in the body of a clause.
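To make that concrete, here is the textbook “vanilla” meta-interpreter, a minimal sketch of my own (not code from your paper) that exhibits both properties: the leftmost literal of the goal is always selected, and the goal is always resolved against a program clause.

```prolog
% A minimal sketch of SLD-resolution as the classic vanilla
% meta-interpreter (my illustration, not from the paper).
% Assumes the program's predicates are accessible to clause/2.
solve(true) :-
    !.
solve((A, B)) :-          % Selection rule: take the leftmost literal A.
    !,
    solve(A),
    solve(B).
solve(A) :-
    clause(A, Body),      % Linearity: resolve the current goal A against
    solve(Body).          % a program clause A :- Body.
```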
Resolution of course is the Resolution rule, as described by J. Alan Robinson: given the conjunction of two clauses (¬φ ∨ χ) ∧ (φ ∨ ψ), eliminate the contradiction ¬φ ∧ φ and derive the new clause χ ∨ ψ.
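For a concrete propositional instance: from (¬p ∨ q) and (p ∨ r), Resolution derives (q ∨ r). Encoded as a sketch of my own, with clauses as lists of literals and neg/1 marking negation:

```prolog
% One propositional Resolution step (my encoding, not the paper's):
% clauses are lists of literals, neg(A) is the negation of atom A.
resolve(C1, C2, Resolvent) :-
    select(neg(A), C1, Rest1),       % find ¬A in the first clause
    select(A, C2, Rest2),            % and A in the second,
    append(Rest1, Rest2, Resolvent). % then join what remains.

% ?- resolve([neg(p), q], [p, r], R).
% R = [q, r].
```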
It seems that your work eliminates each of those assumptions:
- The Resolution rule is nowhere to be found. It seems replaced by “LLM-driven dynamic clause head creation” (page 2, Introduction).
- Definite clauses are eliminated as “the underlying logic is propositional” (idem).
- The selection rule is eliminated: “SLD-resolution’s clause selection via unification is replaced by LLM-driven dynamic clause head creation” (idem).
- There is no mention of the Linear character of SLD-Resolution.
Careful reading of Sections 4.1 (“The recursive descent”) and 4.5 (“The Logic Programming Connection”) further confirms the above observations. The text claims that the effect of the OR and AND prompters in expanding a head h (not defined earlier in the text) “can be described” as sets of Horn clauses. But why is that the case?
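Reading the claim charitably, I can guess at the intended correspondence (this is my reconstruction, not text from the paper): an OR expansion of h into alternatives h1 and h2 would yield one clause per alternative, while an AND expansion would yield a single clause with a conjunctive body. But the paper does not spell this out, let alone prove it.

```prolog
% My reconstruction of the claimed correspondence (not from the paper).
% OR-expansion of h into h1 and h2: one clause per alternative.
h :- h1.
h :- h2.
% AND-expansion of h into h1 and h2: one clause, conjunctive body.
h :- h1, h2.
```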
I also do not understand the following claim (from the abstract):
“At the end, the unique minimal model of a generated Horn Clause program collects the results of the reasoning process.”
A “unique minimal model of a generated Horn Clause program” must refer to the Least Herbrand Model (LHM) of a logic program, i.e. if Π is a logic program, the set of ground atoms A such that Π ∪ {← A} ⊢ □. But, instead of such a set of atoms, your system is said to return “traces of justification steps clearly explaining where they are derived from”. Where, then, is the LHM generated?
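To be clear about what I would expect, here is a toy example of my own (nothing to do with the paper): for the two-clause program below, the LHM is {q, p}, because the goals ?- q. and ?- p. both have SLD-refutations.

```prolog
% A toy program Π whose Least Herbrand Model is {q, p}:
q.
p :- q.
% Both ?- q. and ?- p. succeed, i.e. Π ∪ {← q} ⊢ □ and Π ∪ {← p} ⊢ □.
```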
Perusal of the (few) empirical results in the Appendix reveals that the derived atoms are natural language sentences surrounded by single quotes, thus capable of being parsed as Prolog “facts” by a Prolog interpreter, for example:
```prolog
'Symbolic representation learning'.
'Knowledge-based systems'.
'Automated theorem proving'.
'First-order logic inference'.
```
These are, I suppose, propositions, so they can be seen as ground atoms in the Herbrand base of a predicate. But which predicate is that? And what is achieved by “atomising” natural language sentences like that?
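As far as I can tell, all a Prolog system can do with such a “fact” is treat the entire sentence as one opaque constant, as this query (a check of my own; any ISO Prolog should agree) illustrates:

```prolog
% Each quoted sentence parses as a single atom of arity 0:
% a constant with no internal logical structure to resolve upon.
?- atom('First-order logic inference').
true.
```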
I suspect these are criticisms that have been levelled at your work before, and I’m curious what responses you have formulated.