Prolog in the age of generative AI

Hi Stassa,

Thanks for the careful reading of the draft paper, which would definitely benefit from a few clarifications.

As a TL;DR, the code is now fully deployed at https://deepllm.streamlit.app and playing with it will clearly show the generated Prolog program and its unique minimal model (i.e., Least Herbrand Model) as outputs for an initiator goal of your choice.

Let me just focus here on, and zoom into, the SLD-inspiration claim in the paper.

The analogy with SLD-resolution becomes more obvious if you see it expressed (equivalently!) via repeated unfoldings as in:

a0:-a1,a2,...,an.     +
    b0:-b1,...,bm.
-----------------------
a0:-b1,...,bm,a2,...,an.

assuming that a1 and b0 unify, which in the propositional case simply means a1 == b0, with no substitutions to be applied.
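The unfolding step above can be sketched as follows (a minimal illustration, with clauses represented as (head, body-list) pairs; the representation and the function name are my own, not DeepLLM's code):

```python
def unfold(clause, other):
    """Resolve the first body atom of `clause` against the head of `other`.

    In the propositional case "unification" is plain equality, so the
    first body atom a1 is simply replaced by the body b1,...,bm of the
    matching clause b0 :- b1,...,bm."""
    head, (a1, *rest) = clause
    b0, body = other
    assert a1 == b0, "a1 and b0 must be equal in the propositional case"
    return (head, body + rest)

# a0 :- a1, a2.   unfolded with   a1 :- b1, b2.
print(unfold(("a0", ["a1", "a2"]), ("a1", ["b1", "b2"])))
# → ('a0', ['b1', 'b2', 'a2'])
```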

When running a propositional definite Horn Clause program, the clause

b0:-b1,...,bm.

is fetched from the Prolog database.

On the other hand, when talking to the LLM, the clause is synthesized in two steps:

OR-step: Given a1, build several alternative clause heads; among these, let’s now look at b0.
AND-step: Given b0, build the body b1,…,bm of a clause supporting b0.

The trace (now switched off by default) is just the traversal of the OR and AND nodes. At the end, a result is returned as a (propositional) Horn clause program, together with its minimal model.
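The OR/AND traversal just described can be sketched like this (a hedged illustration only: the two LLM calls are stubbed out with fixed dictionaries, and the names `synthesize`, `OR`, `AND` are mine, not DeepLLM's actual API):

```python
# Stubs standing in for the two LLM prompts:
OR = {"goal": ["b0", "c0"]}           # alternative heads for a goal
AND = {"b0": ["b1", "b2"], "c0": []}  # body atoms supporting each head

def synthesize(goal, depth, clauses, seen):
    """Depth-bounded traversal of OR and AND nodes, accumulating a
    propositional Horn clause program as (head, body) pairs."""
    if depth == 0 or goal in seen:
        return
    seen.add(goal)
    for head in OR.get(goal, []):   # OR-step: alternative clause heads
        body = AND.get(head, [])    # AND-step: body supporting the head
        clauses.append((head, body))
        for atom in body:           # recurse on each body atom
            synthesize(atom, depth - 1, clauses, seen)

clauses, seen = [], set()
synthesize("goal", 2, clauses, seen)
print(clauses)  # → [('b0', ['b1', 'b2']), ('c0', [])]
```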

If you take a look at the synthesized Prolog program (e.g., […]) you will see a combination of rules and facts that, under SLD-resolution, shows the initiator goal to be solvable.

The limitation to propositional logic could fairly easily be lifted: first to SVO triplets (LLMs are really good at that!), which could then be fed into an ILP system to synthesize a logic program with predicates derived from verbs and subject/object components as their arguments, eventually generalized to more compact rules involving variables.

But that’s the next step, and maybe someone familiar with the latest ILP tools can take it over :slight_smile:

Paul

Perusal of the (few) empirical results in the Appendix reveals that the atoms derived are sentences in natural language surrounded by single-quotes, thus capable of being parsed as Prolog “facts” by a Prolog interpreter […]

One small note: these are actually noun phrases, not complete NL sentences, as requested in the prompt sent to the LLM.

Thank you for your patient reply, Paul. What you describe in terms of how your system works agrees with the understanding I formed yesterday, while reading your draft, and so my skepticism is not curbed.

As an aside, if you are planning to submit this preprint to one of the big machine learning conferences, my intuition is that you will need more empirical results. The few examples in the appendix won’t cut it (not least because they’re in an appendix!). For instance, you could try to show improved performance in one of the common language understanding benchmarks (I abhor benchmarks but you can’t get your paper published without beating a few, in machine learning). If you are aiming this at information retrieval there’s probably some similar goal you can try to achieve (I don’t know if the IR community relies on benchmarks in the same way as the machine learning one).

Perhaps. I’ll pass your paper to some of my colleagues. There is one PhD student in my group who will probably be very interested.

I personally am not interested in working with neural nets :slight_smile:

Thanks for correcting my imprecise language. I believe I wanted to say “phrases” rather than “sentences” (which is less incorrect). A mistake I often make.

It is quite interesting that you bring up modus tollens, in particular because it is, despite ChatGPT’s and Bard’s opinions, not just valid in classical logic but also valid in intuitionistic logic, which is a bit closer to how LLMs reason (when they actually reason :slight_smile: ).

In fact, the only LLM that gets it right when you ask “Is modus tollens a theorem in intuitionistic logic?” is Anthropic’s Claude 2 (see https://claude.ai).

Your modus tollens example can be expressed in Horn clause logic (with Prolog notation) as:

day:-light.

false:-day.

which is a Horn clause but not a definite Horn clause. The prover used in DeepLLM (see https://github.com/ptarau/recursors/blob/0a9a30b09d1ae992cc8e75b18174ec42328897f5/deepllm/horn_prover.py) can happily handle that and return an empty model.
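The behavior described can be sketched with a minimal propositional Horn prover: forward chaining to the least fixpoint, with the reserved atom "false" acting as an integrity constraint (this is my own illustration, not the actual horn_prover.py code; I also add the fact light. so that the constraint fires):

```python
def least_model(clauses):
    """Compute the least Herbrand model of a propositional Horn program,
    given as (head, body-list) pairs.

    If the reserved atom "false" becomes derivable, the program is
    inconsistent and the empty model is returned."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in model and all(a in model for a in body):
                model.add(head)
                changed = True
    return set() if "false" in model else model

# day :- light.   false :- day.   light.
program = [("day", ["light"]), ("false", ["day"]), ("light", [])]
print(least_model(program))  # → set()
```

Without the constraint clause, the same program would yield the model {"light", "day"}; with it, deriving "false" collapses the result to the empty model.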

In fact, DeepLLM takes advantage of modus tollens when you invoke the Advisor refiner (see https://github.com/ptarau/recursors/blob/0a9a30b09d1ae992cc8e75b18174ec42328897f5/deepllm/refiners.py), an oracle that “double checks” the synthesized OR-nodes or AND-nodes. Negative advice from the oracle propagates back on the reasoning chain and dismisses it.