Prolog in the age of generative AI

How do people feel about the following proposition: generative AI may breathe new life into Prolog?

ChatGPT-4 is able to return its output as Prolog terms – to some degree, at least. And it can “translate” Prolog terms into decent English, for example.

This suggests one can reason over the inputs and outputs of the system – and connect those to existing knowledge bases, ontologies, WordNet – presumably something like what Watson does?

Speaking of which, does anyone have suggestions for large-coverage knowledge resources one might use?

Building a Vector Database to Make Use of Vector Embeddings

FYI: TerminusDB is built on SWI-Prolog.


Are you asking for something like The Pile?

The Pile: An 800GB Dataset of Diverse Text for Language Modeling (pdf)

Give a reason for the proposition, because I don’t see the case for it myself.

ChatGPT is able to output Prolog terms. It can also output C++ and Python. All programming languages can be reasoned about. I don’t think ChatGPT is especially good at formal logic.

I’m not saying your proposition is false, but if it is true [and it might be through a different generative model], I would like to see how that happens.

3 Likes

FYI

LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT (pdf)

Some food for thought.

One speculated idea about how GPT-4 works is that it is a Mixture of Experts (MoE) (blog).

Following that line of thought, one could imagine that new experts are added to GPT-4 by first trying them out as an alpha or beta.

One such current alpha is the Code Interpreter (ref). Don’t think of it as something that merely generates Python code; think of it as an expert with abilities the other experts lack, such as math or data analysis.

For details on this line of thought see

So a similar expert based on Prolog, handling other kinds of questions and problems, would be right in line with what you seek.

If you expand this out a bit, then s(CASP), constraints, etc. should also be considered.


Note: Code Interpreter is considered a ChatGPT plugin; I also created a ChatGPT SWI-Prolog plugin as a proof of concept. (GitHub) (Details/Demo)

1 Like

Take a look at https://deepllm.streamlit.app/ to see how a Horn Clause resolution-driven LLM explores a topic in depth, quickly, truthfully, and hallucination-free.

I have also made a quick YouTube intro to DeepLLM, showing how it works; see DeepLLM - YouTube.

The Prolog program that it generates can be pasted out and played with. Well, maybe some tabling is required if the LLM tries to bite its own tail :slight_smile:
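To make the “bite its own tail” remark concrete, here is a minimal sketch (the proposition names are invented, not actual DeepLLM output): with the table directive the query terminates; without it, Prolog loops forever on the cycle.

% Two mutually recursive propositions, as a generated program might contain:
:- table 'AI planning'/0, 'automated reasoning'/0.

'AI planning' :- 'automated reasoning'.
'automated reasoning' :- 'AI planning'.

% ?- 'AI planning'.
% false.   (terminates under tabling; loops without the directive)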

For the academically inclined, there’s also a draft paper about it at: [2306.14077] Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles

4 Likes

Thank you for these links. What I had in mind was something like Wikidata.

Incidentally, I asked both ChatGPT-4 and Claude 2 to return Wikidata item IDs for the entities contained in a certain passage (the first page of Frederick Douglass’ memoir). Both were happy to return IDs – and did a decent job at NER – but the IDs themselves were complete nonsense.

@ptarau this looks amazing and is exactly the direction I was vaguely thinking of. Could you use volunteer(s) to help with this project?

Definitely. The DeepLLM code is organized as an easy-to-import Python package, meant to be just the core of what can be developed on top of it.

Some unexplored directions involve using the generated Prolog code as a kind of “propositional Horn Clause ontology”, in which reasoning happens in more natural ways than in the RDF world.
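For illustration only (these clauses are invented, not generated output), such a propositional Horn Clause ontology is just a set of quoted-atom facts and rules, and reasoning over it is ordinary Prolog querying:

% A hypothetical fragment of a propositional Horn Clause ontology:
'machine learning' :- 'statistical learning theory'.
'machine learning' :- 'neural networks'.
'neural networks' :- 'backpropagation', 'gradient descent'.
'backpropagation'.
'gradient descent'.
'statistical learning theory'.

% ?- 'machine learning'.
% true.

A simple meta-interpreter over such clauses could also collect the proof tree as a justification.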

Also, there are a few sketched applications in the demos folder that involve the truth_rater, an oracle that keeps the recursive explorer close to ground-truth facts extracted from Wikipedia or reference documents.

Another practical direction would be to connect DeepLLM and SWI-Prolog (e.g., via PySwip) to take advantage of Prolog’s full reasoning machinery on the code DeepLLM extracts from the LLM’s “mind”.
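As an aside, PySwip covers the Python-to-Prolog direction; going the other way, SWI-Prolog’s library(janus) can call into Python. A purely hypothetical sketch (deepllm_api and run/1 are made-up names, not DeepLLM’s actual API):

% Sketch: ask a Python-side explorer for clauses and load them into Prolog.
:- use_module(library(janus)).

explore(Topic, Clauses) :-
    py_call(deepllm_api:run(Topic), Strings),             % one string per clause
    maplist([S, C]>>term_string(C, S), Strings, Clauses),
    maplist(assertz, Clauses).                            % now query them natively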

1 Like

FYI

Paul has noted this on the OpenAI Discourse forum; also check there for others interested in this topic.

Papers on Prolog combined with LLMs and/or neural networks are not common; as they relate to this topic, I just noticed these three by the same authors.

“Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text” by Zhun Yang, Adam Ishay and Joohyung Lee

“Leveraging Large Language Models to Generate Answer Set Programs” by Zhun Yang, Adam Ishay and Joohyung Lee

“NeurASP: Embracing Neural Networks into Answer Set Programming” by Zhun Yang, Adam Ishay and Joohyung Lee


Note: these turned up in my daily check of this query:

https://arxiv.org/list/cs.SC/recent

1 Like

FYI

Noting this here, as readers of this post may wonder whether such a thing exists. Created by user RdR:

Prolog to GPT API

Thanks for the references, Eric. I will be doing further work on Prolog interfacing with LLMs soon.

I got a bit sidetracked with Prolog to GDELT, which I also want for my project.

As I often express, all credit goes to you for your hard work. I merely provided a reference.

I’m delighted that I could assist you, especially since I recognized that what you were doing at that time was innovative and ahead of the curve. It’s great to see that your efforts are finally coming to fruition.


I hope this is correct

Yep, that’s the one. It turns out that the best access to GDELT is through Google’s BigQuery, which is why I wrote the pro2sql library.

I was hopeful about Google’s Logica language, but it turns out to be in the Datalog family, with a much more elaborate syntax that would be even harder to interface with (as far as I could tell).

Paul,

I’ll post here because I’m unsure where you’d prefer to discuss your system. If useful I could set up a Discord instance.

In any case: one thing I’m noticing is that it’s not fast. Do you have a sense of where the bottlenecks may lie?

The main bottleneck is the time the LLM takes to answer (about 0.5-3 seconds or so). If the recursion depth is >= 2, that accumulates quickly, especially if Rater or Advisor oracles are used to constrain the dialog thread. After that, everything is cached, and running the same query again is almost instant. Running locally (I do that with Vicuna on a Linux box with an RTX 3090) is comparable in performance. Running the OR-nodes in parallel would be an easy way to speed things up, provided that the LLM accepts a large number of concurrent queries.
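For what it’s worth, on the SWI-Prolog side the parallel OR-node idea could be expressed with concurrent_maplist/3; llm_expand/2 below is a dummy stand-in for a real prompt/response round trip:

:- use_module(library(thread)).

% Replace this stand-in with an actual call to the LLM endpoint.
llm_expand(Prompt, Expansion) :-
    format(string(Expansion), "expansion of ~w", [Prompt]).

% Expand all alternative subgoals of an OR-node concurrently.
expand_or_nodes(Alternatives, Expansions) :-
    concurrent_maplist(llm_expand, Alternatives, Expansions).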

Understood. And thanks for preempting the question about running locally on a 3090, which is exactly what I was going to ask.

FYI

You both should be able to send direct messages, AKA Discourse personal messages, e.g.:

Click the circle icon for a user, then click Message to compose a personal message.

1 Like

Thank you for sharing links to your work.

I am skeptical that it’s possible to do Resolution with free-form text, such as that generated by a language model (large or small!), so I had a look at your paper ([2306.14077] Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles) to see how you did it. I have to say that I remain skeptical, in particular of the claim that your approach is more than superficially similar to, or perhaps inspired by, SLD-Resolution. For example, your paper claims that:

“Our approach will follow closely the SLD-resolution algorithm for pure Horn Clause logic [13,12].”

I hope that you will welcome my criticism. If so, here are the reasons for my skepticism.

Perhaps the most important reason for my skepticism is that Resolution, including SLD-Resolution, is known to be sound, and refutation-complete (or complete with subsumption). We know of no other inference system with this property that consists of a single rule and is therefore “machine-oriented” (i.e. computable by a simple algorithm). I cannot see where your approach follows “closely the SLD-Resolution algorithm”, or where it provides such guarantees.

To begin by finding some common ground, let’s clarify that “SLD” Resolution means Linear Resolution, restricted to Definite clauses and with a Selection rule. Linear means that each resolvent is formed from the goal derived in the last resolution step and a definite program or unit clause from the background logic program. A “Selection rule” selects the next literal to resolve upon. In SLD-Resolution as implemented in Prolog, the selection rule is that literals are resolved from left to right, from the head to the body of a clause.
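For concreteness, both restrictions are visible in the classic “vanilla” meta-interpreter, where each step resolves only the current goal against a program clause (linearity) and always takes the leftmost literal first (the selection rule):

% The classic vanilla meta-interpreter for definite programs.
solve(true) :- !.
solve((A, B)) :- !,
    solve(A),            % leftmost literal is selected first
    solve(B).
solve(A) :-
    clause(A, Body),     % resolve the goal against a program clause
    solve(Body).         % the body becomes the next (linear) resolvent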

Resolution, of course, is the Resolution rule as described by J. Alan Robinson: given the conjunction of two clauses (¬φ ∨ χ) ∧ (φ ∨ ψ), eliminate the contradiction ¬φ ∧ φ and derive the new clause χ ∨ ψ. For example, from (¬p ∨ q) and (p ∨ r) one derives (q ∨ r).

It seems that your work eliminates each of those assumptions:

  • The Resolution rule is nowhere to be found. It seems replaced by “LLM-driven dynamic clause head creation” (page 2, Introduction).

  • Definite clauses are eliminated as “the underlying logic is propositional” (idem).

  • The selection rule is eliminated: “SLD-resolution’s clause selection via unification is replaced by LLM-driven dynamic clause head creation” (idem).

  • There is no mention of the Linear character of SLD-Resolution.

Careful reading of Sections 4.1 (“The recursive descent”) and 4.5 (“The Logic Programming Connection”) further confirms the above observations. The text claims that the effect of the OR and AND prompters in expanding a head h (not defined earlier in the text) “can be described” as sets of Horn clauses. But why is that the case?

I also do not understand the following claim (from the abstract):

  • At the end, the unique minimal model of a generated Horn Clause program collects the results of the reasoning process.

A “unique minimal model of a generated Horn Clause program” must refer to the Least Herbrand Model (LHM) of a logic program, i.e. if Π is a logic program, the set of (logical) atoms A such that Π ∪ {← A} ⊢ □. But instead of such a set of atoms, your system is said to return “traces of justification steps clearly explaining where they are derived from”. Where, then, is the LHM generated?
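For reference, the LHM of a propositional definite program can be computed by iterating the immediate-consequence operator T_P to a fixpoint. A minimal sketch in SWI-Prolog, using a rule(Head, BodyList) representation of my own (not the paper’s):

% Naive bottom-up evaluation: iterate T_P until nothing new is derived.
lhm(Rules, Model) :-
    tp_fix(Rules, [], Model).

tp_fix(Rules, M0, Model) :-
    findall(H,
            ( member(rule(H, B), Rules),
              forall(member(P, B), memberchk(P, M0)) ),
            Hs),
    sort(Hs, Derived),
    ord_union(M0, Derived, M1),
    ( M1 == M0 -> Model = M0 ; tp_fix(Rules, M1, Model) ).

% ?- lhm([rule(a,[]), rule(b,[a]), rule(c,[b,missing])], M).
% M = [a, b].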

Perusal of the (few) empirical results in the Appendix reveals that the atoms derived are sentences in natural language surrounded by single-quotes, thus capable of being parsed as Prolog “facts” by a Prolog interpreter, for example:

‘Symbolic representation learning’.
‘Knowledge-based systems’.
‘Automated theorem proving’.
‘First-order logic inference’.

These are, I suppose, propositions, so they can be seen as ground atoms in the Herbrand base of a predicate. But which predicate is that? And what is achieved by “atomising” natural language sentences like that?
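To make the question concrete: pasted into SWI-Prolog, each quoted sentence is read as its own zero-arity predicate, so a generated rule (the clauses here are invented) relates whole propositions rather than terms of a shared predicate:

'First-order logic inference' :- 'Automated theorem proving'.
'Automated theorem proving'.

% ?- 'First-order logic inference'.
% true.   (each quoted atom is a distinct 0-arity predicate)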

I suspect these criticisms have been put to your work before, and I’m curious what responses you have formulated.