LLM and Prolog, a Marriage in Heaven?

A couple of years ago a number of papers took the same approach to planning: they translated natural language instructions to PDDL (the Planning Domain Definition Language, used by all mainstream planners) or, alternatively, to Python, and then passed the result to a planner or to a robot’s API.

For example, see the following preprint:

Despite the claims in that paper, subsequent work showed severe problems with the approach. See the following for a review of planning using LLMs:

The paper you cite follows the same approach, except that it translates reasoning problems to definite programs rather than PDDL, and hands them off to a Prolog interpreter rather than a planner. So I anticipate the same failure modes as with the earlier work, which btw looked like it worked until some experts on planning had a look and pointed out the pressure points that cause the whole effort to collapse.

The problem in general is that LLMs cannot be relied upon to translate to a formal language accurately, unless they’ve already seen an accurate translation of what they are asked to translate. Translating natural language to a formal language itself requires decision-making that implies understanding of both languages and of the domain of discourse, and in novel domains such understanding is absent from LLMs. In other words, the proposed approach might yield a decent Prolog boilerplate generator, but it will be brittle and easy to break with simple techniques (like changing the names of symbols, as in the obfuscated Blocksworld domain used to demonstrate the brittleness of LLMs-as-planners).
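
To make the renaming point concrete, here is a toy sketch in Python. It is not code from any of the papers mentioned; the planner, the two-block problem and every name in it (`plan`, `rename_problem`, the `p1`/`a1`/`o1` symbols) are made up for illustration. It builds a tiny Blocksworld instance, renames every symbol consistently, and solves both versions with the same breadth-first search.

```python
from collections import deque

def plan(init, goal, actions):
    """Breadth-first search over states; returns a shortest list of action names, or None."""
    start, goal = frozenset(init), frozenset(goal)
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for name, pre, add, delete in actions:
            if pre <= state:
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

def blocksworld_actions(blocks):
    """Ground actions as (name, preconditions, add list, delete list)."""
    acts = []
    for x in blocks:
        free = {("clear", x), ("ontable", x), ("handempty",)}
        acts.append((("pickup", x), free, {("holding", x)}, free))
        acts.append((("putdown", x), {("holding", x)}, free, {("holding", x)}))
        for y in blocks:
            if x != y:
                held = {("holding", x), ("clear", y)}
                stacked = {("on", x, y), ("clear", x), ("handempty",)}
                acts.append((("stack", x, y), held, stacked, held))
                acts.append((("unstack", x, y), stacked, held, stacked))
    return acts

def rename(term, mapping):
    """Replace each symbol in a tuple according to mapping."""
    return tuple(mapping.get(s, s) for s in term)

def rename_problem(init, goal, actions, mapping):
    """Apply the same renaming to facts, goal and all actions."""
    r = lambda facts: {rename(f, mapping) for f in facts}
    new_actions = [(rename(n, mapping), r(p), r(a), r(d)) for n, p, a, d in actions]
    return r(init), r(goal), new_actions

# Block a sits on block b; the goal is b on a.
init = {("on", "a", "b"), ("ontable", "b"), ("clear", "a"), ("handempty",)}
goal = {("on", "b", "a")}
actions = blocksworld_actions(["a", "b"])

# Meaningless replacement names for every predicate, action and object.
mapping = {"on": "p1", "ontable": "p2", "clear": "p3", "holding": "p4",
           "handempty": "p5", "pickup": "a1", "putdown": "a2",
           "stack": "a3", "unstack": "a4", "a": "o1", "b": "o2"}
inverse = {v: k for k, v in mapping.items()}

original_plan = plan(init, goal, actions)
obfuscated_plan = plan(*rename_problem(init, goal, actions, mapping))
decoded_plan = [rename(step, inverse) for step in obfuscated_plan]

print(original_plan)
print(obfuscated_plan)
print(decoded_plan == original_plan)
```

The search only ever looks at the structure of the problem, so the renaming makes no difference to it. If a translator's accuracy drops when the same renaming is applied to its input, that suggests it is leaning on familiar names rather than on the structure.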
