Personal notes on health, AI, and the direction of SWI-Prolog

Dear SWI-Prolog user,

This is a set of rather personal notes. Some users may have noticed
changes in how I manage SWI-Prolog. Most recent contributions have
been implemented by Claude Code and I’ve not been very active on the
forum. The reason for this is that I was diagnosed with rectal cancer
in August last year. At the end of April I had the main surgical
intervention. Together with some complications this caused a lot of
discomfort, notably not being able to sit for more than 10 to 15 minutes. There
will be some more uncomfortable treatment. The good news is that,
unless I’m very unlucky, I should fully recover.

Not being able to predict how I would function during the various
treatment steps as well as problems concentrating caused a serious
change of activities. Except for small tasks for long-term commercial
customers I have not done any commercial work in that period. Instead,
I concentrated on improving SWI-Prolog, the tooling in particular because
it distracts, is fun and I do not have to deal with deadlines.

After the main surgery, traditional programming was not an option.
This is a perfect moment for contemplation and trying new paths.
Inspired by @EricGT I have been exploring the use of AI, Claude Code
in particular. I explored ChatGPT about a year ago to conclude it
might be nice for student exercises, but not for programming. Half a
year later I used ChatGPT for porting the XPCE GUI library to SDL3.
That was a better experience, although overall still a mixed blessing.
It was mostly good at finding the right API calls from SDL3, Pango and
Cairo, but it plumbed them together the wrong way. Pointing it at a bug
typically resulted in a completely different new implementation that
was buggy in some other way, so eventually I had to do most of the debugging.
Agentic AI, first OpenAI Codex, is a game changer though. I quickly
switched to Anthropic’s Claude Code, mostly because they seem to be
most aware of the ethical implications of AI and are considered
state-of-the-art.

I used Claude Code mainly to enhance the tooling, mostly because it is
fun and doesn’t require too much deep thought. Working with the AI
fits my current physical state perfectly: sit down briefly to review and
adjust the AI-generated plan, review and test the code, then type a few
sentences of new instructions. Next, one can do exercises or other
small tasks while Claude Code works for a few minutes, or sometimes
for several hours, as when it wrote a random test suite for the
command-line editor: simplify a failed run, fix it, and repeat until
1,000 random edits show no issues.

Main tasks tackled (contemplation on that below):

  • Get libedit (the commandline editor) to work properly on normal
    terminals, the Windows console and the Epilog console.
  • Make Epilog deal with 256-colour and direct-colour modes, strike-through and
    proper link (hover) feedback. Required significant changes to the basics
    of the terminal. Implemented by Claude Code with minimal
    intervention.
  • Get libedit, XPCE and Prolog to deal with the full Unicode range,
    including double-width glyphs (Emoji, several Asian scripts) and
    Unicode combining characters. This included changing the text
    representation used in XPCE. All this was implemented by Claude
    with minimal intervention from my side. It tends to use a bit too
    much copy/paste programming, but if you point at that it is happy
    to generalise and reuse.
  • Implement a Unicode symbol picker in XPCE. It needed some help
    and a few iterations, but programming in XPCE+Prolog is not really
    mainstream!
  • Deal with refining the Prolog Unicode syntax extension. Both
    implementation and design. Handle feedback from the PIP working
    group on this topic.
  • Update XPCE’s look, notably by designing SVG icons and adding support
    for SVG icons to XPCE. I’m not really impressed by Claude’s ability
    to design icons. It took a lot of guidance and is still far from
    the quality that a human graphic designer would have produced. It
    does look less 1990 style though :slight_smile:
  • Implement a 2D transformation matrix in XPCE, allowing for scaling,
    rotation and shearing. This is a giant task that was completed
    in two days with only a short planning stage and few prompts.
  • Same for implementing opacity for graphical objects (much simpler
    though).
  • Refactor the XPCE reference documentation that was stored in saved
    binary blobs representing XPCE objects. Its task was to extract
    the documentation, store it in Markdown files and add a plugin to
    PlDoc to make it understand the XPCE documentation conventions.
    This required quite a bit of guidance. But then, it concerns a
    lot of legacy material in a rather alien environment.
  • Modernise the documentation of many classes, notably updating it
    for the changes due to moving from X11 to SDL3 (still need more
    work).
  • Refactor the macOS binary from a bundle to a .pkg installer
    and include all the automation to sign the package. I had started
    to do this by hand, but gave up as it was too complicated and
    frustrating. Claude did the job quite quickly and with not much
    intervention.
  • Fix quite a few bugs. I’m most impressed by Claude’s ability to
    fix bugs. With very little input it often finds the culprit
    remarkably quickly. It is not always right about the details
    nor the fix. Even if it is wrong, its analysis helps a lot in
    nailing the problem. These days I start by giving it the URL of
    a bug report and see what it comes up with.
  • Write a lot of (PlUnit) tests as well as writing, improving
    or updating documentation.

Contemplation

What did I learn? First of all, that agentic coding systems have
matured immensely, in my experience to the level where these tools can
no longer be ignored. Is this good or bad? First of all, it is simply
there and we cannot uninvent it. I first considered this a
threat as it makes what I liked doing for over 40 years redundant.
Right now, I’m quite positive. Claude does a lot of the boring work
that also belongs to system development: bug hunting, writing tests,
updating documentation, reading through API documentation,
investigating portability issues (which platforms support X, if I
drop support for an outdated API, would it affect any still supported
platform, etc.). Instead, I can concentrate on what I want. If I
want to do some programming I still can. I can leave the finishing
touch to the AI though: write tests, fix issues and document; the
tasks that typically take most of the time and I like doing least.
I tried this and it was a pleasure!

What about our profession? I learned programming by doing small tasks
in the then young XPCE GUI library developed by Anjo Anjewierden. I
learned a lot by looking at his clean programming style and design.
Richard O’Keefe taught me a lot from work I did on his Thief editor
(a very lightweight Emacs clone used in Edinburgh in the 1980s, when
Emacs was an abbreviation for Eight Megabytes And Continuously
Swapping) as well as the many comments he had on SWI-Prolog. In
short, I learned by imitating giants and from their comments. Now I
have the experience to prompt the AI, keeping an overall overview
of where the system should move, its architecture as well as general
principles on maintaining good programming style. How will young
professionals gain these skills?

These developments do raise some existential questions. I’ve spent
quite a bit of time on SWI-Prolog’s development tooling and made a lot
of progress with help from the AI. However, the AI doesn’t need most
of this tooling for writing Prolog programs … Crash reports with
good diagnostics are useful to the AI, but it doesn’t need a GUI
debugger, tooling that provides insight into the program structure,
editors, syntax highlighting, etc.

I tend to believe there is still a role for Prolog itself as it is
a good language for knowledge intensive tasks that require reliable
reasoning. For any such task, a Prolog program will outperform AI
by a large margin while providing trustworthy results rather than
results that are 95% correct, but wrong in some weird way in the
other 5%. Formalising knowledge in Prolog typically makes the logic
easier to assess and verify than in imperative languages. I do believe most
of the Prolog code will soon be AI generated though.

Final words

I hope (and suspect) that I will soon recover enough to resume work
without constraints due to concentration problems and not being able
to sit comfortably. I’ll continue work on SWI-Prolog and hope to
restart commercial work in a couple of months. Donations (thanks!) as
well as commercial work are important to keep the project financially
healthy. Commercial work is also important to find new challenges.

I wonder about the overall direction in which (SWI-)Prolog should
evolve. Improving tooling aimed at human program development does not
seem a good investment. Possibly the tooling still has a role in
maintaining an overview of the code? Tools like the profiler,
coverage analysis and static analysis can still play a role in
assessing bottlenecks, test completeness and potential bugs. With
modern graphics, XPCE can again play a role in providing interactive
GUI applications as the AI is quite capable of doing so.

I’m considering some refactoring of the Prolog core. Notably the way
engines/threads are threaded through the system is awkward. Too much
work to do by hand, but if the AI can do it, it is worth considering.
Actually exploiting the 64-bit Prolog cells we have now (also on
32-bit platforms) rather than using indirections to make all data types
fit into 32 bits can simplify the system and provide some performance
improvement. Implementing proper clause indexing for typical Prolog
predicates using a handful of clauses and for which first argument
indexing does not work well is another plan that has been delayed many times. More
static analysis tools? Ideas on what has most impact are welcome.

Enjoy --- Jan

P.s. I wrote this myself :slight_smile: I did ask Claude to fix spelling and grammar …

Best wishes for your recovery. I’m certain that whatever direction SWI-Prolog eventually progresses towards, it will be much appreciated by this community.

Ian

Thank you so much for all your work on SWI-Prolog and especially the tooling side.
The GUI debugger is one of the pieces of GUI software that I use most regularly with Prolog, and it is a very precious tool for understanding intuitively how Prolog executes.

I can only join in these best wishes.

Considering what you’ve been faced with, your responsiveness to questions and workrate has been astonishing. I hope you continue to make a rapid and full recovery, and thank you for all the help you’ve given me and the entire community - I think it’s safe to say we are all rooting for you!

I found your experience of and musings on AI interesting, and they closely parallel mine. Like any new tool, how you use it needs care and consideration.

That’s exactly my experience. I’ve written a fairly old-school Expert System in Prolog that looks for configuration issues and unused resources in a large Cloud estate. The incoming data is JSON, the output is a set of JSON “symptoms”. The rules are mostly simple and are all self-contained, a very good use case for Prolog - and yes, it’s fast. The next step was analysis, grouping and remediation planning - perfect for an LLM, right? Hmm, up to a point. Whilst the LLM is excellent at planning and prioritisation, its attempts at remediation were either OK-ish or catastrophically bad - and you never knew which you’d get. For example, it would happily delete storage resources (which can’t be recovered) whilst point-blank refusing to delete Compute resources (which can be recreated). When I got it to agree to delete Compute resources, it then decided that the best way to fix simple attribute-edit problems was to completely delete and recreate the resources… :exploding_head: I ended up getting it to write an “Architectural guidance for fidelity-critical LLM workflows” for me, the crux of which is that LLMs need to be the “Jam in the sandwich” between deterministic input layers and deterministic output layers - both of which Prolog is an excellent match for.
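
As a rough illustration of the “Jam sandwich” idea described above (a minimal sketch, not the actual system): the input layer turns JSON resources into JSON “symptoms” using self-contained rules, and the output layer only accepts a proposed remediation if no policy rule forbids it. The resource keys (`type`, `id`, `attached_to`, `public`), the issue names, the action format and the predicates `symptom/2`, `analyse/2`, `forbidden/2` and `permitted/1` are all hypothetical. Only `library(http/json)` and the dict built-ins are assumed to be standard SWI-Prolog.

```prolog
:- use_module(library(http/json)).
:- use_module(library(lists)).

% Input layer: one self-contained rule per symptom. A resource is a
% dict read from JSON. All keys and values here are hypothetical.
symptom(R, _{resource: Id, issue: "unattached_volume"}) :-
    get_dict(type, R, "volume"),
    get_dict(attached_to, R, null),
    get_dict(id, R, Id).
symptom(R, _{resource: Id, issue: "public_bucket"}) :-
    get_dict(type, R, "bucket"),
    get_dict(public, R, true),
    get_dict(id, R, Id).

% Collect every symptom for every resource in the list.
symptoms(Resources, Symptoms) :-
    findall(S, (member(R, Resources), symptom(R, S)), Symptoms).

% JSON text (an array of resources) in, JSON text (symptoms) out.
analyse(JSONIn, JSONOut) :-
    atom_json_dict(JSONIn, Resources, []),
    symptoms(Resources, Symptoms),
    atom_json_dict(JSONOut, Symptoms, []).

% Output layer: an action proposed by the LLM is accepted only if no
% deterministic policy rule forbids it (hypothetical policy).
forbidden("delete", "volume").   % storage cannot be recovered
forbidden("delete", "bucket").

permitted(Action) :-
    get_dict(op, Action, Op),
    get_dict(type, Action, Type),
    \+ forbidden(Op, Type).
```

With this shape, the LLM sits between `analyse/2` and `permitted/1`: it sees only the symptoms, and anything it proposes is checked by rules that give the same answer every time.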

On the flip side, I now have a SWI-Prolog plugin for IntelliJ, entirely LLM generated, I haven’t even looked at the code. It does most of the things I want - code formatting, symbol search, setting breakpoints & handoff to the SWI debugger etc. However it took prompting spread over many days and 135 commits to get there, and burned a frightening number of tokens. But on the other hand I had no interest in becoming an IntelliJ plugin developer.

Much of the current LLM buzz is around LLMs as coding agents, and I think that use case flatters to deceive. Codegen is usually a linear sequence of smaller iterations - there are a bazillion ways to implement something, and if the LLM doesn’t get it right it can just be told to try again. Business processes however have to be predictable, repeatable and auditable - all areas that LLMs are weak on. A blended approach as in the “Jam sandwich” model is pretty much mandatory if you want to avoid chaos, and I think the low impedance mismatch between Prolog & LLMs makes Prolog an attractive option.

I’ve just done exactly that, several times:

  • I needed to add support for some more Cloud resource types, so I pointed the LLM at the Cloud REST API docs and some examples of existing handcrafted Prolog code, and the result was probably 85% of the way there.
  • I was asked to add CLI invocations to the output of my Cloud Resource analysis system, so I pointed ChatGPT at the CLI docs and the Prolog source of the system and told it to generate a DCG. Again it got pretty close on the first pass.

There are weaknesses, however: it often misses that there are standard library calls to do what it wants, and it ends up spewing out a lot of unnecessary code. If you point that out it will self-correct, but you certainly need to review its output carefully. I suspect that’s because, relatively speaking, there’s a far smaller corpus of Prolog code for it to be trained on.

It’s not “core” functionality, but how about an MCP server library as a first-class citizen of the SWI ecosystem? There are some things out there, but they seem mostly about providing access to a Prolog interpreter via MCP, rather than being a framework for writing MCP servers in Prolog. Most of the bits you’d need (HTTP, JSON) are already there and there’s already a sophisticated HTTP server framework in SWI.

Dear Jan,

People who know you will not be surprised by the strength, good spirits, and generosity with which you have carried all this. But that does not make it any less admirable!

I just want to add some thoughts on Claude, trying not to oversimplify. Photography automated faithful visual reproduction, but it did not destroy painting; it changed its center of gravity. AI coding assistants automate much of routine code production, but they do not eliminate the need for human judgement. I like the analogy because painting was never only “making realistic images”, just as programming was never only “typing code”. Painting involves vision, meaning, composition, and taste; computer science and software engineering involve modeling, abstraction, correctness, architecture, constraints, and responsibility. AI reduces the value of merely producing code, but it increases the value of knowing what code should exist, why it should exist, and whether it is good.