ChatGPT SWI-Prolog assistant

I have created a SWI-Prolog assistant on ChatGPT. You can find it on ChatGPT - SWI Prolog Assistant

I just gave it some obvious information about what I’d like it to do, then tried two questions: one to read a CSV file and one to create a web service that exchanges JSON. I won’t say the results were perfect, but they were both pretty close and useful.

So, please give it a try. If you have suggestions for creating a better one, please help. Please share your experience. If proven useful, I think we should consider integrating it into the SWI-Prolog web site.


I did some interaction with the GPT builder, and it came up with this list of issues for which it considers the documentation weak. I’m quite impressed and copied it here to see what we can do about this. Ideas?

Here’s a summary of advanced SWI-Prolog topics that are powerful but often under-documented or scattered across forums, internal docs, or academic papers:


:wrench: 1. library(tries) for Efficient Term Indexing

  • Not just for tabling; supports explicit insertion, enumeration, and retrieval of ground terms.
  • Often underused for large-scale indexing tasks.
  • Sparse documentation on trie_gen/3, especially how to extract patterns efficiently.
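To illustrate, here is a minimal sketch of explicit trie use outside tabling; the predicate names are from the SWI-Prolog manual, but the data and `tries_demo` name are made up:

```prolog
% Build a trie of ground terms and enumerate matching entries.
tries_demo :-
    trie_new(Trie),
    forall(member(Key-Value,
                  [point(1,2)-a, point(1,3)-b, line(0,0)-c]),
           trie_insert(Trie, Key, Value)),
    % trie_gen/3 unifies Key with each stored term, so a partial
    % pattern acts as a filter over the trie.
    forall(trie_gen(Trie, point(1, Y), V),
           format("point(1,~w) -> ~w~n", [Y, V])).
```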

:abacus: 2. Constraint Solvers (clpfd, clpq, clpr)

  • While basic usage is well-documented, advanced propagation tuning, labeling strategies, and integration with DCGs or tabling aren’t.
  • Fine control over domain shrinking or symbolic constraints can require diving into source or papers.
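As a small example of the labeling strategies mentioned above (a toy problem, not a tuning guide; `strategy_demo` is a made-up name):

```prolog
:- use_module(library(clpfd)).

% The same constraint problem solved under different labeling
% strategies; compare e.g. [leftmost], [ff] (first fail) and [max(Z)].
strategy_demo(Strategy, [X,Y,Z]) :-
    [X,Y,Z] ins 1..9,
    all_different([X,Y,Z]),
    X + Y #= Z,
    labeling(Strategy, [X,Y,Z]).

% ?- strategy_demo([ff], Vars).
% ?- strategy_demo([max(Z)], [X,Y,Z]).
```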

:package: 3. Module System & multifile/discontiguous Pitfalls

  • Complex when dealing with meta-predicates and cross-module predicate visibility.
  • Module-qualified calls and operator declarations across modules are tricky.
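A minimal sketch of the meta-predicate declaration that makes cross-module goal arguments work (module and predicate names are hypothetical):

```prolog
:- module(twice, [twice/1]).

% Without this declaration, Goal would be resolved in module `twice`
% rather than in the caller's module.
:- meta_predicate twice(0).

twice(Goal) :-
    call(Goal),
    call(Goal).
```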

:light_bulb: 4. Tabling with Custom Strategies

  • :- table is well-documented, but call subsumption, answer subsumption, and incremental tabling require careful study.
  • Few examples show integration with library(tries) or with clpfd.
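For instance, answer subsumption (“moded tabling”) keeps only the minimal answer per call, as in this shortest-path sketch (the graph data is made up):

```prolog
% Keep only the minimum cost per (From,To) pair.
:- table shortest_path(_,_,min).

edge(a, b, 1).
edge(b, c, 2).
edge(a, c, 5).

shortest_path(From, To, Cost) :-
    edge(From, To, Cost).
shortest_path(From, To, Cost) :-
    edge(From, Mid, C1),
    shortest_path(Mid, To, C2),
    Cost is C1 + C2.

% ?- shortest_path(a, c, Cost).   % only the minimal cost survives
```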

:thread: 5. Engines and Coroutining (engine_create/3, freeze/2, when/2)

  • Powerful for user-defined control structures, lazy evaluation, and managing concurrency.
  • Sparse use cases outside academic papers or advanced demos.
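Two small sketches of the primitives named above: freeze/2 delaying a goal until its variable is bound, and an engine used as an explicit generator (predicate names are made up):

```prolog
% freeze/2: the format/2 goal runs only once X gets bound.
freeze_demo :-
    freeze(X, format("X bound to ~w~n", [X])),
    X = 42.

% Engines as first-class generators: pull answers one at a time.
first_three(Answers) :-
    engine_create(N, between(1, inf, N), Engine),
    engine_next(Engine, A),
    engine_next(Engine, B),
    engine_next(Engine, C),
    engine_destroy(Engine),
    Answers = [A, B, C].
```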

:globe_with_meridians: 6. Foreign Language Interfaces

  • library(ffi), C-interface (SWI-Prolog.h), Python interface (pyswip), and Java (JPL).
  • Setup and cross-platform handling are challenging; SWI’s PL_register_foreign() semantics often require diving into examples.

:bar_chart: 7. Message Queues, Threads, and HTTP Servers

  • Thread communication via message_queue_create/1, thread_send_message/2, etc., is robust but poorly abstracted.
  • HTTP libraries (http_dispatch/1, http_server/1) often have undocumented integration with Prolog’s logical control flow.
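A minimal sketch of the queue-based thread communication referred to above, with one job queue and one reply queue (the `worker` protocol is made up):

```prolog
% Worker thread multiplies the numbers it receives and replies.
worker(Jobs, Replies) :-
    thread_get_message(Jobs, job(X, Y)),
    Product is X * Y,
    thread_send_message(Replies, result(Product)).

queue_demo(Result) :-
    message_queue_create(Jobs),
    message_queue_create(Replies),
    thread_create(worker(Jobs, Replies), Tid, []),
    thread_send_message(Jobs, job(6, 7)),
    thread_get_message(Replies, result(Result)),
    thread_join(Tid, _Status),
    message_queue_destroy(Jobs),
    message_queue_destroy(Replies).
```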

:chart_increasing: 8. Just-In-Time Indexing (JITI) and Performance Tuning

  • Very little practical documentation on interpreting statistics/0, profiling reports, or JITI heuristics.
  • Indexing choices (e.g., first-argument indexing vs. manual term_hash/2) are crucial but subtle.
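The term_hash/2 technique mentioned above can be sketched as follows (a hypothetical fact table; note term_hash/2 only hashes ground terms):

```prolog
% Add a hash as the first argument so cheap first-argument
% indexing applies to an otherwise deeply structured key.
assert_fact(Key, Value) :-
    term_hash(Key, Hash),
    assertz(fact(Hash, Key, Value)).

lookup_fact(Key, Value) :-
    term_hash(Key, Hash),
    fact(Hash, Key, Value).

% ?- jiti_list.   % inspect which just-in-time indexes were built
```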

:magnifying_glass_tilted_left: 9. Term Expansion and Macros

  • term_expansion/2 and goal_expansion/2 offer compile-time DSLs but are brittle across module boundaries.
  • Used for meta-programming, but rarely explained in idiomatic SWI-Prolog terms.
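A small example of the compile-time rewriting involved (hypothetical data; the expansion must be visible before the clauses it is meant to rewrite are loaded):

```prolog
% Rewrite capital(Country-City) facts into capital_of(City, Country)
% at load time.
:- multifile user:term_expansion/2.

user:term_expansion(capital(Country-City),
                    capital_of(City, Country)).

capital(france-paris).
capital(japan-tokyo).

% After loading:  ?- capital_of(paris, Country).
```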

:scroll: 10. Documentation Tools (pldoc)

  • /** */ comments and structured tags are flexible, but advanced features like documenting operators, foreign predicates, and generating HTML sites aren’t well-covered.
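For reference, the basic structured-comment shape looks like this (a hypothetical predicate):

```prolog
%!  max_of(+List:list(number), -Max:number) is semidet.
%
%   True when Max is the largest number in List.
%   Fails for the empty list.
%
%   @arg List a non-empty list of numbers
%   @see  max_list/2

max_of([H|T], Max) :-
    foldl([X,A,B]>>(B is max(X, A)), T, H, Max).
```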


Consider these three tools for use in creating better prompts

The three tools are, AFAIK, still proprietary to hex.tech, so one would have to recreate them; however, they do not appear to be so complicated that they could not be recreated.


Funnel: decomposes each eval from a binary outcome of pass/fail into a series of cascading steps, each with its own pass/fail criteria.

Flux: Flux is our quantitative measure of movement through the funnel. We look at flux both in aggregate, to quantify the net outcome of a treatment on our funnel, and broken out by stage, to see how evals are transitioning from stage to stage.

Orbit: The orbit chart visualizes individual evals as they move through “orbits” representing the funnel, with earlier stages closer to the center. It’s an extremely information-dense view of an experimental result.


Images (screenshots in the original blog post): Funnel, Flux, Orbit



Getting free tokens for use with the OpenAI API (which is not ChatGPT) to work on creating better prompts. (The noted offer ends tomorrow, but OpenAI has extended it a few times already.)



I would also consider adding

  • s(CASP): Goal directed Constraint Answer Set Programming. (SWISH page)
  • Persistence: library(persistency): Provide persistent dynamic predicates
  • Multithreading code: SWI-Prolog code is often single-threaded; the assistant should be able to make code multithreaded when advantageous. (Building a port scanner)
  • Converting code from other programming languages.
  • Identify OS background tasks that might slow down processing, e.g. Microsoft Defender on Windows (ref)
  • Bug hunting toolbox notes many techniques to help with bug hunting.
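On the persistence point, library(persistency) usage is compact enough to sketch here (the file name and predicate names are made up; persistent/1 generates assert_user_score/2 etc.):

```prolog
:- use_module(library(persistency)).

:- persistent
        user_score(name:atom, score:integer).

:- initialization(db_attach('scores.journal', [])).

% The persistent/1 declaration generates assert_user_score/2,
% retract_user_score/2 and the user_score/2 accessor; asserted
% facts are journalled to scores.journal and survive restarts.
record_score(Name, Score) :-
    assert_user_score(Name, Score).
```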

Since ChatGPT conversations can be shared, I am trying this with the free version of ChatGPT to see whether others can also access it with a different free ChatGPT account.

Note: The above link was not created with the option to make the link discoverable. One must have the link to be able to access the shared ChatGPT conversation.



TL;DR

The reason for this test is that sharing LLM conversations has been hit or miss, depending on whether one has a paid or free account with the various LLM companies (Google, Microsoft, OpenAI, Anthropic, etc.), and on how sharing changes over time. Also, if the creator of the share deletes the conversation from their history, the share becomes invalid.

When troubleshooting others’ prompts, getting a shared link to the conversation is often more useful, as the conversation can be continued from the linked version. As is often the case with questions on forums, critical information from the conversation is left out of the post; seeing the actual conversation is more productive. Also, trying to recreate prompts with all their nuances in a forum post is hard and error-prone.

This does require reviewed questions and responses, no?

That requires using the API (if I read correctly). It would be nice if we could do so, as it would allow integrating the GPT with the web page and letting anyone use it for free. The costs are prohibitively high, though. With free tokens this might become manageable, but if it is a program that can end any day, it is not very interesting. Privacy is another issue: we’d have to be very explicit that this free service comes in exchange for your privacy.

I’m pretty convinced that the SWI-Prolog assistant is useful, already as it is today, and it will only get better. I also think we need it to compete with more popular languages. How to make it available and how to improve it are mostly open questions. One thing is sure: even if we do nothing it will improve :slight_smile:

Yes.

Note, I have not used these tools, just read the blog entry. The takeaway for me was that this moves the idea of creating a prompt and knowing whether it worked based on evals, toward getting more specific feedback on which part of a prompt is working or failing.

I think of evals as similar to test cases but for AI prompts.

Correct.

Since evals are like test cases, one needs to rerun them or automate running them. Trying that with ChatGPT alone is not practical, hence the mention of the API to assist in creating prompts.

As models change or are removed, current prompts might degrade in effectiveness or just fail, and so evals become useful.

OpenAI also has a playground (which requires an OpenAI API account) to assist with creating prompts.

The only way I know of to have the SWI-Prolog web site with an integrated OpenAI chatbot would be via the API (two options):

  • a user provides their own API key
  • you provide the API key and somehow pay the bill.

This is how Discourse does it with their AI bot: the semantic search is provided for free, as the cost is low enough for them to absorb, but for other features one has to provide an API key.

You could add a link to the ChatGPT GPT (assistant, as you call it) in the documentation menu, but how often do users actually use and/or know to look for such?


Discourse semantic search using AI is accessed by starting a search, then selecting advanced search, which brings up the option.



Without the proper income to offset the API cost, I would agree; that is why bring-your-own-key (BYOK) is quite popular. Prices for API use have been coming down over the years.

True.

I would not use free tokens for anything other than development work.


Thanks. As a start I published the assistant, so anyone can now find it in Explore GPTs. From here, it seems we have three routes:

  • Simply add a link to the website.
  • Embed on a page using an <iframe>.
  • Use BYOK

If anyone feels inspired to make it better, please do. I’m happy to support by providing material, website integration, etc.


One further option with BYOK is that one could create agents. Many of the LLMs support agents in some form, e.g. OpenAI agents, and the LLM companies are embracing a standard, the Model Context Protocol (MCP) (Anthropic, OpenAI, Google, etc.).

With agents one can create workflows, use RAG, test-compile code such as Prolog and feed errors back to the AI to fix before involving the user, etc.

I would suggest you check https://openrouter.ai/: they provide more than 300 models, with about 60 free models and 200 free requests per day on their API. Even the paid models are very cheap. OpenRouter.ai does not serve models itself; it connects to many backend inference providers and switches between them seamlessly, without you having to get API keys for each provider. It lets you choose among many models (some are better than others for specific tasks), and you can control many parameters through the API. Recently they added the ability to send PDFs directly to their API.

This is much better than ChatGPT because you can learn to use open-source models, even if they are less powerful. OpenAI (the ChatGPT company) models are also available if you want them, but OpenRouter.ai prevents vendor lock-in and the problems that arise when OpenAI changes its models and your prompt stops working. In addition, you can easily switch models for different tasks. All in all I think it is a better and more flexible way to use LLMs. The API is OpenAI-compatible, which is pretty much the standard out there.

Here are the API docs; the quick start has a curl example if you click on the shell tab: OpenRouter Quickstart Guide | Developer Documentation | OpenRouter | Documentation.
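For a Prolog-side sketch of the same call (a hypothetical helper; the endpoint and JSON shape are taken from the OpenRouter quick start, and the key is read from an environment variable):

```prolog
:- use_module(library(http/http_open)).
:- use_module(library(http/http_json)).   % registers json(_) post data
:- use_module(library(http/json)).

% openrouter_chat(+Model, +Prompt, -Answer) is a made-up helper name.
openrouter_chat(Model, Prompt, Answer) :-
    getenv('OPENROUTER_API_KEY', Key),
    atom_concat('Bearer ', Key, Auth),
    http_open('https://openrouter.ai/api/v1/chat/completions',
              In,
              [ method(post),
                request_header('Authorization'=Auth),
                post(json(_{model: Model,
                            messages: [_{role: user, content: Prompt}]}))
              ]),
    json_read_dict(In, Reply),
    close(In),
    [Choice|_] = Reply.choices,
    Answer = Choice.message.content.
```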

The best open source models I have found are:

Best Coding open source models (non-“reasoning”)

  • paid deepseek/deepseek-chat-v3-0324 $1.10 per million output tokens
    ** Sometimes I switch to the paid version if I have to input more than 4K tokens and expect more than 4K tokens out, because some of the free providers are not reliable for longer contexts.
    ** Avoid the Targon provider for this model, as it is unreliable most of the time: add {"provider": { "ignore": ["Targon"] } } to the JSON API call.

  • free deepseek/deepseek-chat-v3-0324:free
    ** Avoid the Targon provider for this model, as it is unreliable most of the time: add {"provider": { "ignore": ["Targon"] } } to the JSON API call.
    ** Useful for small input/output (4K tokens in/out); it may work up to 8K, and providers may solve this in the future.

  • the qwen-2.5-coder models are all good, free and paid (new qwen3 models came out today, but I have not used them)

Best “reasoning” open source models:

Workflow

My usual workflow for coding is like this:

  1. Use a reasoning model to brainstorm requirements: I use deepseek R1 (free or paid) for this and get a list of requirements from it (e.g. “I want a command line tool that does such and such; please help me work out the requirements for it”). I usually ask questions and ask it to tell me about any requirements it thinks I forgot or should add.
  2. Use deepseek R1 again and ask it to produce a PRD (product requirements document), pasting in the list of requirements from step 1.
  3. Use deepseek-v3 and ask it to code a program that implements the PRD (pasting the PRD from step 2), along with a complete unit test.
  4. Manually fix the output of step 3, using the unit test from step 3 as I fix it.

This has worked very well for me.

Local models on your machine

I use the qwen models when I have to run a model locally on my machine for security reasons. To run locally on my machine I use llama.cpp along with llama-swap. The qwen models are smaller, so I can fit them on my machine; they are much slower and not as high quality as the deepseek models, but good enough.

Code completion helper

I also use llama.vim with a tiny local model (qwen 1.5b) for code completion in vim/neovim, and it works really nicely.


After fixing the trivial bug (nb_set_to_list bug? · Issue #1364 · SWI-Prolog/swipl-devel · GitHub), I thought to ask the SWI-Prolog assistant. After entering the bug report, it first created a nice reformulation of the bug report. Then I explicitly asked it to fix library(nb_set) for this and … it did! OK, I have no clue why this bug happened; it looks like a plain and trivial typo. But still, it found it easily.


Care to share the conversation so that we can see what you did?

Also, others can learn from your prompt(s) and possibly give some feedback.

On the OpenAI Discourse forum they typically host some topics where users can share images, what they are working on, prompts, etc. That could be something we could do here.

Enjoy. I’m a bit embarrassed by this trivial bug :frowning:


Did you ask ChatGPT before or after you had committed the fix to GitHub?
It’s interesting that it reformatted KIndex1 is KIndex+1 as KIndex1 is KIndex + 1.

After, but briefly (10 minutes?). Considering it came up with the erroneous +2, I assume it picked up the old code. Not 100% proof, of course. Next time, when I think it makes sense, I’ll start with ChatGPT.


The shared conversation you posted includes a list of sources at the end (shown in a screen capture; the actual list was longer than the capture). I do not find any that are for GitHub; however, it could have picked up some code from one or more sources and/or hallucinated the code.

Once, when passing in a file about magma in Python, the AI could not find the file and hallucinated the entire Python file. Others have also reported entire research papers being created as PDFs, but on checking the people, facts, etc. in the paper, it was obviously a hallucination.

When I am working on code fixes, I typically turn off access to the web, GitHub, and such unless I specifically want that, and then include only the code needed. :grinning_face:

Another data point: I wanted to write a Unix-domain server/client, so I asked the assistant. The first answer was not good, mainly missing the correct names and arguments of the predicates. So I simply said:

Could you please read through library(socket) and refine your answer?

After this, the answer is pretty good :slight_smile:
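For comparison, a minimal sketch of what such a Unix-domain pair can look like with library(socket); this is an untested sketch, with error handling and cleanup of the socket file omitted:

```prolog
:- use_module(library(socket)).

% Server: accept one client, echo one term back.
serve(Path) :-
    unix_domain_socket(Socket),
    tcp_bind(Socket, Path),          % Path is a file name
    tcp_listen(Socket, 5),
    tcp_accept(Socket, Client, _Peer),
    tcp_open_socket(Client, In, Out),
    read_term(In, Term, []),
    format(Out, "echo(~q).~n", [Term]),
    flush_output(Out),
    close(In),
    close(Out).

% Client: send a term, read the reply.
ask(Path, Term, Reply) :-
    unix_domain_socket(Socket),
    tcp_connect(Socket, Path),
    tcp_open_socket(Socket, In, Out),
    format(Out, "~q.~n", [Term]),
    flush_output(Out),
    read_term(In, Reply, []),
    close(In),
    close(Out).
```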


When having an LLM create code, also consider asking for test cases at the same time, often test cases for each predicate.

As you noted with the socket library, it helps to tell the LLM to

create test cases in the style of SWI-Prolog.

For some Python code that I asked to be converted to SWI-Prolog, running the included test cases showed one failure: the converted SWI-Prolog code had a bug. After noting the predicate, the input, and the incorrect result, the LLM was able to correct the code.
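“Test cases in the style of SWI-Prolog” means library(plunit); the basic shape looks like this (testing a built-in here just to show the form):

```prolog
:- use_module(library(plunit)).

:- begin_tests(reverse_examples).

test(empty) :-
    reverse([], []).

test(three, true(R == [c,b,a])) :-
    reverse([a,b,c], R).

:- end_tests(reverse_examples).

% Run with:  ?- run_tests(reverse_examples).
```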

To have more options for creating LLM assistants related to SWI-Prolog, I created one using Anthropic Claude. While Claude does not have an exact equivalent of a ChatGPT GPT, the closest equivalent is a Claude project.

Here is a share:

https://claude.ai/share/22425521-5ecb-4f7f-afcf-a14c35d00c3a

I am interested to know whether others can see the conversation, and what type of Claude account was active when it was clicked, e.g.:

  • None
  • Free (No login)
  • Free (Login with account)
  • Pro
  • Max
  • Team
  • Enterprise

Notes:

  • While a ChatGPT GPT shared conversation can be continued with further prompts using a free, logged-in ChatGPT account, these Claude shares cannot be continued, regardless of whether the account is free or Pro; I am not sure about other Anthropic account types.
  • To create such shares, one must have a paid account, whether for ChatGPT or Claude projects.
  • OpenAI has GPT Store to find GPTs, Claude has no equivalent AFAIK.
  • To access the free version of Claude without setting up an account to log in, use Google, search for Claude, then click the sponsored link. This appears to be an aggregation site for multiple LLMs, but it is still free with no login.


The Claude project instructions for this share

You are an SWI-Prolog expert programmer.
* Use only SWI-Prolog code instead of generic code.
* Verify each predicate is from the reference manual or listed in the output.
* SWI-Prolog reference manual - https://www.swi-prolog.org/pldoc/doc_for?object=manual
* Add comments in code.
* Optimize code to be time efficient.
* If asked to create unit test, create them in the style of SWI-Prolog - https://www.swi-prolog.org/pldoc/doc_for?object=section(%27packages/plunit.html%27)

Here is a Claude share similar to the Unix socket share.

ChatGPT GPT share:

Claude project share:

https://claude.ai/chat/098b772b-4f2b-407d-9c29-a9652756c054

As I do not use SWI-Prolog on Unix or create Unix sockets often enough, I can’t really be sure which parts of the SWI-Prolog code are correct and/or better.