Prolog/Terminus/Pengines

FYI, I’ve posted to the TerminusDB forums (very low-traffic, it seems).

TerminusDB is quite nice, and grafting Pengines on top of it would give the best API possible (how can your $language concisely support unification & backtracking if it isn’t Prolog?).

If this interests you, or you’d be willing to help make it happen, perhaps make some noise there?


This needs more explanation. Tell why you think Prolog, Pengines, and Terminus are better in combination than other possibilities. Is there something about Terminus that makes Prologgy behavior easier over the net?

Simple - Prolog is the “heart” of Terminus (Terminus is - AFAIK - SWI-Prolog + prolog code + Rust module for the data-store).

Integrating Pengines should be, therefore, pretty straight-forward.

Pengines extend unification & backtracking across process boundaries, and not [de]serializing to JSON on both sides should be a gain.

Add Prolog in the browser (read the post already, it’s pretty short :wink: ), and you can write your whole app in Prolog:

  • Prolog terms => HTML & CSS
  • Prolog instead of JavaScript
  • Exchange prolog terms as messages
  • Prolog as the query language
  • possibly Prolog as the reasoner
  • perhaps use Pengines / WebProlog to distribute the app into as many layers as necessary
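
As a rough illustration of the first bullet: SWI-Prolog’s library(http/html_write) already renders Prolog terms as HTML. The sketch below only shows the idea; greeting_html/2 is a made-up helper name, not part of TerminusDB or Pengines.

```prolog
:- use_module(library(http/html_write)).   % html//1, print_html/1

% greeting_html(+Name, -Html)
% Hypothetical helper, for illustration only: describe a page fragment
% as a Prolog term and render it to a string.
greeting_html(Name, Html) :-
    phrase(html(div(class(greeting),
                    [ 'Hello, ', b(Name) ])),
           Tokens),
    with_output_to(string(Html), print_html(Tokens)).
```

Calling `greeting_html('World', Html)` should bind Html to a string holding a div with class greeting that wraps the text and a bold element.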

I had the same mind-blown revelation about Pengines recently. It could easily be the next universal API. I think it is much nicer to use than GraphQL, for example.
This led me to write guregu/worker-prolog (serverless Prolog for Cloudflare Workers, on GitHub), which implements the Pengines API using Cloudflare’s serverless infrastructure and Tau Prolog. It’s not production-ready by any means, but a decent proof of concept. I also made a Pengines client for Go: guregu/pengine on GitHub (a pengines (SWI Prolog) client for Go). I think it can still be useful without the client language supporting unification/backtracking: answers are expressed as an iterator over variable substitution maps instead.
I did run into some performance issues with Tau, so now I’m porting Trealla Prolog to WASM. Sometime in the near future I should be able to hook it up to worker-prolog and get a nice fast serverless persistent Prolog interpreter. Why not SWI? There’s a 1MB limit for bundle sizes in CF workers :slight_smile:
Of course, SWI works great if you’re in less constrained environments.

If this interests you, check out Web-Prolog/swi-web-prolog on GitHub, a proof-of-concept SWI-Prolog implementation of Web Prolog, which goes deep into the concepts. I think this would play well with something like Lunatic.

If you peek around you might find references to SWI-Prolog light or similar and that might get you under the 1 MB limit. :slightly_smiling_face:

That would be great. I believe a slim release for SWIPL WASM could help spur adoption. I am also a happy user of SWI. My quest is to get as many Prologs interoperating as possible, a vision similar to the original post here.

Another difference between the Trealla and SWI ports is that SWI uses Emscripten and Trealla uses WASI. This gives SWI way more APIs to work with, but the serverless ecosystem is gravitating towards WASI. That being said, WASI is unstable and much more painful to port against than Emscripten. It’s missing a lot of common headers (termios.h comes to mind).
Anyway, didn’t mean to derail this with WASM implementation talk. I need to keep an eye on the other thread here.

Normally, if you rent a root server in the cloud, you can also install arbitrary binaries. You get a virtualized operating system with some limited CPU and memory anyway.

That Cloudflare offers a variant of servers in the cloud that promotes nodejs is a kind of bandwagon, but why should you jump on it? One reason is that nodejs works lock free, and the corresponding web server doesn’t require multi-threading. So you would possibly squeeze a single-threaded version of SWI-Prolog into the 1 MB to profit from this model of execution. You can then scale by ordering more CPUs and starting more workers (sic!).

Edit 03.10.2022
Disclaimer: I haven’t done these things yet, I’m just reading the brochures:

https://developers.cloudflare.com/workers/learning/how-workers-works/

Isolates are resilient and continuously available for the duration of a request.

But I guess you could offer the same with a single-threaded SWI-Prolog on binary platforms by changing the SWI-Prolog HTTP server into something that works with an event queue. (Maybe it’s even a multi-threaded SWI-Prolog?) But for the end user it would look single-threaded, and it should also feel single-threaded, i.e. have the performance of single-threaded execution, so many of the with_mutex/2 calls would not be needed.

The Prolog dynamic database could also use more efficient data structures than the current lock-free concurrent ones, although their overhead seems to be small. Anything that accounts for multi-threading can be thrown overboard; you can shed ballast.


Porting to WASI is probably not that hard. SWI-Prolog does not require a lot of OS facilities. Emscripten implements a large part of the POSIX API, but much of it consists of dummies. As a result, CMake configuration detects the availability of the APIs and enables the functionality that is provided by them, but to no avail, as it is backed by dummy functions :frowning: It would have been more comfortable if these functions just didn’t exist.

You can probably get the WASM below 1 MB.


WASI takes a better approach and leaves unimplemented functions undefined (mostly), so that should help with a future SWI port!

That’s an interesting point about locks and scheduling. I have not dug too deep into the CF workers internals (they just open sourced it) but I think you are right about it using V8 Isolates. I am not sure how the other serverless runtimes work, but I’d assume that Lambda uses Firecracker which is more of a lightweight virtualization thing than an isolate kind of thing. There is a Lambda build for SWI at bkrn/prolamb that I would like to play with sometime.

Personally, the most compelling use case for serverless is that it scales to zero. I have a lot of small projects that I’d like to keep online but I don’t want to spend time/money maintaining their servers, especially if nobody uses them. With serverless you can just upload your app and it should (theoretically) just work “forever” (until your provider breaks their API :slightly_smiling_face:), and cost you $0 if nobody accesses it. The downside is that if too many people use your service it can scale too much and cost quite a lot!
I also think that new concepts such as Durable Objects are quite compelling. worker-prolog uses them to get a persistent transactional knowledgebase with little effort. However, it’s super expensive! I wouldn’t really recommend using them until they are more mature, but it’s fun to play with.

With the recent wave of effort to bring many Prologs to WASM, I think we have a really good chance of positioning Prolog as a “cloud-native” language. There are not too many languages with a good WASM story yet, so it’s a great time to get ahead of the pack.


I am just learning web tech. I gather due to latency you often want to batch things. I wouldn’t expect Prolog variables to be transferred, but a query could be passed essentially as Prolog source, evaluated as find_all_up_to_n(N, X,Q,Xs). Such a query would not support attributed variables but would allow a continuation call to get more results, and would allow using Xs to construct additional queries, but again with all the limitations inherent to findall.
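
A minimal sketch of such a predicate in SWI-Prolog, assuming the batch is simply the first N solutions. find_all_up_to_n/4 is the hypothetical name used above, not a library predicate; limit/2 (library(solution_sequences)) and findnsols/4 are existing SWI-Prolog predicates.

```prolog
:- use_module(library(solution_sequences)).   % limit/2

:- meta_predicate find_all_up_to_n(+, ?, 0, -).

% find_all_up_to_n(+N, ?Template, :Goal, -Xs)
% Hypothetical helper: collect at most N solutions of Goal.
% As with findall/3, the results are copies, so bindings and
% attributes of the original variables are not preserved.
find_all_up_to_n(N, Template, Goal, Xs) :-
    findall(Template, limit(N, Goal), Xs).
```

For example, `find_all_up_to_n(3, X, member(X, [a,b,c,d,e]), Xs)` gives `Xs = [a,b,c]`. For the “continuation call to get more results” part, the built-in findnsols/4 may already be close: `findnsols(2, X, member(X, [a,b,c,d,e]), Xs)` returns the next batch each time it is backtracked into.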


Pengines can do a lot of these things. Communication between Prolog and other languages remains hard and in general one should try to ensure that none of the features that have little meaning outside Prolog reach the client. That indeed concerns logical variables, constraints, continuations and probably a lot more.

The Pengine implementation provides an extensible notion of output language/format, so you can define your own output format and provide a library that translates the raw Prolog answer into this format. With a Prolog client you can do a little more, especially if the client runs the same Prolog system. It provides a fairly simple RPC like functionality.

And yes, Pengines can be remote as well as local. Local Pengines use a thread and indeed a message queue to communicate. I have not seen much practical use for local Pengines. @torbjorn.lager had Erlang in mind and later came up with the WebProlog design that covers this idea better.

For @abaljeu, the HTTP protocol provides a chunk parameter that defines how many answers should be collected before they are transmitted to the client. SWISH sets this (of course) to one. The demo shell script (see swish/client in the SWI-Prolog/swish repository on GitHub) sets this to infinite to get all answers as one stream. When using chunking we can even redefine the size of the next chunk. That is what SWISH does if you ask for the next 1/10/100 answers. A smart client could play around with this to dynamically update the chunk size to keep a balance between getting answers early and reducing latency effects. Note that the answer holds the amount of CPU time spent, so you have a clue about the latency impact.
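
From a Prolog client, the same idea can be sketched with pengine_rpc/3 and its chunk option. This is only a sketch: remote_member/3 is a made-up wrapper, and the server URL passed to it is an assumption; any Pengines server that allows member/2 in its sandbox should do.

```prolog
:- use_module(library(pengines)).   % pengine_rpc/3

% remote_member(+Server, +ChunkSize, -X)
% Hypothetical wrapper, for illustration only: run member/2 on the
% remote server and enumerate X on backtracking. Answers travel over
% the network in batches of ChunkSize, but the caller still sees
% them one at a time.
remote_member(Server, ChunkSize, X) :-
    pengine_rpc(Server,
                member(X, [a,b,c,d,e]),
                [ chunk(ChunkSize) ]).
```

With a server running at, say, `http://localhost:3030` (an assumed address), `remote_member('http://localhost:3030', 2, X)` would need fewer round trips than the same call with a chunk size of 1.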


Barely. You just need a website that serves a directory structure holding all Prolog files. Their use_module/1 directives must of course use relative paths. In theory we could use e.g.

user:file_search_path(library, 'https://...').

This doesn’t seem to work as-is. Anyway, it might be a dubious idea, as the search process is quite expensive: we need to probe all possible locations to which this may resolve.

pack_install/1 should work for installing Prolog-only packs on the emulated file system. It would be a good idea to provide a proper default location for the packs, though. Foreign packs won’t work as we do not provide dynamic linking. We could, but the Emscripten docs suggest it is typically not a good idea, notably because a lot of size optimization is lost.

But, this is not very related to Pengines. The idea behind Pengines in the context of TerminusDB is to bring the computation to the data. We already have a topic and wiki page about the WASM version.