Some questions about wasm

I have some questions about the effort to bring swim prolog to WebAssembly.

(I was not smart enough to figure out how to contribute to the wiki so this is a separate topic)

I assume that SwiProlog is using wasmgc. Is that correct?

I can see some negative about doing that; but I am curious about it.

IIRC, Prolog itself does not have the concept of coroutining, via some kind of stack switching mechanism. If it does, are you doing a CPS transform?

I ask because I am part of the WebAssembly standards group.

More generally, I would like to know what kind of issues arose in bringing swipe to wasm and what kind of features would be most helpful.

Great! Let me try to answer some …

No. SWI-Prolog does its own GC and uses malloc()/realloc()/free() to manage system memory. Note that SWI-Prolog has three garbage collectors:

  • Traditional stack GC. The stacks are simply large arrays. They are compacted by GC and optionally enlarged or reduced using realloc(), in which case all references are updated (that is called stack shifting).
  • Atom GC. Atoms are individually malloc()ed and accessed as indexes in a dynamic array. Atom GC invalidates entries in that array and free()s the linked atom. New atoms reuse the index. Normally Atom-GC runs in the thread gc, but in a single threaded environment it is of course the main thread.
  • Clause GC deals with erased clauses. These are merely marked on retract/erase and wiped from the chain by Clause GC. The threading works the same as atom GC.

Coroutines in the WASM version are implemented using engines. An engine is basically a Prolog thread, but not connected to an OS thread. Otherwise it is a set of Prolog stacks and a VM instantance. Engines can yield, which causes the VM loop (PL_next_solution()) to return. The engine can then be detached and a new engine can be attached. An engine that has yielded can simply be resumed by calling PL_next_solution() again. In the WASM version, engines yield to wait for some browser event (await/2) and they auto yield after N inferences.

Back to this.

  • The engine yield was implemented in an experiment to support Rust coroutines. That was not that hard, except that we cannot yield if there are intermediate C frames, i.e., Prolog calling C, calling Prolog. While the outermost can yield, the innermost cannot. If that was possible, integrating the debugger in the browser version would have been trivial. Now I had to extend the yield mechanism to work from all trace event ports, which was far from trivial. And still, mutual Prolog/C calls are used at various places in SWI-Prolog and this does not yet work as it should in the WASM version. Async C code would help, but as is, it seems to blow up the system and make it a lot slower.
  • Full 64-bit support. I have not checked the recent status … The problem is less urgent since I moved to 64-bit Prolog ā€œcellsā€ in the 32-bit version. Still, the days you’d like to give more than the 32-bit WASM address space to Prolog is approaching :slight_smile:
  • I need a way for non-local transfer of control in case of some fatal error. I tried structured exception handling, but this caused a major slowdown. I’m now using setjmp()/longjpm() using a small trampoline function. That works fairly well. Before, that was integrated into the VM main loop, but that too caused a major slowdown.

All in all, it now works quite well with Emscripten. It is a pity that Emscripten is based on Clang rather than GCC as GCC does a way better job in optimizing SWI-Prolog :frowning: Debugging is no fun either :frowning: An issue with Emscripten is that it provides almost all POSIX functions, but quite a few are stubs that do nothing. That breaks CMake based feature detection. It is nice for a quick port, but eventually you need quite a few if(EMSCRIPTEN) :frowning:

These issues are more related to Emscripten than the WASM standard, I assume.

Besides Jan’s excellent answer, I’d like to add some comments based on my experience as a developer of Ciao Prolog. Most Prolog engines and virtual machines are written in C and use many implementation tricks (such as tagged pointers, computed goto, longjmp, etc.) to achieve good performance. The semantics of Prolog are somewhat dense and cannot be translated one-to-one into imperative code (see ā€œunificationā€), so compiling to bytecode (WAM-style) is often a very good compromise. Note also that memory management in Prolog is somewhat special due to backtracking. We did some early work in Ciao compiling Prolog directly to JS (without bytecode), which could be retargeted to WASM and probably use WASMGC, but maintaining a port of the entire existing C runtime and codebase is unrealistic (at least for the Ciao team). Our current approach is to compile the engine via Emscripten and provide some JS wrappers. That is the quick and easy way for C-based Prolog systems, done at least in SWI, Ciao and probably others. Interestingly, the performance of the ā€œdirect to JSā€ and Emscripten versions was similar (using JS objects as terms is much heavier than the compact and specialized data representation implemented by our Prolog engines). Some time ago, I discussed WASMGC with Andy Wingo. My understanding was that it is not easily usable from C (you cannot directly generate or use WASMGC garbage-collected reference types).

1 Like

Thanks for the response.

If I can break down some threads (sic):

SWI-prolog uses a compiler to a byte code machine (a la WAM?) and the byte code engine is written in C.

Some other languages follow this approach; most notably Python. This will of course work, with minimal effort.

However, the biggest issues will revolve around integration with JS and the browser. In particular, passing strings between wasm and js is extremely slow (the string is copied in js space one byte at a time).

In addition, one risks leaving some performance on the table: by compiling directly to wasm. OCAML generates WASM code directly. But the OCAML compiler is set up with multiple different back ends already.

Unfortunately, the wasmgc model does not seem especially suitable for Prolog. The two questions that are top of mind for me are the Bryunooghe-style GC and variable-variable bindings. The latter would need so-called interior pointers.

Go-lang also has a need for interior pointers.

One of my missions is collect use cases to potentially justify investment into supporting interior pointers.

Coroutining: we are currently working on bringing full coroutining to wasm. This should mitigate some of the issues that you report. There will be some limitations: there is no plan to support multi-shot continuations and you will not be able to suspend anything involving JS or browser internal code (e.g., you will not be able to suspend an event callback).

However, there is already something called JSPI which might eliminate the requirement to rely on worker threads. This is also transparent to wasm itself; and works by allowing wasm code that depends on Promises to be suspended.

JSPI looks very interesting, but only if it does not add a considerable slow down in the overall execution.

I am not sure if interior pointers would benefit Prolog, but if they are good for Go, it may be for us. There are two kind of Prolog data: 1) (logical) terms living on the heap; and 2) everything else (blobs, bignums, atoms, foreign objects, streams, code, etc.). For logical terms, you usually want to ship your own runtime and managed memory. But for everything else, a conventional GC may work fine.

1 Like

Seems it can implement e.g., read/1 easily and if the ā€œpromiseā€ ( :slight_smile: ) holds with low overhead. If I read it correctly it does not allow the WASM component to be active in multiple asynchronous flows (I would be surprised if this was feasible).

What I would like to have is way to deal with this: Prolog → C → Prolog and now the inner Prolog wants to yield. That works fine, so now we are back in C. But now I want to continue the yield to the outer Prolog call and be able to resume it. If that can be resolved without a global program transformation that slows down the execution (I’m ok if the involved C function needs a transformation) it would be fairly easy to resolve the remaining issues with the current WASM version of SWI-Prolog.

By prolog->c->prolog, do you mean wasm->host->wasm? Or is this all compiled into wasm (just different source languages).

If it is wasm->wasm->wasm then Jspi (and core stack switching) will work without any transformations. The inner will be able to suspend through all the layers of the sandwich.

If the middle layer is actually host code (operating system or browser code) then it’s much more difficult: it can be engineered via multiple suspensions but fundamentally we are not allowed to suspend host code or JavaScript code.

Sorry for the unclear picture. From WASMs perspective this is just WASM->WASM->WASM as running Prolog code means running PL_next_solution(), which is translated by Emscripten to WASM.

So yes, Jspi will help. At least to some extend. Waiting for a promise using Jspi will not allow me to switch to another Prolog engine, so there is no proper full coroutining. At least, that is what I think now …

Well, the picture is not so hopeless:

  1. You can use JSPI, especially for the cases where coroutining is used to access asynchronous I/O.
  2. JSPI is re-entrant: you can re-enter the wasm engine while suspended on another call.

From an overall engine perspective, the main thing that you will not be able to do is manage the suspended tasks with your own scheduler: you must use the browser’s event queue management. (Actually, this is a fundamental aspect of the web architecture.)

Where it gets less obvious with JSPI is when you want to use it for some other coroutining patterns: the classic ones being the generator and the ā€˜green threads’ patterns. If these are important use cases then you will be better off with so-called core stack switching. This is when concepts such as continuations, suspend and resume are built in to wasm itself. That is coming, but will likely not be widely available until next year. (You can try it today in Chrome but its definitely not shipped.)

The other consideration here (with core stack switching) is that C knows nothing about it. So, tools like Emscripten will not generate the right instructions. (OTOH, Binaryen (which is used by Emscripten) will be aware of it.)

BTW: on the issue of using wasmgc to handle atoms and clause code, there is a technical issue with wasm: you cannot have a reference to a wasmgc object in linear memory. So, you would have to supplement the linear memory with wasm tables. (This is a consequence of the security architecture of wasm.)

1 Like

Thanks. I’m not into the WASM version right now, but I always remember good leads :slight_smile: The overall design is now (if I recall correctly) that you can make calls to Prolog (in your language from JavaScript to WASM, calling PL_next_solution()) in two ways.

  • Run Prolog in the current engine
  • Run Prolog in a given engine. This can either be an explicit engine created before or a temporary engine, i.e., one that is created to run this goal and destroyed on completion.

There are some examples in SWI-Prolog WASM engine demo. The (JavaScript) code is embedded in the HTML.