Hello,
I’m back again with another completly pointless endeavour of implementing a pure DCG of the Quite Ok Image format: qoi-prolog/qoi_pure.pl at 4a82f1b5493522fb6fc133638a00b98eed39ce31 · kwon-young/qoi-prolog · GitHub
For context, I have previously made a post about a prolog decoder/encoder while trying to be as fast as possible and found that I was around 100 times slower than the reference C implementation.
Then, I couldn’t stop asking myself:
- Is it possible to write a fully logically pure implementation of the format ?
- How slow would it be ?
By the way, here is my (totally made up on the spot) definition of a “pure implementation” :
the exact same prolog code can be used with any instantiation pattern to do the same logical task in forward, backward, generative or test mode.
As helper library, I use clpfd (of course) and a reified unification predicate =/3
that I ripped out from the library reif
.
Here are some thoughts and comments on the experience of writing this:
- Writing pure prolog code is much much more complicated than writing the imperative style decoder/encoder. If I did not write the encoder/decoder separatly before, I would not have been able to write the pure version. I think the saying
the whole is greater than the sum of its part
is true here and applies to the notion of complexity ^^ - This endeavour shows that it is possible to write a logical pure version of this type of programs, although I think it also shows why we should not do it
- The pure implementation is roughly 1000 time slower than the non pure prolog implementation, and in turns means that it is 100_000 time slower than the reference C implementation ! I measured a whopping 5 kilopixel/sec decoding speed and 2 kilopixel/sec encoding speed
- As expected, from naive profiling, clpfd various arithmetic and reifying predicates are the bottleneck of the grammar.
- The code can be used to decode/encode but also to generate an image (leave the raw data and encoded bytes uninstantiated) and of course test an image and its encoded version, except you only have to test a single implementation !
- One of the great factor that increased complexity was to not leave any superfluous choicepoints in encoding/decoding modes since leaving a choicepoint per pixel is essentially synonyms with a memory leak. It will also breaks tail call optimisation which also means stacks will overflow, even for really small images.
Here is one question I have in mind now:
We often says that pure prolog programs is like a logical specification of a problem with citations like “Algorithm = logic + control”.
Would it be possible to take such pure prolog implementation and derive encoder/decoder programs that optimise for speed depending of instantiation pattern of the query ?
So, something like parser generators (bison, flex, yacc…) but for prolog code ? maybe even do it on the fly ?
I know that clpfd does some compile time expansion to optimise a bit the arithmetic when input variables are ground, but I’m talking something like runtime expansion based on the instantiation pattern of the query ?