Hello everybody,
Some times ago, I’ve stumbled on the QOI image format.
The interesting thing about this image format is that you can read the specification in about 5 min and write a decoder in about the same time ^^
It is also a format that is optimized to be very fast while getting a quite ok compression.
My first thought after reading this was to write a decoder in prolog and so I wanted to share my attempt here: qoi-prolog/qoi.pl at 3a29b9d4dff8a300e14d359a7bc00a9b22152f53 · kwon-young/qoi-prolog · GitHub
One thing to note is that I have “tried” to optimize the decoding for speed while keeping readable code, so here are my design decisions:
- use a DCG for readable code
- pack the decoded pixels in a large
data/N
compound of arityN
pixels, where each element is a compoundcolor(R, G, B, A)
. - use tail calls for both the tail call optimisation and code readability
- the qoi format requires the use of a mutable array of 64 colors. I use a compound
colors/64
together withsetarg/3
to mutate the array through the decoding
In term of speed, I am ~100x slower than the reference implementation in C when run with swipl -O
optimized flag… which is not great but not too bad either ?
After profiling the code, I don’t really know how to optimize it further…
I have included some test images in the github repo:
?- time(qoi_decode(_, "qoi_test_images/IMGP5493_seamless_2.qoi")).
% 6,245,272 inferences, 0.859 CPU in 0.870 seconds (99% CPU, 7267876 Lips)
true.
So, around a 0.8 seconds to decode a 1024x1024
image.
I would be very interested if anyone can come up with more tricks to speed up the decoding or propose a completely new approach ?
Here are some questions I have:
- I have heard that in C, you can use tail call optimisation to completly remove the overhead of calling a function, does it hold true in prolog too when tail calling a predicate ?
- Does the code size of a predicate matter ? Again, I heard that it is simpler to optimize simpler functions than bigger ones in C.
- Should I use another representation to pack my pixels ?
Any comments are welcome
PS: here is a very interesting blog related to parsing protobuf using tail calls: Parsing Protobuf at 2+GB/s: How I Learned To Love Tail Calls in C