Array library with a C backend

Hello,

I always wanted to have a multidimentional array library (something like numpy) for prolog.
So I started working on a proof of concept here: GitHub - kwon-young/array: swi-prolog array library

Here is a list of features from the top of my head:

  • Nd Array creation:
    • with an empty buffer
    • with ones or arange (notably these won’t be allocated)
    • from a prolog list
  • reshaping
  • indexing and fancy indexing
  • transpose
  • broadcasting
  • basic math operation: +, -, *, /, sum and matrix multiplication with the operator @
  • all array operation is done using the affectation operator @=, similar to the #= operator in clpfd
  • actual computation is done in C, so should be faster than prolog, but is still much slower than specialized compute library like BLAS

Here is an example code of how you would do a matmul with this library:

?- Size = 1024,
   A @= array{name:a, dtype: float64}.empty([Size, Size], [zero(true)]),
   B @= array{name:b, dtype: float64}.empty([Size, Size], [zero(true)]),
   C @= A @ B,
   realize([C]).

or a more complex chain of operation:

?- C @= array{}.arange(12).reshape([3, 2, 2]).sum([0, 2], false),
   R = C.to_list().
C = ...
R = [27, 39].

My main motivation here is to write everything from scratch and try to leverage prolog specific strength as much as possible along the way. So, here are some of my design decisions:

  • This library implements the strided array abstraction (like numpy), meaning that multidimentional arrays are backed by a flat buffer and iterated through using the shape and stride information.
    • interestingly, all shape and strides computation is done through clpfd, making this a mostly logically pure implementation of the concept of strides. This means that you can do test/generate/search over strides and shapes of a computational graph.
  • The design is mostly inspired from tinygrad where the operations are lazily defined to form a graph. This graph is then compiled in C and executed through the predicate realize/1.
    • Using prolog was a really pleasure when working on the generation of the computational graph. I can just nest dicts inside dicts and it works really well for now.
  • I use swi prolog dicts for representing an array and it works really well for this. It even allows me to have a functional api to define operation on arrays like sum.
  • The implementation features a state-of-the-art static allocation algorithm called minimalloc. I’ve spent weeks trying to implement it perfectly, hence the effort for the documentation.
  • In order to easily generate C code, I have written a god forsaken hacked up C code generator, which I’m both proud and ashamed of…

Unfortunately, this PoC is quite unreadable, full of bugs and extremly slow and completely undocumented. And I’m starting to loose a bit of steam for this.
So if anyone is interested or have question, let me know so that I keep working on this.

In the future, I want to add autograd for doing deep learning and an opencl backend for heterogeneous computing (gpu computing in prolog!). If anyone wants to help reach these goals, let me know, here or in a issue in the github repo.

2 Likes

So, two reasons:

  1. This project is strongly inspired from tinygrad where they have this reimplement everything from scratch philosophy
  2. The minimalloc algorithm was too beautiful, too simple to not implement in pure prolog.

Currently, this project has zero dependencies to install apart swi-prolog and gcc

1 Like

I implemented an exploratory pack(arithmetic_types) a couple of years ago. The objective was to see if standard arithmetic could be extended to user defined types (booleans, lists, strings, etc.) ; one of the types was nd_array implemented as compound terms with functor #. You can index and slice them using square bracket notation and the pack supports a fairly complete list of operations (inverse, transpose, determinant, ineer and outer products, etc.) Some examples from the ReadMe:

?- A is ndarray([[1,2,3],[4,5,6]]).
A = #(#(1, 2, 3), #(4, 5, 6)).

?- A is new(ndarray,[3,3]), A[0,1] is 42, X is A[0:1][0][0:3-1][1].
A = #(#(_11938, 42, _11942), #(_11916, _11918, _11920), #(_11894, _11896, _11898)),
X = 42.

?- T is transpose(ndarray([[[1, 2], [5, 6]], [[3, 4], [7, 8]]])).
T = #(#(#(1, 2), #(3, 4)), #(#(5, 6), #(7, 8))).

?- S is sum(ndarray([1,2,3])).
S = 6.

?- CrossProd is cross(ndarray([1,2]),ndarray([4,5])).
CrossProd = -3.

% Product of a 3x3 matrix and its inverse is the 3x3 identity matrix
?- A is ndarray([[3,0,2],[2,0,-2],[0,1,1]]), I is inverse(A), X is dot(A,I).
A = #(#(3, 0, 2), #(2, 0, -2), #(0, 1, 1)),
I = #(#(1r5, 1r5, 0), #(-1r5, 3r10, 1), #(1r5, -3r10, 0)),
X = #(#(1, 0, 0), #(0, 1, 0), #(0, 0, 1)).

Obviously not going to as fast as a C implementation, but that wasn’t my objective. “Hooking” it into the existing functional arithmetic (see arithmetic_function/1 directive) also inccurs some overhead. But the pack may give you some ideas that you can use in your project.

The other “feature” I was interested in was to convert instantiation errors in numeric expression evaluation into clpBNR constraints when that module was available. Support for array operations on real intervals permits, for example, the use Cramer’s rule to efficiently solve linear systems of equations (see example in ReadMe).

Thank you for your comment.
Of course, I knew about your nd_array library.
It helped me implement the block [] notation.

I also wondered about hooking into standard arithmetic instead of using a special operator @=, but I found it too difficult.
I’ll retry in the future though !

You could try using the arithmetic_types module from the pack and implement your own “type” using the other types in the pack as teaching examples.

Basically you have to use the arithmetic_function/1 directive to “publish” the implementation in your module. To support overloading, the implementation must fail if the arguments are not the expected type - that allows for other implementations of the function on backtracking. The rest of the details are buried in the arithmetic_types module (non-trivial, but I mostly just used the design and code from library(arithmetic)).

You could also build your own standalone implementation using @= and (optionally) wrap it as described above to “hook” the standard functional arithmetic. (I’m happy to participate if that would help.)

It would be useful to me to have a somewhat compatible, independent Prolog implementation of arrays so some agreement on functions/operators for the “meta-language” for array expressions would be beneficial. (I think I largely depended on numpy, but it’s been a while.)

1 Like

This option seems the best to me. I’ll try to do something with this and will ask if I need help.
Since you offered some help, can I ask your review on github if I manage to make a PR ?

So, there is the python standard array api which we could follow ?
I also based my own functions on the numpy api.

Ah after inspecting your type_ndarray.pl, I think I remember why I did not go that route.
It seems I can’t reuse the standard +, -, * and / operator for arrays.
Instead, you used .+, .- etc and I don’t want to loose standard operators.

1 Like

arithmetic_types extends the semantics of the standard functional arithmetic. In doing so I didn’t want to impact normal usage by overloading any of the existing arithmetic functions, so new operators were necessary.

If you’re going to separate standard arithmetic from array arithmetic, e.g., using @=, then this isn’t a concern. But then any wrapper to implement a “type” will need to map operators. Maybe that’s a dealbreaker for you.