Prolog's performance on "Billion Nested Loop Iteration"

I thought it was Clocksin/Mellish who originally implemented the FFT in our beloved language?

Sorry, I know I’m late to the party but

uses 2 features from procedural languages that prolog completely lacks:

  • loops for x in y

  • inplace assignment +=

Reminds me of this talk by Richard Feldman “Outperforming Imperative with Pure Functional Languages“ where he talks about optimizing quicksort in his language Roc so that it’s competitive with mutable languages like JS, Java and even Go, even though Roc is pure, functional, immutable, and obviously lacking in-place mutation. Spoiler alert, he did beat Go (according to his possibly cherrypicked benchmarks).

@abaljeu mentioned prolog could still be competitive if a complex optimization subsystem was implemented? Feldman’s solution at the end seemed quite elegant to me.

I’m curious if any more work has been done with this or how confident are we that such an optimization could be implemented without using mutation?

1 Like

I didn’t say this. I don’t know myself what’s possible with SWI. All I said was it currently is very slow compared to other languages.

I will point out that Sicstus Prolog had benchmarks that it was much faster than SWI, at the cost of being less flexible.

Apologies if that was an unfair paraphrasing. I’ll check out the Sicstus solution.

You forgot to mention that his Roc language is compiled to binary ^^
I have heard @peter.ludemann talk about a prolog compiler named aquarius.
I’m wondering if it could tackle this kind of task ?
I would love to know more about the history or annecdotes around this !

1 Like

Unless I misunderstood your question: there is a compiled Prolog: GNU-Prolog. It even has a somewhat clean interface to global variables and arrays. For example, to create an array with 10 elements initialized to 0, increase one element, then print the array:

$ gprolog
GNU Prolog 1.5.0 (64 bits)
Compiled Jul  8 2021, 09:35:47 with clang
Copyright (C) 1999-2024 Daniel Diaz

| ?- g_assign(a, g_array(10)),
     g_inc(a(3), _),
     g_read(a, X).

X = g_array([0,0,0,1,0,0,0,0,0,0])

yes

My naive attempt however seems to show that it is very slow. I assume I am using it incorrectly. Here is the code, a mechanical translation of this Java implementation that you linked:

:- initialization(run).

run :-
    loops(40, R),
    real_time(T),
    format('Result: ~w in ~w milliseconds~n', [R, T]).

loops(U, Result) :-
    random(0, 10000, R),
    g_assign(a, g_array(10000)),
    forall(between(0, 9999, I),
        ( forall(between(0, 9999, J),
            (   g_read(a(I), Old),
                New is Old + J rem U,
                g_assign(a(I), New)
            )))),
    g_read(a(R), Result).

Here is how I compiled and ran it:

$ gplc --fast-math --no-top-level gponebil.pl
$ ./gponebil 
Result: 195000 in 50675 milliseconds