Multiple threads and tabling issues

I’ve been trying to convert my single-threaded application to be multi-threaded, and the multi-threaded version isn’t working. Before I go through all the work of building a simple repro case, I want to make sure my assumptions about tabling are correct.

The pattern of usage in my app is that a single user action causes a set of queries to run until one of them succeeds. Previously, I ran them serially, but I’m hoping that I can just run them concurrently and find the one that succeeds faster.

The queries can modify data, so I run them in a transaction and use a mutex to ensure that only one of them succeeds (and potentially commits data changes). If one has already succeeded, the subsequent queries fail and roll back any changes they made. To do this, I run the queries using the run_multiquery/1 predicate shown below (initialization predicates omitted):

:- dynamic successful_multiquery/1.

...

% Called before running a series of multithreaded queries
reset_multiquery_success :-
    retractall(successful_multiquery(_)),
    assert(successful_multiquery(false)).

% Each query is run wrapped in this predicate which only 
% commits the first successful query
run_multiquery(Goal) :-
    term_variables(Goal, VarList),
    transaction((   findall(VarList, Goal, Solutions),
                    first_multiquery_success
    )),
    member(VarList, Solutions).

% Fail if a successful query already ran
first_multiquery_success :-
    with_mutex( perplexity_multiquery_success,
                (   successful_multiquery(false),
                    retractall(successful_multiquery(_)),
                    assert(successful_multiquery(true))
                )
    ).

I’ve got a pretty extensive set of tests that run queries against the application for over an hour. Until this branch, they all ran successfully as a single thread from Python using the MQI (and the swiplserver Python library).

Problem: Some of my tests fail in the multithreaded configuration in a way that indicates that updates to the dynamic predicates are not invalidating the tables used by other threads. To reduce variables, I have also disabled concurrency in my app for testing. So, at the moment, only one thread at a time is doing anything, and it is still returning improper results from tables.

If I precede each query with abolish_all_tables (i.e. I run abolish_all_tables, !, <query>), it works fine, but really slowly.
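
For reference, the workaround can be wrapped in a small helper. This is only a sketch: with_fresh_private_tables/1 is a hypothetical name, and it assumes that abolish_private_tables/0 (which drops only the calling thread's private tables) is enough here because all of my tables are declared private. I have not measured whether it is any faster than abolish_all_tables/0.

```prolog
% Hypothetical helper, not part of the application.
% Drops the calling thread's private tables, then runs Goal, so every
% tabled predicate is re-evaluated against the current dynamic data.
% Assumes all relevant tables are declared "as private".
:- meta_predicate with_fresh_private_tables(0).

with_fresh_private_tables(Goal) :-
    abolish_private_tables,
    call(Goal).
```

A query is then run as with_fresh_private_tables(Query) instead of abolish_all_tables, !, Query.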

My understanding and usage of multithreaded tabling

In my app, any of the threads can be running queries that are reading data and/or updating dynamic predicates that all of the threads query against.

I’m using only “regular tabling” (i.e. no options) or incremental tabling, and none of the other variants. All tables are explicitly declared private. I have also declared the dynamic predicates that underlie the tabled predicates with [incremental(true)], as shown below.

:- table isWord/1 as private.
:- table specializesSet/2 as incremental, private.

...

:- dynamic([data/4], [incremental(true)]).

I thought this would be the safest option to start with as it meant that each thread gets its own tables and the incrementally tabled predicates would get invalidated properly even if another thread updated the dynamic predicates. Is this correct?

I am also assuming that transactions and tabling play well together, meaning: tables across all the threads remain consistent with the current view of the data that that thread sees (i.e. uncommitted transactions on one thread don’t get tabled by another, committed ones do, the table in a thread gets invalidated if the transaction in the same thread rolls back, etc). Is that a good assumption?

I think the first_multiquery_success check does not work (if I read it correctly). Because it runs inside the transaction, a thread claiming to be the first will not see the possible assert(successful_multiquery(true)) of some other thread, so they all think they are the first. You can do that using transaction/3, which runs a global constraint in what would be the final situation if the transaction is committed. The mutex argument ensures that no two threads do this at the same time.

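On whether private incremental tables get invalidated when another thread updates the dynamic predicates: I fear not. The thread updating the dynamic predicate only has knowledge of its own private tables and the shared tables. It cannot access the private tables of other threads and thus cannot invalidate them.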
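
One possible way around this is to let each thread invalidate its own private tables when it notices that the shared data has changed. The following is an untested sketch under several assumptions: data_generation/1, bump_data_generation/0, refresh_private_tables/0, the mutex name data_generation_mutex and the global variable seen_generation are all hypothetical names, writers must remember to bump the counter after every update, and the sketch does not address how the counter interacts with transactions.

```prolog
% Hypothetical global change counter, bumped by any thread that
% modifies the shared dynamic data.
:- dynamic data_generation/1.
data_generation(0).

% Writers call this after updating the shared dynamic predicates.
bump_data_generation :-
    with_mutex(data_generation_mutex,
               (   retract(data_generation(G0)),
                   G is G0 + 1,
                   assertz(data_generation(G))
               )).

% Readers call this before running a query.  The last generation seen
% is kept in a global variable, which is local to each thread.  If the
% counter has moved on, the thread drops its own private tables.
refresh_private_tables :-
    with_mutex(data_generation_mutex, data_generation(G)),
    (   nb_current(seen_generation, G)
    ->  true
    ;   abolish_private_tables,
        nb_setval(seen_generation, G)
    ).
```

Compared with abolishing tables before every query, this keeps the tables as long as no thread has reported a change.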

The assumption that transactions and tabling play well together may also be a little optimistic. Some scenarios work, but there are probably others that do not. It has been a while since I looked into this. We already planned a discussion, so this can be added to the topics.

Ahh, good point. The check that records whether a query has already succeeded should run outside of the transaction's isolated view so that it is globally visible. I updated the predicates below for posterity…

[edit: overlooked the failing query case in run_multiquery/1…]

:- dynamic successful_multiquery/1.

...

% Called before running a series of multithreaded queries
reset_multiquery_success :-
    retractall(successful_multiquery(_)),
    assert(successful_multiquery(false)).

% Each query is run wrapped in this predicate which only 
% commits the first successful query
run_multiquery(Goal) :-
    term_variables(Goal, VarList),
    transaction(    (   findall(VarList, Goal, Solutions),
                        Solutions \== []
                    ),
                    first_multiquery_success,
                    perplexity_multiquery_success
    ),
    member(VarList, Solutions).

% Fail if a successful query already ran
first_multiquery_success :-
    successful_multiquery(false),
    retractall(successful_multiquery(_)),
    assert(successful_multiquery(true)).