First attempt at isolated transactions

Should these transactions be thought of the same as transactions in SQL databases?

Some descriptions of SQL database transactions that relate to what I think of as SQL database transactions:

TutorialsPoint
Wikipedia
c-sharpcorner

If so then will all the ACID rules apply? I take it that the answer is yes because the description notes

Transactions allow running a goal in isolation . Database changes from other threads are invisible while changes inside the transaction are invisible to other threads and become atomically visible to other threads if the transaction is committed .

I noticed that D for Durability is missing.

If so then I will use the Microsoft Northwind database which is like a Hello World! of SQL databases and has countless examples accessible via the Internet to do similar examples with SWI-Prolog. Also there is wide-world-importers which I just learned about and looks to have many transaction examples.

Related to this is the NoSQL database Neo4j from which I have learned a lot about SQL databases from their documentation, e.g.
Graph Databases for Beginners: ACID vs. BASE Explained
Neo4j’s Causal Clustering
Cypher transaction API

All of this could take some time so I will be stepping away from the Discourse self-hosting project while doing this. If these examples turn into something promising I will look at creating some Wiki topics for them. :grinning:

That fail/throw would do rollback is not yet evident for me
from pl-transaction.c. But will check back again soon.
Is rc=callProlog, the result value choosing the roolback?

Now I see, you wrote, so the rollback is implicit. My bad:

So the two queries translate to, provided Actions is det:

?- transaction(Actions).

?- transaction((Actions, Condition)).

Oki Doki. This also means in single threaded mode one cannot
optimize away transaction/1. In single threaded mode
transaction(Actions) and Actions are nevertheless not

the same. So the transaction/1 does already a little more
than only isolation? Nitpiking on terminology. How about
interrupts, thread signals? I guess they will also rollback.

As is, no. ACID (atomicity, consistency, isolation, durability) (Wikipedia).

Right now, we have atomicy and isolation. durability refers to persistent store (I assume) and as this is pure in-memory, this can’t be provided. Possibly it can be linked to library(persistency) to achieve that. This would require a hook that can make the pending commit actions persistent. That will be added at some point.

consistency is now only resolved when you serialize possibly conflicting transactions using a lock. We probably do want to address this at some point. As yet, I do not know how. One of the things I’m considering is to associate constraints to the DB/a transaction and have a serialized area that consists of three steps

  • locked
    • Move the visibility to the current global state + local modifications
    • Run the constraints
    • On failure, discard
    • On success, commit

I think that is pretty simple to implement. Updating the visibility inside a transaction is nothing more than setting a new start generation value. Ideas are welcome. There is of course a lot of prior work in databases. Some of this probably applies, with or without some adjustment to make it fit Prolog.

My main current focus is to make the isolation work properly. It isolates fine, but its cooperation with clause garbage collection delays reclaiming clauses too much. If I can make that work fast we have a nice and clean alternative for the commonly seen

  • do a lot of computation using the dynamic database
  • clean the database.

That is than as simple as below, which also makes it thread-safe without the need to know which dynamic predicates it uses. This is less efficient than thread-local predicates though as we end up with a single clause list of which each thread (snapshot) sees as different subset.

 snapshot(compute).
1 Like

Thanks,

Your models of how SWI-Prolog works are so much more extensive than my mental models.

Does a graphic exist of all these parts?


I don’t want to branch off into another subtask, (go down the rabbit hole), to create the graphic so I will go back to the self-hosting project but will revisit this from time to time as using SWI-Prolog database with ACID could replace a major task of my much larger project. I am also considering TerminusDB for my much larger project. :grinning:

1 Like

No. I draw a lot of pictures while designing this kind of stuff, but they are really use-once. A couple of days later I often don’t even understand my own drawing :slight_smile:

2 Likes

I wish more people would start with pen an paper when problem solving!

A while ago I added TIFF as a format to Discourse that can be uploaded so that the hand drawn images could be scanned then uploaded e.g. in this post is PostScript Number and Name object Syntax Diagram.tif

For others: I am not encouraging users to start uploading TIFF images as we have storage constraints to contend with at present but for certain cases having the image outweighs using the storage space; when we move to self-hosing the storage space increases greatly. Also remember that when creating a TIFF image, options are usually available to create a black and white image which greatly reduces the byte size of the file and judicial use of other options when scanning will also reduce the file size.

If the image is useful we could always convert it to the DOT language and use GraphViz which is magnitudes of size smaller.


Donald Knuth (ref) and Edsger Wybe Dijkstra (ref) often scanned their notes and made them available.

Yes could verify on my side, with the temperature style transaction,
the example you gave. Now was doing the fill_unsafe test for
transaction/1 wrapped. Of course there is again interleaving and

therefore inconsistent results. It took me some time to port
the code from store.p to storeswi.p. I made first an error
that get/2 didn’t use setup_call_cleanup/3 and then

check/1 didn’t find any inconsistency. In as far I think
with_mutex/2 is dangerously designed. But after finding
this bug, I could reproduce what I already reproduced for

my system. Namely I now find:

% ?- repeat, write('.'), flush_output, fill_unsafe, check(K).
% ....
% K = 30 ;

% ?- repeat, write('.'), flush_output, fill_unsafe_transaction, check(K).
% ..
% K = 4 ;

Quite amazing how quick these inconsistencies now arise.
I changed the range of random key and value generation
from 0…9 to 0…99. Happens now much quicker.

Source code is here:
https://gist.github.com/jburse/ff32d742af32a9d281fd6d18b7480441#file-storeswi-pl

Maybe the predicate should not be called transaction/1,
rather only isolation/1. Since it doesn’t implement
any kind of transaction.

From ACID it seems to only support I. Then later
isolation/1 could be part of various transaction
experiments. Like pessimistic or optimistic

transactions that combine isolation/1 with other
means and provide transactions this way. A drastic
pessimistic transaction predicate could then

be bootstrapped as follows:

transaction(G) :-
  with_mutex(lock,
     isolation(G)).

But isolation/1 cannot yet work together with
library(persistency). Since in case of a rollback
the SWI-Prolog isolate/1 works natively and

restores the clauses natively, without some notification.
There are no hooks called, so library(persistency) will
not be notified of the rollback, and will have already all

changes written into the database log, and will not write
some undo changes, also not do subsequently a
database log compression if some thresholds are met.

Your isolation/1 is snapshot/1 (I thnk). transaction/1 is currently AI :slight_smile: , atomic and isolation. It will get C (consistency), for which I already suggested one idea. It will also get D (durable) as this means little more than a hook in the commit. It may turn into transaction/2 with a option list or possibly other higher argument versions. The similar rdf_transaction/1 has proven to be a simple and usable primitive.

I think AI is already a quite valuable addition. Even single threaded it simplifies consistent updates that need multiple assert/retract and simplifies temporary use of the dynamic DB for computing something. It allows the global state of multi-threaded applications to be always consistent (provided the application adds logic to ensure competing transactions do not violate constraints). With a global consistent state we can use snapshot/1 to have a thread providing consistent answers about this state.

3 Likes

Nope it cannot do that. It just defers the global update
until the end of transaction. And this global state can then
be rendered inconsistent by the global update happening

once instead step-by-step. There is no always consistent.
I tryed your temperature example, it didn’t work:

put_unsafe_transaction(K, V) :-
   transaction((retractall(store(K,_)),
       assertz(store(K, V)))).

Thats just your temperature example with an extra key:

update_temperature(Temp) :-
    transaction(( retractall(temperature(_)),
                  asserta(temperature(Temp)))).

Why do you think the global state should be always consistent?
What do you mean by “provided the application adds logic
to ensure competing transactions do not violate constraints”,

can you make an example? Is a lock application logic?

The docs already say a transaction is not sufficient. In this particular case you can simply do this. Using this, all threads see exactly one clause for temperature/1 at any time. If this were a realistic domotica application there would probably be only one thread listening to the temperature sensor and we could do without the lock. The big pro remains: all the agents that need to use the temperature do not have to worry there is no temperature at some point or there may be two of them.

update_temperature(Temp) :-
    with_mutex(temp,
               transaction(( retractall(temperature(_)),
                             asserta(temperature(Temp))))).

Working on a real use case with CMU, we decided to add transaction(Goal, Constraint, Lock):

                                                    Availability: built-in

transaction(:Goal, :Constraint, +Mutex)
    Similar to transaction/1, but allows verifying Constraint during the
    commit phase. This predicate follows the steps below. Any failure or
    exception during this process discards the transaction  and releases
    Mutex  when  applicable.  Constraint may  modify the  database. Such
    modifications follow the semantics that apply for Goal.
      • Call once(Goal)
      • Lock Mutex
      • Change the visibility to the  current global  state combined
        with the changes made by Goal
      • Call once(Constraint)
      • Commit the changes
      • Unlock Mutex.

This deals with cases where Goal takes considerable time and conflicts are not common. It allows multiple threads to start their transaction. At the end, the Constraint and commit phase are locked and the Constraint shall only succeed if the database is consistent. This makes us ACI :slight_smile: .

In fact, it also allows for ACID as the constraint can be misused to make sure the changes are stored in a durable manner. There is transaction_updates/1 that allows the anything running inside the transaction to get a list of the updates a commit would do.

Note that the above also allows writing the temp example as below.

update_temperature(Temp) :-
    transaction(true,
                ( retractall(temperature(_),
                  asserta(temperature(Temp))),
                temp).
2 Likes

But this here:

update_temperature(Temp) :-
    transaction(true,
                ( retractall(temperature(_),
                  asserta(temperature(Temp))),
                temp).

Is the same like:

update_temperature(Temp) :-
    with_mutex(temp,
                ( retractall(temperature(_),
                  asserta(temperature(Temp)))).

I was just following your description and the fact that “true” does not do
any changes and that the condition will never fail. So transaction/3 was
not essential here, could be done already before. Are there any

examples where transaction/3 or transaction/1 are essential?

I tend to think of this as when two-phase commit is needed. Most the time you use transactions as a safe guard in case something fails. When working with database transactions I can’t recall a transaction ever failing in production on a single system. With messaging systems that do store and forward, they wrap the forward part in a transaction to ensure the message makes it to the next hop. When you have many systems connected doing this, then it was not uncommon to see the forward part fail but because it was wrapped in a transaction the store part was considered a safe state to fall back. In a period of time the forward would be attempted again and often succeed.

the two-phase commit (2PC) protocol should not be confused
with the two-phase locking (2PL) protocol.

I don’t see that we are talking distributed transactions right
now. Namely that for example a transaction starts at some
machine A, continues on some machine B and then comes
back to machine A, and continues on some machine C,

and then comes back to machine A, and now machine A wants
to commit and wants consensus from B and C as well.
Currently the multi threading only refers to the fact that multiple
threads perform some critical action on the same machine.

The criticial action is retractall/1, assertz/2 combo which
doesn’t cause some problem single threaded. Atomicity would
mean that retractall/1, assertz/2 combo should not be
split into separate actions. So when two threads perform

atomically this combo:

   Thread 1                       Thread 2
   retractall/1                   retractall/1
   assertz/1                      assertz/1

Then the result should be a serialization, the braces > showing
where atoms, i.e action combos, were kept intact:

                retractall/1  \
                assertz/1     /                 
                retractall/1  \
                assertz/1     /

Thats the meaning of A in transactions and what atomicity means.
Thats the yardstick for atomicity when the isolation level I is serializable.
For lower isolation levels I some reordering would be allowed.

Obviously trivially we archive serializablility with this:

update_temperature(Temp) :-
    with_mutex(temp,
                ( retractall(temperature(_),
                  asserta(temperature(Temp)))).

The predicate with_mutex/2 is the simplest 2PL protocol. Since in
SWI-Prolog there are no read-write locks (shared/exclusive locks),
not much more can be done. The example is also not so interesting

for more advance locking since it does more or less two writes. Dunno
yet a better example, thats why I was asking.

You do not seem to get the point of transactions :slight_smile: There are three major differences. (1) If anything goes wrong during the sequence of (related) database updates, transactions guarantee the partial update is discarded while with_mutex/2 leaves the database in an inconsistent state. (2) The reader also needs to use with_mutex/2, using the same mutex as otherwise the retractall/1 may have completed, but the asserta/1 is not yet done. For the transaction based version this is not needed. (3) writers and readers can progress with much less locking. Notably readers can use snapshot/1 to get a coherent database at any time without locking writers.

2 Likes

Logically, transaction/3 is not needed. It improves concurrency in the case we have a long running goal inside a transaction and a fairly cheap way to check that the updates inside the transaction do not conflict with the current global state.

This comes from a practical use case where concurrent transactions are used that perform what if reasoning. I.e., they change the database, use incremental tabling to explore the consequences and if these are satisfactory they can commit, unless some concurrent transaction has already created a conflicting state. The constraint is used to verify whether or not this is the case. As all this is very new, we are still in the explorative state :slight_smile:

The tests contain a pretty useless example, doing the retractall/assert as goal and a constraint that says there must be exactly one clause for the temperature. Now we have two threads setting the temperature in a loop to random values and a third reading the current temp in a loop and verifying there is exactly one temp. Unfortunately this test sometimes fails, so I need to do more debugging :frowning:

1 Like

In this case it is not the best wording. The logic of database changes and thread synchronization is, I guess, not that easy to describe. If we do

with_mutex(x, transaction(Goal))

and Goal is designed such that it brings the database from one valid state to the next we have a globally observable database that atomically jumps from one valid state to the next. That is nice and fairly simple. The downside is that there is no concurrency, which is fine if Goal is a mere series of assert/retract operations, but not so fine if Goal is slow.
So, transaction/3 runs Goal concurrently in each thread doing the transaction. It then guarantees that if Constraint only succeeds if the database is in valid state at the end of the Constraint, the globally visible state is always valid. Note that Constraint sees the database in the state it will be after the commit.

An yes, you can do what-if reasoning without transactions, but transactions make it a lot easier.

A simple form of optimistic locking, which abstracts from constraints, and
only looks at whether or not another thread changed something related to
the current thread, was for example offered in Hibernate.

You store a version field somewhere together with the data. In each
clause. And upon commit you check whether nobody else did a
modification of your data, that you were about to change, and rollback

if there was a conflict. This is a form of Optimistic concurrency control (OCC).
OCC can be subject to subtle ABA problems. Like it can be that you
change a foreign key, but the target objects do not get versioned.

So basically a simple table:

:- dynamic foo/1.

Blows up into, extra column for the version and thread local for the workload:

:- dynamic foo/2.
:- thread_local foo_cache/2.

You can get this working. For this simple case it won’t be better than just a lock around the transaction, but this is what I would do:

increment_counter(Delta) :-
    transaction(( retract(counter(Value)),
                  Value2 is Value+Delta,
                ),
                ( counter(Value),
                  asserta(counter(Value2))
                ),
                counter_lock).

This is close to the compare-and-swap technique used for lock-free programming: optimistically prepare a new value and at the end replace the old value if the value has not been changed. You can also swap this around. Both assume there is always just a single counter/1 clause (and maintain this invariant). I think this is a little more elegant as it is closer to the compare-and-swap and because a possible conflict causes an empty rollback.

increment_counter(Delta) :-
    transaction(( counter(Value),
                  Value2 is Value+Delta,
                ),
                ( retract(counter(Value)),
                  asserta(counter(Value2))
                ),
                counter_lock).

Needless to say, this only makes sense if the Goal (first arg) is a more expensive computation and conflicts are relatively rare. This is a quite nice example of one way to use transaction/3 though :slight_smile: I think I’ll add it to the manual.

1 Like

I understand this is just an example, but if it is just a counter, wouldn’t you simply use nb_setval/2?