I am thinking to use global backtrackable variables to store dynamic information across predicates, thereby simplifying the argument structure of predicates.
At the same time, I am looking for opportunities to use multiple threads and understand that currently global variables are stored for each thread – i.e. have their own name space.
I guess what this means is that to pass information taken from global variables into threads, requires to read their contents and pass it as argument the the “threaded” goal, and then upon return of the threaded goal, to set the global variable with an obtained return value.
Is this the way this is supposed to be done? Any limitations this could cause?
The alternative is to not use global variables and pass global information as arguments …
any thoughts are much appreciated,
Edit: I guess one key issue could be memory overhead when larger structures are read and (re)set from/to global variables. In structures that are passed as arg, this is, i guess, not happening.
Global variables are represented using a thread-specific hash table pointing at terms on the stack (unless the value is a simple atom or small integer, in that case it simply hold the value). As it contains pointers to the runtime stack, there is no way global variables can be shared among threads.
In the current implementation the variables themselves are pretty cheap, but they do slow down garbage collection as it requires some ugly preparation and finalization to make them play with GC. The cost of this is proportional with the number of global variables.
backtrackable global variables are cheap in memory: just an entry in a hash table. Their non-backtrackable sisters require a copy of the term (after which the original may become subject to garbage collection).
SWI-Prolog (and hProlog from which I copied them and GNU-Prolog and most likely some) global variables are rather different then assert retract. As they remain on the stacks, they use stack space, but they can also be accessed at constant time rather than requiring to be copied to the stack as a dynamic predicate will do. If they are non-ground, we can further unify the term (value) and even if they are ground we can use SWI-Prolog’s destructive assignment operations on the term to modify it. Also for simple single constants, they do not require an assert+retract and subsequent clause garbage collection to get rid of the retracted clause.
I’m not saying they are better. Bart Demoen and Tom Schrijvers needed these things to get CHR ported. They can be useful.
The overall idea of (SWI-)Prolog multi-threading has always be to allow multiple lightweight agents to cooperate. I consider with_mutex/2 an emergency solution. As is, you now need it for updating a shared dynamic database if a single update consists of multiple assert+retract calls. It is not a nice interface though. The fact that it cuts possible choice points of the goal is an indication thereof. A much cleaner solution would be properly atomic and isolated transactions. One day these should be implemented (most of the work has been done).
The notion of message queues fits (IMO) Prolog much better. It looks like I/O though and Prolog isn’t too good with that either :(. You can use the lazy list library to see the input queue as a list
The set_flag/2 interface provides atomic shared updates for a simple value. I don’t think it uses CAS, though it could.
Agree. The new stuff here isn’t that impressive either though. The POSIX thread API exists for ages with only a few later additions. Atomic instructions also exist for ages. The only good news is that in the old days every CPU had its own shared semantics and you often had to access this proprietary stuff using asm(...). That was pretty much a nightmare. Nowadays most low level languages provide abstract access to a standardized shared memory model, which makes this all a lot easier to use (and SWI-Prolog uses a lot of this under the hood).
synchronized is a fairly simple abstraction of mutexes. Should be easy enough to put in a library. The fact that it would implicitly cut choice points is scary though. The alternative is this, but the risk freezing your application if a choice point is accidentally leaked is scary as well.
I just spent a few brain cycles on that, even getting to a start of a library, but I come to the conclusion it is mostly wasted time. I think that proper database transactions is the first and most needed step for cleaner semantics to shared dynamic predicates.
Something I’ve experimented with, and seems to work, is declaring a key(Value) I want to share between threads as dynamic/1 instead of thread_local/1, and then use assertz(key(Value)) within one thread to make it available to other threads.
The other threads needs to check the key(Value) they are trying to read has been set, else keep trying after whatever timelimit if it hasn’t.
The problem of changing the value with retract(key(Value)) with multithreading is one I’ve avoided.
At the moment I’m busy working through Joe Armstrong’s Erlang book, where the equivalent of the Prolog clause store is an ETS table, and his advice is to always put values that need to be shared in whatever compound data structure is used to message between threads. Luckily, I generally find I can do that.
Polling is typically a bad habit. You either have a far too long interval, not making progress or too short, wasting CPU cycles. This normally calls for a condition variable, but we do not expose these. Typically one waits using a message queue, but that doesn’t work very well in this case as when multiple threads wait on the queue, sending a message to the queue will only wake one. The GIT development versions allows asking how many threads are waiting on a queue, so you could send as many messages. This is a little dubious though.
Ciao, and Qu-Prolog have two things that are interesting. Ciao allows suspending on a dynamic predicate, causing a wakeup if some thread adds a clause. Qu-Prolog has a primitive to wait for any change to the (dynamic) database. I wonder whether we should implement some of this. The Qu-Prolog one could probably be refined by waiting for a module, so you can keep a set of relevant dynamic predicates in a module and wait for any of them to change. The Ciao Prolog one is quite closely related to SWI-Prolog message queues, but in this case does allow multiple threads to wakeup on a single event.
Could you elaborate a bit more on the use case so we can evaluate whether one of the above is sufficiently important to include it?
Could you elaborate a bit more on the use case so we can evaluate whether one of the above is sufficiently important to include it?
I don’t really have one. I’ve just dabbled a bit with this as a learning exercise. I’m generally in the “shared nothing” school of message passing where information common to all threads goes into a database and it doesn’t matter about the order of reads and writes.
The reason I feel the need for warning against changing shared data is whatever the current value a given thread sees is going to be hard to guess and will change between test runs, so immutability simplifies concurrency a lot.
I think the two concerns are mostly independent. Locking is in my view mostly a bad idea and transactions are a more natural fit with Prolog to deal with some of the synchronization ideas.
Using the dynamic database as a shared blackboard is definitely not new, but poorly supported in SWI. A waiting primitive should resolve that. Transactions are mostly orthogonal to this, but may play a role in avoiding wakeups in the middle of related database changes (a bad idea). One of the best things about transactions is that they allow for a consistent world view for other threads without locking.
I actually find SWI Prolog’s threads and message queues easier to understand than Erlang’s Node ! Message, receive ... end syntax which by abstracting away details, means I have no idea how to get lots of processes to listen to the same message queue (with whichever one is free and quickest handling the job) in Erlang as can easily be done in SWI Prolog for concurrent maplist.
The Erlang philosophy is to simply ignore all the mutex stuff, and since I don’t understand it anyway, that’s the route I’ve taken. I’d assume there must be some way for a thread updating shared data to let the other threads know by messaging.
Sure. So is “never use cuts”, in most languages “never use goto”, etc. As with most mantras, they generally make sense, but sometimes it is a good idea to violate them