Pushed library(intercept) (Discussion)

Hi Jan,

That looks great.

I guess that sending the signal is fully asynchronous …

It’s great for creating, for example, log files, and for ensuring that writing to a slow output device doesn’t slow down processing.

Dan

Rereading, I notice that this is not a publish-subscribe interface …

It’s a dedicated intercept … Could this be made to work so that many subscribers (observers) could listen to the signal and act on it …

Dan

That is done by library(broadcast). This too is synchronous, so if you want it asynchronous you have to listen on a channel and send all events to a thread message queue. Note that this library also integrates with several network layers such as TIPC and UDP.
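
A minimal sketch of the asynchronous setup described above, assuming a single worker thread that does the slow writing. The names start_logger/1, stop_logger/1, logger_loop/0, the listener id async_logger and the log(Msg) event are hypothetical; listen/3, unlisten/1, broadcast/1 and the thread predicates are the standard SWI-Prolog ones.

```prolog
:- use_module(library(broadcast)).

% Start a worker thread and forward every log(_) broadcast to its
% message queue. The listener itself only enqueues, so broadcast/1
% returns quickly even if the output device is slow.
start_logger(Id) :-
    thread_create(logger_loop, Id, []),
    listen(async_logger, log(Msg), thread_send_message(Id, log(Msg))).

% Worker: take events from the thread's own queue until `done` arrives.
logger_loop :-
    thread_get_message(Event),
    (   Event == done
    ->  true
    ;   Event = log(Msg),
        format(user_error, "~w~n", [Msg]),   % the "slow" write happens here
        logger_loop
    ).

% Remove the listener, then ask the worker to stop and wait for it.
stop_logger(Id) :-
    unlisten(async_logger),
    thread_send_message(Id, done),
    thread_join(Id, _).
```

Usage would be along the lines of `start_logger(Id), broadcast(log(hello)), stop_logger(Id)`.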

SWI-Prolog provides a publish-subscribe broadcast library.

Logtalk provides publish-subscribe via event-driven programming support. There’s also a dependents library implementing a version of this Smalltalk publish-subscribe mechanism.

Thank you @jan, this is very interesting. To me it seems that this finally resolves the question of disentangling the parsing from the side effect, on the level of the source code at least. I am talking about this question that I had.

I now re-wrote my original client code, from this:

:- use_module(fasta).
:- use_module(iupac).

main(_) :-
    phrase_from_stream(fasta_revcomp, current_input).

fasta_revcomp -->
    fasta_record(Descr, Seq),
    {   reverse(Seq, Rev),
        maplist(iupac_complement, Rev, RevCompl),
        phrase(generate_fasta_record(Descr, RevCompl), Codes),
        format(current_output, "~s", [Codes])
    },
    !,
    fasta_revcomp.
fasta_revcomp --> [].

… to this:

:- use_module(library(intercept)).
:- use_module(fasta).
:- use_module(iupac).

main(_) :-
    intercept(phrase_from_stream(fasta, current_input),
              fasta(D, S),
              (   revcompl(S, RS),
                  phrase(generate_fasta_record(D, RS), Codes),
                  format(current_output, "~s", [Codes])
              )).

fasta -->
    fasta_record(Descr, Seq),
    !,
    { send_signal(fasta(Descr, Seq)) },
    fasta.
fasta --> [].

revcompl(Seq, RevCompl) :-
    reverse(Seq, Rev),
    maplist(iupac_complement, Rev, RevCompl).

I like it better like this. I timed the two versions on my original ~30MB, ~20K record input file, and I did not see any difference, which is great! If there is overhead, it is negligible in comparison to the real processing.

(Note: I changed the DCG for parsing FASTA from fasta//1, which parsed to a compound term fasta(Description, Sequence), to fasta//2, which takes two arguments. This seemed to cut ~5% of the running time, but I didn’t do the timing measurements too carefully.)

I have further questions but I first need to try out how this can be used and broken :slight_smile:

Yes! This is about the kind of scenario the Ciao developers had in mind, from what I understood from Edison Mera. I’m thinking of applying this also to library(csv) and the various RDF parsing libraries. The overhead of intercept is significant, but in most practical applications this shouldn’t be in the inner loop. On a tight loop using between/3 as generator I could measure a 10-fold slowdown. Some of the overhead can be reduced by moving the search for the intercept handler and the copying of the match to C. It seems this isn’t immediately necessary :slight_smile:
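
A rough way to reproduce this kind of tight-loop comparison is sketched below. It is not the measurement used above; the helper names noop/1, plain_loop/1, signal_loop/1, cpu_seconds/2 and compare_loops/3 are made up for illustration, and the numbers will depend on the machine and the SWI-Prolog version.

```prolog
:- use_module(library(intercept)).

noop(_).

% Baseline: call noop/1 directly for each integer generated by between/3.
plain_loop(N) :-
    forall(between(1, N, X), noop(X)).

% Same loop, but each integer is routed through send_signal/1 to the
% handler installed by intercept/3.
signal_loop(N) :-
    intercept(forall(between(1, N, X), send_signal(tick(X))),
              tick(Y),
              noop(Y)).

% CPU time spent running Goal once, in seconds.
cpu_seconds(Goal, Seconds) :-
    statistics(cputime, T0),
    call(Goal),
    statistics(cputime, T1),
    Seconds is T1 - T0.

% Time both loops for the same N.
compare_loops(N, Plain, Signalled) :-
    cpu_seconds(plain_loop(N), Plain),
    cpu_seconds(signal_loop(N), Signalled).
```

Calling `compare_loops(1000000, Plain, Signalled)` gives two timings to compare; the work per iteration here is trivial, so this is close to a worst case for the relative overhead.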

Forgive my cluelessness here, but how is this not just library(broadcast)?

The most outstanding difference is the scoping. intercept is scoped to a goal, where broadcast listeners are globally scoped. There are also differences wrt the semantics of the called handlers, but these are more arbitrary. A broadcast channel has 0 or more listeners and any number of them is fine. An intercept channel is typically handled by a single handler and lack of a matching handler is normally considered an error.
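
The scoping difference can be illustrated with a small sketch. The predicates intercept_scope/1 and broadcast_scope/0, the dynamic seen/2 and the item(_) terms are hypothetical; the exact error term raised by an unhandled send_signal/1 is deliberately not spelled out here and is caught with a catch-all.

```prolog
:- use_module(library(intercept)).
:- use_module(library(broadcast)).
:- dynamic seen/2.

% The handler exists only while the goal passed to intercept/3 runs.
% The second send_signal/1 is outside that goal, finds no handler and
% raises an exception, which is caught here to report `unhandled`.
intercept_scope(Outcome) :-
    intercept(send_signal(item(1)), item(X), assertz(seen(intercept, X))),
    catch(( send_signal(item(2)),
            Outcome = handled
          ),
          _Error,                 % catch-all: error term not assumed
          Outcome = unhandled).

% A broadcast listener is global: it stays registered until unlisten/1,
% and broadcasting with no listeners is not an error.
broadcast_scope :-
    listen(scope_demo, item(X), assertz(seen(broadcast, X))),
    broadcast(item(3)),           % reaches the listener
    unlisten(scope_demo),
    broadcast(item(4)).           % nobody listening: still succeeds
```

After running both, only item(1) and item(3) should have been recorded in seen/2.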

So, I’m writing a parser for BVH files, and wondering what to do with syntax errors. Some users might expect an exception to be thrown; some might want error messages printed. The error-message folks want to know about subsequent errors too. It depends on whether the BVH file is something they generated, or an outside file that may or may not be valid (and even then, the BVH format is ‘de facto’, so some oddball variation might need tweaking).

Better to use intercept, or better to write error messages and fail, or?

It depends a bit on what “some people” means. If this means different applications, a simple print_message(warning, …) might do, and some applications may wish to hook this to generate an exception. If it is the same application, the intercept interface could be appropriate. It can be used to choose between throwing an exception, print-and-skip, simply skipping, or putting some dedicated term in the output and letting the user deal with it later.
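
As a sketch of the same-application case, the caller can pick the error policy while the parser only signals. Everything here is hypothetical: parse_items/2 is a stand-in for a real parser (it keeps integers and signals anything else), and parse_with/3, on_bad_item/2 and the bad_item(_) signal are illustrative names, not part of any BVH library.

```prolog
:- use_module(library(intercept)).

% Stand-in parser: keeps integers, signals every other item and then
% carries on with the rest of the input.
parse_items([], []).
parse_items([H|T], Out) :-
    (   integer(H)
    ->  Out = [H|Rest]
    ;   send_signal(bad_item(H)),
        Out = Rest
    ),
    parse_items(T, Rest).

% The caller decides what a bad item means by choosing Policy.
parse_with(Policy, Items, Parsed) :-
    intercept(parse_items(Items, Parsed),
              bad_item(X),
              on_bad_item(Policy, X)).

% throw: abort the parse; print: warn and continue; skip: continue silently.
on_bad_item(throw, X) :-
    throw(error(syntax_error(bad_item(X)), _)).
on_bad_item(print, X) :-
    print_message(warning, format("skipping bad item ~p", [X])).
on_bad_item(skip, _).
```

With the `print` policy every bad item is reported and parsing continues, so subsequent errors are still seen; with `throw` the first bad item ends the parse.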

Probably different applications. So you see throwing from the message definition as legit?

From message_hook/3, I guess that is fine.

Isn’t intercept a Lisp-like condition system as implemented in https://www.swi-prolog.org/pack/file_details/condition/prolog/condition.pl - with Restart ignored?

They are surely related. Thanks for pointing that out. The condition package seems to have two modes though: one where it applies to child goals. This one is broken if Goal is non-deterministic as the handler is removed only if the choice points are exhausted. The global one looks more related to library(broadcast).