Quasi-quotation API extension?

I have to admit I don’t fully understand all the ramifications of quasi-quotations, but:

The API for quasi-quotations does not obviously allow for the content to be a normal string. Instead it seems to be some opaque term created by the builtin parser.

Would it be possible/desirable to extend this API so either the content could be a string, or to support a predicate to convert a normal string to such an opaque term? In essence, is there a way to decouple the builtin Prolog parser from the individual QQ parsers so QQ content could come from somewhere else?

The quasi-quotation can be a string. See this comment from @jan in a PR discussion, in response to my musing about having a “raw” string in Prolog, similar to Python’s r"...":

For long and complicated regular expressions quasi quotations are an option. We already have the string quasi quotation. Making a dedicated one for regex would improve things (we can put the options in the re term), e.g.:

 {|re([caseless(true)])||aap|}

Source: Prepare existing PCRE1 code for migration to PCRE2 by kamahen · Pull Request #3 · SWI-Prolog/packages-pcre · GitHub

Well that would be good news. But then I would expect the following to work:

?- call(string,"abcd",[],[],R).
false.

What am I missing? (Using SWI-Prolog 8.4.1.)

Who says this should work? The qq_syntax/4 hook is not a DCG; it may read the content either from a stream or with a DCG. Anyway, to read a pure string you can use the QQ in library(strings), which defines various ways to get strings with nice layout, interpolation, etc.

The API can do anything I can think of, converting the input into an arbitrary Prolog term at compile time that can be a (dict) function to ensure further evaluation at runtime.

Nobody, but this was the purpose of my original question.

This is what I’m trying to understand. To use the example in the manual, if I have a “Syntax” expression html(Name, Address) and the string content “<tr><td>Name<td>Address</tr>”, is there an API that will do what the QQ hook does, but not at parse/compile time? One application might be a meta-interpreter for QQ’s.

SyntaxName/4 appears to do this except that the Content is an “opaque term”. So is there a way of constructing such a term from a string? For example, if I wrote the string to a stream and used that somehow.

I don’t get the intent yet. You can of course use e.g. term_string/2,3 to run the (QQ) parser on user-generated content. What is your use case?

At this point I’m in “exploratory” mode, just trying to find out what’s possible. The basic intent is to dynamically call any defined QQ parser without invoking the Prolog parser.

As I understand the suggestion to use, e.g. term_string/2,3, this would require me to reconstruct a QQ string from a Syntax term and a Content string. The parser would then deconstruct it again, then call the appropriate QQ parser. This seems a little wasteful but I’ll give it a try.

OK. I think the following code fragment does what I want:

	term_string(Syntax, QQsyntax, [variable_names(Vars)]),
	atomics_to_string(['{|', QQsyntax, '||', Content, '|}'], QQuote),
	term_string(Term, QQuote, [variable_names(Vars)]),

It doesn’t avoid using the builtin parser, but probably good enough.

But term_string seems a bit erratic:

?- term_string(S, "{|string(To,From)||to:To from:From|}", [variable_names(['To' = To, 'From' = From])]).
S = strings{type:string}.exec(["to:To from:From"], ['To'=To, 'From'=From]).

?- term_string(S, "{|html(Name,Address)||<tr><td>Name<td>Address</tr>|}", [variable_names(['Address'=Add, 'Name'=Name])]).
false.

?- trace,term_string(S, "{|html(Name,Address)||<tr><td>Name<td>Address</tr>|}", [variable_names(['Address' = Add, 'Name' = Name])]).
   Call: (11) term_string(_32558, "{|html(Name,Address)||<tr><td>Name<td>Address</tr>|}", [variable_names(['Address'=_32526, 'Name'=_32538])]) ? creep
   Fail: (11) term_string(_32558, "{|html(Name,Address)||<tr><td>Name<td>Address</tr>|}", [variable_names(['Address'=_32526, 'Name'=_32538])]) ? creep
false.

Any idea what I’m doing wrong?

I think it’s just a matter of wrong order of the variable_names argument, which should correspond to the linear order in which the variables are scanned.
Try:

?- term_string(S, "{|html(Name,Address)||<tr><td>Name<td>Address</tr>|}", [variable_names(['Name' = Name, 'Address' = Add])]).
S = [element(tr, [], [element(td, [], [Name]), element(td, [], [Add])])].

Thanks for the hint; that is indeed the issue. But it seems like a bug to me: there is no hint of this in the docs, and write_term/2 has no such restriction:

?- write_term(html(Address,Name),[variable_names([])]).
html(_51398,_51400)
true.

?- write_term(html(Address,Name),[variable_names(['Address'=Address, 'Name'=Name])]).
html(Address,Name)
true.

?- write_term(html(Address,Name),[variable_names(['Name'=Name, 'Address'=Address])]).
html(Address,Name)
true.

Docs:

variable_names(Vars)
Unify Vars with a list of ‘Name = Var’, where Name is an atom describing the variable name and Var is a variable that shares with the corresponding variable in Term. (ISO). The variables appear in the order they have been read.

That surely hints that the order matters, I’d say … I don’t see a realistic case where you’d want this to handle any order. If you need that, just use a sort/2 after the read_term/3 and you’ll have the variables in alphabetical order. Or, more commonly, use memberchk/2 to get the variables you are looking for.
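
The memberchk/2 route can be sketched roughly as below. This is only an illustration: parse_with_names/3 and lookup_names/2 are names made up for this post, and a plain term stands in for the quasi-quotation so the example does not depend on any QQ syntax being loaded. The parser returns its own bindings list, and the caller then picks variables out of it by name, so the order in which the caller lists them no longer matters.

```prolog
% Hypothetical helper: parse String into Term and connect the caller's
% Name=Var pairs (in any order) to the variables the parser created.
parse_with_names(String, Term, Wanted) :-
    % Let the parser produce its own bindings list, in reading order.
    term_string(Term, String, [variable_names(Bindings)]),
    lookup_names(Wanted, Bindings).

% Unify each wanted Name=Var with the parser's binding of the same name.
% Fails if a wanted name does not occur in the parsed text.
lookup_names([], _).
lookup_names([Name=Var|Rest], Bindings) :-
    memberchk(Name=Var, Bindings),
    lookup_names(Rest, Bindings).
```

With this, a call such as parse_with_names("html(Name,Address)", T, ['Address'=A, 'Name'=N]) should bind T to html(N, A) even though the names are listed in the "wrong" order.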

Forgive me if I am saying obvious things. I will surely make false claims and then someone could be so kind and correct me.

Quasi-quotations are a mechanism for putting arbitrary content within a Prolog source file, capturing Prolog variables from the context, and converting it to a proper Prolog term at compile time. I am not sure what is the correct name for this: it is not compile-time or run-time, it is “writing the code”-time. (help, someone?)

You can and usually want to have content that can be parsed with a formal grammar. The SWI-Prolog implementation helps you by hooking your parser to the compiler and running it at compile time.

Existing parsers:

On the informal end of the spectrum, you have the quasi-quotation syntax for plain strings provided by library(strings), see the source file, close to the top. Jan was kind enough to implement it, maybe provoked by my incessant questions.

You have already found the HTML quasi-quotation parser; just in case, here is the source.

After this too long intro, I think that nothing prevents you from reusing the parsers that quasi-quotations use, without the quasi-quotation mechanism: maybe using it on a user-provided string at run time? Or am I wrong again?

I think it is more the other way around: given a parser using a DCG or something that reads from a stream you can implement a QQ handler. That still allows you to use the parser directly, i.e., without using read_term/3 or some variant. The QQ handler talks to an opaque data structure that allows (re)using code that is not related to QQ. One of the roles of the intermediate representation is to forward error/warning locations properly.
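
To make that layering concrete, here is a minimal sketch under some assumptions: the kv syntax, kv_pair//1 and parse_kv/2 are invented for this example and are not part of any library. The DCG and parse_kv/2 know nothing about quasi-quotations and can be called on any string at run time. The kv/4 handler is a thin wrapper that reads the opaque Content as a stream, in the same way the string/4 handler uses with_quasi_quotation_input/3, and then hands the text to the ordinary parser.

```prolog
:- use_module(library(quasi_quotations)).
:- use_module(library(dcg/basics)).

% Register the (made-up) syntax name kv, so {|kv||...|} is accepted.
:- quasi_quotation_syntax(kv).

% Plain DCG, unrelated to QQ: "key=value" becomes Key-Value (atoms).
kv_pair(Key-Value) -->
    string_without(`=`, KeyCodes), "=", remainder(ValueCodes),
    { atom_codes(Key, KeyCodes),
      atom_codes(Value, ValueCodes)
    }.

% Direct entry point: usable at run time on any string.
parse_kv(String, Pair) :-
    string_codes(String, Codes),
    phrase(kv_pair(Pair), Codes).

% QQ handler: read the opaque Content as text, then reuse parse_kv/2.
% Syntax arguments and variable names are ignored in this toy syntax.
kv(Content, _SyntaxArgs, _VariableNames, Pair) :-
    with_quasi_quotation_input(Content, Stream,
                               read_string(Stream, _, String)),
    parse_kv(String, Pair).
```

The point of the split is that only kv/4 depends on the opaque Content term; everything else is ordinary Prolog that can be called without going through read_term/3.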

I now see what’s happening. In my use case, I already have a list of Name = Var bindings. When term_string/3 is invoked, it creates its own such list and tries to unify that list with the one I passed in as a variable_names option. So to ensure the same names refer to the same variables, I need to merge the two lists after the call to term_string:

	term_string(Syntax, QQsyntax, [variable_names(Vars)]),  % writes Syntax term to a string using pre-existing variable names?
	atomics_to_string(['{|', QQsyntax, '||', Content, '|}'], QQuote),  % construct full QQ as string
	term_string(Term, QQuote, [variable_names(VarNames)]),  % parse it and get new Name=Var bindings
	mergeNamedVars(VarNames, Vars).  % uses memberchk to unify vars with the same name
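
For completeness, one possible definition of the mergeNamedVars/2 helper used above. This is a sketch of the idea rather than tested production code: for every Name=Var pair produced by the parser, look up the same name in the pre-existing list and unify the two variables. Names that only occur in the parsed text are left alone.

```prolog
% mergeNamedVars(+NewBindings, +KnownBindings)
% Both arguments are lists of Name=Var pairs. For each name that occurs
% in both lists, the two variables are unified; the order of either
% list is irrelevant.
mergeNamedVars([], _).
mergeNamedVars([Name=Var|Rest], Known) :-
    (   memberchk(Name=KnownVar, Known)
    ->  Var = KnownVar          % same name: make them the same variable
    ;   true                    % name not known to the caller: ignore
    ),
    mergeNamedVars(Rest, Known).
```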

This is certainly workable but seems like a roundabout way to call an individual QQ parser with a string as content using the documented QQ interface:

call(+SyntaxName, +Content, +SyntaxArgs, +VariableNames, -Result)

And I think this is what I’m trying to get my head around. How would I write a parser for some arbitrary syntax which could invoke such a QQ handler? Note that by using term_string I’ve pretty much lost any information relevant to error/warning locations. Now if the Content term was transparent, I might be able to construct the appropriate “intermediate representation”, and keep everybody happy.

Pretty much everything about Prolog is dynamic, including things like operator definitions, so while this might be the current state of Quasi-quotations, I’m trying to understand why that is the case and why it’s desirable that it should be so. Other than intellectual curiosity, I have no particular agenda that would motivate changing the status quo.

I think this is incorrect. The only way I see to access individual QQ parsers is by (re)constructing a QQ formatted string and giving it back to the builtin parser; workable, but a little awkward IMO.

One thing to note is that - if I understand correctly - the QQ interface allows existing Prolog parser/interpolators to be used also at the time of reading source terms, but that means that every QQ parser is backed by some plain Prolog code that is either a DCG or a predicate which may read from the input stream, so you could use this Prolog code directly.

For example, the string QQ interface is backed by interpolate_string/4, which my debug_adapter package calls directly to, well, interpolate strings with Prolog variables bound in a debugged frame (see code here).

This is true but I think it would be better if there were a standard way of accessing it. One possibility is to define an additional multifile QQ API predicate, e.g.,

qq_parse(+Syntax,+String,+VariableNames,-Result).

It’s pretty much the same implementation as the current parser hook except there’s no need to use with_quasi_quotation_input/3 as in, e.g., string/4:

string(Content, Args, Binding, DOM) :-
    ...
    with_quasi_quotation_input(Content, Stream, read_string(Stream, _, String)),
    ...

Instead you could write:

qq_parse(string(Args),String,Binding,DOM)

Now I can accomplish the same thing using term_string and re-invoking the builtin parser. It’s just more complicated and less efficient than the qq_parse (or equivalent) solution.
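
To illustrate the proposal, a rough sketch follows. Note that qq_parse/4 does not exist in SWI-Prolog; it is the hypothetical hook suggested above, and words(Sep) is a toy syntax invented for this example. The string-based clause does the real work, and the regular QQ handler becomes a thin wrapper that extracts the text from the opaque Content and delegates to it.

```prolog
:- use_module(library(quasi_quotations)).

% Hypothetical hook proposed in this thread, NOT an existing API:
% qq_parse(+Syntax, +String, +VariableNames, -Result)
:- multifile qq_parse/4.

% Toy syntax words(Sep): split the text at any character in Sep and
% strip surrounding spaces from each part.
qq_parse(words(Sep), String, _VariableNames, Words) :-
    split_string(String, Sep, " ", Words).

% The ordinary QQ handler for the same toy syntax: read the opaque
% Content as text, then delegate to the string-based hook.
:- quasi_quotation_syntax(words).

words(Content, [Sep], VariableNames, Words) :-
    with_quasi_quotation_input(Content, Stream,
                               read_string(Stream, _, String)),
    qq_parse(words(Sep), String, VariableNames, Words).
```

With such a hook, a call like qq_parse(words(","), "a, b,c", [], Words) would run the parser on a plain string without going through the builtin reader, at the cost of losing the source-location information the opaque Content carries.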