How to encode an IRI component in SWI-Prolog?

IRIs seem to be used by a couple of formats now, such as RDF, XML 1.0 and
HTML 4.0. By component I mean a name or value in the query part. How
can I encode a component for use in an IRI? This doesn’t work:

?- uri_encoded(query_value, '1=2ü', X), uri_iri(X, Y).
X = '1%3D2%C3%BC',
Y = '1=2ü'.

I expected to get Y = '1%3D2ü' instead. What am I doing wrong? Is
there some sequence of predicate calls that could give such a result in
SWI-Prolog? Is this even supported in SWI-Prolog?

uri_iri/2 only converts complete URIs. Actually, it does that wrong: it handles the query string as a whole, i.e., without considering the separators &, = and ;. I pushed a fix for that. As is, the library cannot produce encoded values for individual components.
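
To illustrate the complete-URI route, here is a minimal sketch that assembles a whole URI from a base and a list of Name=Value pairs and only then converts it with uri_iri/2. The helper names query_uri/3 and query_iri/3 are made up for this example; uri_query_components/2, uri_components/2 and uri_data/4 come from library(uri). What uri_iri/2 leaves percent-encoded in the query part depends on whether the installed version includes the fix mentioned above.

```prolog
:- use_module(library(uri)).

% query_uri(+Base, +Pairs, -URI): hypothetical helper.
% Attach Pairs (a list of Name=Value) to Base as a
% percent-encoded query string.
query_uri(Base, Pairs, URI) :-
    uri_query_components(Query, Pairs),
    uri_components(Base, Components0),
    uri_data(search, Components0, Query, Components),
    uri_components(URI, Components).

% query_iri(+Base, +Pairs, -IRI): hypothetical helper.
% Build the complete URI first, then convert it as a whole.
query_iri(Base, Pairs, IRI) :-
    query_uri(Base, Pairs, URI),
    uri_iri(URI, IRI).
```

It could be called as `?- query_iri('http://x.y/z', [bar='1=2ü'], IRI).`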

Possibly the library needs some updates. It started life in Prolog as a neat implementation of the RFC, but that turned out to be far too restrictive to deal with URL/URI/IRI in the wild. It was later reimplemented in C, based on minimal assumptions rather than the RFC. C was chosen because it is faster at low-level character handling, and this is time critical in various linked data use cases. Nowadays there is more consensus on what to escape when, and this library may deviate from that in some places.

Under MSYS2, with ctest:

35: ERROR: c:/msys64/home/c7201178/swipl-devel/packages/clib/
35:     test query:
35: ERROR: wrong answer for IRI (compared using ==)
35: ERROR:     Expected: 'http://x.y/z?q%3Dr=1%3D2ü&x=y'
35: ERROR:     Got:      'http://x.y/z?q%3Dr=1%3D2�&x=y'

Strangely enough, it actually works from within swipl:

c7201178@PC105-C720 MINGW64 ~/swipl-devel/build
$ src/swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.1.4-DIRTY)
1 ?- use_module(library(uri)).

2 ?- uri_iri('http://x.y/z?q%3Dr=1%3D2%C3%BC&x=y', IRI).
IRI = 'http://x.y/z?q%3Dr=1%3D2ü&x=y'.

I don’t know enough about the regression testing failure.
ü isn’t an extremely challenging Unicode code point; it
lives in a quite low Unicode block. It could be an error of
the harness, in that the harness uses a 7-bit and not an 8-bit
stream. Or it is an 8-bit stream from some locale and not UTF-8
or Latin-1 Supplement (ISO-8859-1):

?- char_code(ü, X).
X = 252. 
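
To make the encoding hypothesis concrete: code point 252 is a single byte (0xFC) in ISO-8859-1 but two bytes in UTF-8, so a reader that assumes the wrong encoding will see something other than ü. A small sketch using utf8_codes//1 from library(utf8); the helper name utf8_bytes/2 is hypothetical:

```prolog
:- use_module(library(utf8)).

% utf8_bytes(+Atom, -Bytes): hypothetical helper that lists the
% UTF-8 byte values of the characters in Atom.
utf8_bytes(Atom, Bytes) :-
    atom_codes(Atom, Codes),
    phrase(utf8_codes(Codes), Bytes).

% ?- atom_codes(A, [252]), utf8_bytes(A, Bytes).
% Bytes = [195, 188]     % 0xC3 0xBC, the %C3%BC seen above
```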

But I guess I made an error in the html//1 example. Since
there is only encode/1 and not encode/2 that would also
take a URI part, I would need to write the code as:

?- X='1=2ü', html(a(href=encode('foo?bar='+X),''), L, []), writeq(L), nl.
ERROR: Type error: `atomic' expected, found `'foo?bar='+'1=2ü'' (a compoun

But encode doesn’t take a (+)/2 expression. Is this
a bug or a feature? How do you encode a full URI that is
itself an expression inside an html//1 call?

If I don’t lift it to the full URI, is there a danger
that my value payload contains ‘?’ and it doesn’t get encoded,
since encode expects a URI and not a query value?
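
One possible way around this, as a sketch: html//1 accepts A+B in an attribute value as concatenation, and encode(V) can be one of the parts, so only the value is encoded and the rest of the URI is left alone. The rule name bar_link//1 is made up; foo and bar are the placeholder names from the failing query above. As far as I can tell this produces a percent-encoded URI, not an IRI.

```prolog
:- use_module(library(http/html_write)).

% bar_link(+Value)//: hypothetical rule. The fixed part of the URI is
% concatenated with the encoded value, instead of wrapping the whole
% (+)/2 expression in encode/1.
bar_link(Value) -->
    html(a(href('foo?bar='+encode(Value)), '')).

% Possible usage:
% ?- phrase(bar_link('1=2'), Tokens), print_html(Tokens).
```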

Edit 29.01.2023
Whoa! I am doing it all wrong. According to an example on the
SWI-Prolog website I should do something along these lines:

predref(Name/Arity) -->
        { www_form_encode(Name, Encoded),
          sformat(Href, '/cgi-bin/plman?name=~w&arity=~w',
                  [Encoded, Arity])
        },
        html(a(href(Href), [Name, /, Arity])).

But www_form_encode/2 and sformat/3 are both deprecated.
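
A possible non-deprecated variant of the same rule, as a sketch: uri_encoded/3 with the query_value component in place of www_form_encode/2, and format/3 with an atom(Href) sink in place of sformat/3. The URL and the rule name are taken from the example above; whether this is the replacement the documentation intends is an assumption on my part.

```prolog
:- use_module(library(uri)).
:- use_module(library(http/html_write)).

% Same shape as the website example, with the two deprecated
% predicates swapped out.
predref(Name/Arity) -->
    { uri_encoded(query_value, Name, Encoded),
      format(atom(Href), '/cgi-bin/plman?name=~w&arity=~w',
             [Encoded, Arity])
    },
    html(a(href(Href), [Name, '/', Arity])).

% Possible usage:
% ?- phrase(predref(append/3), Tokens), print_html(Tokens).
```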

It turned out the test file is UTF-8, but this was not declared. I added an encoding/1 directive and now all seems fine.
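
For reference, a sketch of what such a directive looks like at the top of a UTF-8 source file. The predicate name umlaut_is_one_char/0 is hypothetical and is not the actual test from the clib package:

```prolog
:- encoding(utf8).   % read the remainder of this file as UTF-8

% Hypothetical sanity check: with the directive in place, the literal
% below is read as a single character with code 252, not as two
% separate bytes.
umlaut_is_one_char :-
    atom_codes('ü', [252]).
```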