How to encode an IRI component in SWI-Prolog?

j4n_bur53 · January 26, 2023, 3:40pm

IRIs seem to be used by a couple of formats now, such as RDF, XML 1.0 and
HTML 4.0. By component I mean a name or value in the query part. How
can I encode a component for use in an IRI? This doesn’t work:

?- uri_encoded(query_value, '1=2ü', X), uri_iri(X, Y).
X = '1%3D2%C3%BC',
Y = '1=2ü'.

I expected to get Y = '1%3D2ü' instead. What am I doing wrong? Is
there some sequence of predicate calls that could give such a result in
SWI-Prolog? Is this even supported in SWI-Prolog?

jan · January 26, 2023, 5:20pm

uri_iri/2 only converts complete URIs. Actually, it gets this wrong, as it handles the query string as a whole, i.e., without considering &=; Pushed a fix for that. As it stands, the library cannot produce encoded values for components.

Possibly the library needs some updates. It started life in Prolog as a neat implementation of the RFC, but that turned out to be far too restrictive to deal with URL/URI/IRI in the wild. It was later reimplemented in C based on minimal assumptions rather than the RFC. C was chosen because it is faster at low-level character handling, and this is time critical in various linked data use cases. Nowadays there is more consensus on what to escape when. This library may deviate from that in some places.
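
Until the library offers this directly, the result asked for above can be approximated per character. The following is a minimal sketch, not part of library(uri): iri_encoded_value/2 and encode_char/2 are hypothetical helpers. It assumes that every code point above 127 may appear unescaped in an IRI, which is looser than RFC 3987 (private-use and some other code points are excluded there), and it delegates all ASCII handling to uri_encoded/3.

```prolog
:- use_module(library(uri)).
:- use_module(library(apply)).

% iri_encoded_value(+Value, -Encoded)  (hypothetical helper)
% Percent-encode Value as a query value, but leave non-ASCII
% characters untouched so the result can be used inside an IRI.
iri_encoded_value(Value, Encoded) :-
    atom_chars(Value, Chars),
    maplist(encode_char, Chars, Parts),
    atomic_list_concat(Parts, Encoded).

% Assumption: any code point above 127 is allowed as-is in an IRI.
encode_char(Char, Char) :-
    char_code(Char, Code),
    Code > 127,
    !.
% ASCII characters follow the library's own query_value rules.
encode_char(Char, Encoded) :-
    uri_encoded(query_value, Char, Encoded).
```

With the value from the question, iri_encoded_value('1=2ü', Y) should give Y = '1%3D2ü', since '=' is escaped by uri_encoded/3 while 'ü' is passed through.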

mgondan1 · January 27, 2023, 8:23pm

Under MSYS2, with ctest:

35: ERROR: c:/msys64/home/c7201178/swipl-devel/packages/clib/test_uri.pl:118:
35:     test query:
35: ERROR: wrong answer for IRI (compared using ==)
35: ERROR:     Expected: 'http://x.y/z?q%3Dr=1%3D2ü&x=y'
35: ERROR:     Got:      'http://x.y/z?q%3Dr=1%3D2�&x=y'

Strangely enough, it actually works from within swipl:

c7201178@PC105-C720 MINGW64 ~/swipl-devel/build
$ src/swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.1.4-DIRTY)
...
1 ?- use_module(library(uri)).
true.

2 ?- uri_iri('http://x.y/z?q%3Dr=1%3D2%C3%BC&x=y', IRI).
IRI = 'http://x.y/z?q%3Dr=1%3D2ü&x=y'.

j4n_bur53 · January 27, 2023, 10:56pm

I don’t know enough about the regression testing failure.
ü isn’t an extremely challenging Unicode code point, it
lives in a quite low Unicode block. It could be an error of
the harness, in that the harness uses a 7-bit and not an 8-bit stream.
Or it is an 8-bit stream from some locale and not UTF-8
or Latin-1 Supplement (ISO-8859-1):

?- char_code(ü, X).
X = 252.

But I guess I made an error in the html//1 example. Since
there is only encode/1 and not an encode/2 that would also
take a URI part, I would need to write the code as:

?- X='1=2ü', html(a(href=encode('foo?bar='+X),''), L, []), writeq(L), nl.
ERROR: Type error: `atomic' expected, found `'foo?bar='+'1=2ü'' (a compoun

But encode doesn’t take a (+)/2 expression. Is this
a bug or a feature? How do you encode a full URI that is
itself an expression inside an html//1 call?

If I don’t lift it to the full URI, there might be the danger
that my value payload has a ‘?’ and it doesn’t get encoded?
Since encode expects a URI and not a query value?
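
For reference, the html_write documentation describes attribute values of the form A+B (concatenation) and encode(Atom) (encoded as a query component via uri_encoded/3), so the + can go outside encode rather than inside it. A minimal sketch under that reading of the documentation; link_tokens/2 is a hypothetical helper name and 'foo?bar=' is just the example path from above:

```prolog
:- use_module(library(http/html_write)).

% link_tokens(+Value, -Tokens)  (hypothetical helper)
% The fixed part of the URI is written literally; only the
% query value is wrapped in encode/1.
link_tokens(Value, Tokens) :-
    phrase(html(a(href('foo?bar='+encode(Value)), '')), Tokens).
```

Calling link_tokens('1=2ü', L), print_html(L) prints the resulting element. Because encode/1 goes through uri_encoded/3, the non-ASCII character is percent-encoded as well, so this yields a URI, not an IRI.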

Edit 29.01.2023
Whoa! I am doing it all wrong. According to an example on the
SWI-Prolog website I should do something along these lines:

predref(Name/Arity) -->
        { www_form_encode(Name, Encoded),
          sformat(Href, '/cgi-bin/plman?name=~w&arity=~w',
                  [Encoded, Arity])
        },
        html(a(href(Href), [Name, /, Arity])).

https://www.swi-prolog.org/pldoc/man?section=html-write-examples

But www_form_encode/2 and sformat/3 are both deprecated.
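
A possible non-deprecated variant of the same rule, offered as a sketch rather than as the documented replacement: it swaps www_form_encode/2 for uri_encoded/3 with the query_value component and sformat/3 for format/3 with an atom(Href) sink. The path /cgi-bin/plman is taken from the example above.

```prolog
:- use_module(library(http/html_write)).
:- use_module(library(uri)).

predref(Name/Arity) -->
    { % Encode only the query value, not the whole URI.
      uri_encoded(query_value, Name, Encoded),
      % format/3 with atom(Href) replaces the deprecated sformat/3.
      format(atom(Href), '/cgi-bin/plman?name=~w&arity=~w',
             [Encoded, Arity])
    },
    html(a(href(Href), [Name, /, Arity])).
```

It can be tried with phrase(predref(append/3), Tokens), print_html(Tokens). Note that uri_encoded/3 also percent-encodes non-ASCII characters, as the first query in this thread shows.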

jan · January 29, 2023, 5:21pm

Turned out the test file is UTF-8, but this was not specified. Added an encoding/1 directive and now all seems fine.
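
To illustrate the kind of fix, here is a small self-contained test file, not the actual test_uri.pl: the unit name utf8_source and both tests are made up for illustration. The encoding/1 directive tells the reader that the rest of the file is UTF-8, independent of the locale under which the tests run.

```prolog
:- encoding(utf8).          % must precede any non-ASCII text in the file

:- use_module(library(plunit)).

:- begin_tests(utf8_source).

% With the directive, the quoted atom is a single character
% with code point 252 on every platform.
test(code_point, Code == 252) :-
    char_code('ü', Code).

test(one_char, Len == 1) :-
    atom_length('ü', Len).

:- end_tests(utf8_source).
```

Without the directive, a reader that assumes a different 8-bit encoding sees the two UTF-8 bytes of ü as two characters, which would explain why the expected atom in the failing test did not match.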
