How to convert standard json into prolog json?

I’m not talking about the 70’s, I’m talking about today. Today json is ubiquitous, and stdlib/built-in functions and predicates, whatever you want to call it, can be updated to support modern use cases. Java waited until Java 17 to support pattern matching in switch cases. If you tried to use

switch(input) {
	case Foo f -> "hello";
	case Bar b -> "world";
}

in java v16, it wouldn’t have worked.

Likewise, swi is on version 9 now and has certain built-in predicates that it ships with. I can use member/2 , append/3, and so forth. They may have been added at various versions of the language. Version 10 could potentially include spec_json_to_prolog_json/2 implemented as you did, but I’m saying if it could happen in the next version, it could have also happened several versions ago.

There is a PIP (Prolog Improvement Proposal) scheduled on this. The idea is to agree on a common library interface for json. The PIP is still in its infancy (other PIPs have kept people busy) but hopefully we will be able to have a common design at some point.


Yes :grinning:

Because, e.g., Prolog supports terms, which JSON does not, and JSON supports null, which Prolog does not. So there is no guarantee of conversion in either direction.

Well null is the least of it. { "a": 123 } is also valid json and isn’t covered without further handling.

I know I am late for the party, but I do not understand: why wouldn’t you just treat the JSON that the user provides as a string? Then you can parse it using the normal facilities?

Or, just let them input the JSON as a string? Especially if the user does not want to figure out Prolog?

Like,

?- atom_json_dict('{"a":1,"b":2,"c":3}', JSON, []).
JSON = _{a:1, b:2, c:3}.

Note that the example you gave:

?- get_json_b_param({a:1,b:2,c:3},B).

This is somehow a mixture of Prolog and JSON, and while it can be parsed by Prolog, it is not valid JSON at all, isn’t that so?

This interface also gives you the necessary control over JSON to Prolog mapping, see the options in the third argument of atom_json_dict/3. Most of the available options are actually documented in json_read/3.
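For example, a sketch of two such options (option names and behaviour as documented for json_read/3; double-check on your SWI-Prolog version, and the shown binding is only an illustration):

```prolog
% Map JSON strings to atoms instead of Prolog strings, and replace
% JSON null by a term of our choosing.
?- atom_json_dict('{"name":"ann","x":null}', D,
                  [value_string_as(atom), null(missing)]).
D = _{name:ann, x:missing}.
```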


Many languages have some dynamic object/dict/map/… key-value set whose syntax resembles JSON. In most cases they support a superset of what JSON can represent. These language constructs are not JSON. They can merely be used to represent JSON. Typically the match is not 100% either way: not all JSON documents can be represented in the language primitive and not all language objects can be represented as JSON. Some examples

  • JSON allows for duplicate keys. Neither Python, nor JavaScript nor SWI-Prolog dicts can handle duplicate keys.
  • Some language object representations preserve the key order, others not. I don’t know the rules or implications of this.
  • SWI-Prolog can represent all JSON data types. There are some problems though. If we represent JSON strings as Prolog strings, all is fine. In practical applications, using atoms is often more natural, but then null, false and true become ambiguous, as Prolog has no data type for nothing nor for Booleans.
  • JSON has no limit on the number of digits in a number. Some languages have. For SWI-Prolog, integers are unlimited, but floats are IEEE 64 bit doubles and thus limited.
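The atom ambiguity above is easy to see with the term representation: by default json_read/2 maps JSON strings to atoms, so the JSON constant null has to be wrapped as @(null) to stay distinguishable from the ordinary atom null (output as on current SWI-Prolog versions):

```prolog
% JSON null becomes @(null); the JSON string "null" becomes the atom null.
?- atom_json_term('{"a":null,"b":"null"}', T, []).
T = json([a= @(null), b=null]).
```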

In the other direction, there are many mismatches. Think of JavaScript undefined, Prolog variables, Prolog compound terms, rational numbers, etc, etc.

All these problems apply to all these languages. Each of these languages has functions to translate a string holding valid JSON into an object/map/dict/… and vice versa, each with some of the above issues.

The syntax of JSON and the language construct may be similar or very different. For example, in Perl it looks like this. Still, Perl hashes are the obvious way to represent JSON data in Perl. Python and JavaScript object syntax is close to accepting JSON as valid input, but not perfect (Python uses None rather than null).

( England => 'English',
  France => 'French', 
  Spain => 'Spanish', 
  China => 'Chinese', 
  Germany => 'German')

Edit The ordering problem is discussed in Order of JSON key, value pairs · Issue #148 · auth0/node-jsonwebtoken · GitHub


@Boris

Yeah like I said, I’m fine if the end user has to put it in a string, put an underscore, or whatever they have to do; I just have to put it in the readme then so they know they have to do that. I was only hoping with this post that there was a way for them not to have to do that, again, the same way that you don’t have to do that in the python implementation. In the python implementation if you have {a:1}, you can just run encode({a:1}), not '{a:1}' and not _{a:1}.

Actually thanks Boris, atom_json_dict or atom_json_term looks like probably the cleanest solution here if I have them pass in cmdline args like swipl jwtpl -- {"a":1} secret, catch them with current_prolog_flag(argv,[Claims,Secret|_]), and I believe those are atoms?
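Yes, argv entries are atoms. A minimal sketch of that idea (file and predicate names are hypothetical):

```prolog
:- use_module(library(http/json)).

% Run as: swipl jwtpl.pl -- '{"a":1}' secret
main :-
    current_prolog_flag(argv, [Claims, Secret|_]),  % both are atoms
    atom_json_dict(Claims, Dict, []),
    format("claims=~p secret=~w~n", [Dict, Secret]).
```

Note the single quotes around the JSON on the command line; without them the shell strips the inner double quotes before Prolog ever sees them.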

@jan

JSON allows for duplicate keys.

Does it? I just checked { "a": 1, "b": 2, "a":3 } on https://jsonlint.com/ and got “Invalid JSON! Error: Duplicate key ‘a’”. But either way I got your point, there are some limitations. I’m not sure how long proposals like The Prolog Implementers Forum take to get implemented but I’ll be looking out for that.

How strong is the json validator in swi? So like

?- atom_json_dict('{"a":1,"b":null,"c":3}', JSON, []).
JSON = _{a:1, b:null, c:3}.

?- atom_json_dict('{"a":1,"b":null,"c":3', JSON, []).
ERROR: Stream <stream>(0x600000601d00):1:21 Syntax error: json(illegal_object)

?- atom_json_dict('{"a":1,"b":null,:3}', JSON, []).
ERROR: Stream <stream>(0x60000060c200):1:17 Syntax error: json(illegal_json)

?- atom_json_dict('{"a":1,"b":null,'c':3}', JSON, []).
ERROR: Syntax error: Operator expected

?- atom_json_dict('{"a":1,"b":null,"c":3,"c":4}', JSON, []).
ERROR: Duplicate key: c

?- atom_json_dict('{"a":1,"b":null,"c":["qw","as,"zx"]}', JSON, []).
ERROR: Stream <stream>(0x600000604b00):1:32 Syntax error: json(illegal_array)

These all appear to be catching the right things, though I wish the errors were a little more consistent, i.e., why do we see json(illegal_object) and json(illegal_json) for some but not others?

But it looks like we do some validation and throw errors for invalid json. Is that fairly comprehensive?

Thanks.

You have unescaped quotes in there

I know that. That was intentional. I’m talking about handling cases where the user passes in a malformed json. I really appreciate everyone’s input btw, thank you.

It seems a difficult thing. See e.g., standards - Does JSON syntax allow duplicate keys in an object? - Stack Overflow Bottom line seems to be that duplicates are allowed by the syntax, but it is not recommended to use them, and some tools will either reject them or only preserve one of the values.

Yes, but atom_json_dict/3 is flawed for this purpose, as it loses the ordering. That is a similar problem to duplicate keys: order is typically considered irrelevant, but just as one can process duplicate keys, one can also process a specific ordering of the keys. As with the link I gave about ordering, JWT computation depends on the key ordering. So, the older atom_json_term/3 does work, but the Prolog representation of the JSON term is not as nice, and processing that representation is clumsier and slower.
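For example, atom_json_term/3 keeps the keys in document order, whereas the dict representation sorts them:

```prolog
% The list of pairs preserves the order in which the keys appear.
?- atom_json_term('{"b":2,"a":1}', T, []).
T = json([b=2, a=1]).
```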

I think the parser is pretty strict. The main claim though is that any valid JSON element creates a correct Prolog representation. If there are issues I’d suspect first of all floating point parsing.

They are all of the form error(syntax_error(json(Detail)), Context), where Detail gives some hint about what is wrong and Context some information on where it went wrong. As Prolog exception handling is based on unification, they all unify with the term above, i.e., using a variable to match the detail.

A typical skeleton in (SWI-)Prolog is

    catch(Goal, Error, true),
    (   var(Error)
    ->  <continue>
    ;   print_message(error, Error),
        <stop/recover/...>
    )
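A concrete instance of this skeleton for JSON parsing could look like this (the predicate name is illustrative):

```prolog
:- use_module(library(http/json)).

% Parse Atom as JSON into Dict; on a syntax error, report it and fail
% instead of propagating the exception.
safe_parse(Atom, Dict) :-
    catch(atom_json_dict(Atom, Dict, []), Error, true),
    (   var(Error)
    ->  true
    ;   print_message(error, Error),
        fail
    ).
```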

Good tips, thank you.

JWT computation depends on the key ordering.

See, I did not know that. I’ll say, I’m not the savviest programmer in the world. My plan was to take that Jose jwt implementation and emulate it as best I can. I just really want to do it in prolog.

So it sounds like atom_json_term/3 is my best bet? Any downsides to that? What about utilities like dict_keys/2 and dict_size/2? I’m looking through the documentation but can’t really find those for the term representation yet. Would I need to roll those myself? Do I need to look into converting it with predicates from here?

Or is the idea that if I’m working with a term, and say I want to get value by key, I would just do this?

?- atom_json_term('{"a":1,"b":null,"c":["qw","as","zx"]}', Jterm, []),
    json(Json) = Jterm,
    member(c=C,Json).                                      
Jterm = json([a=1, b= @(null), c=[qw, as, zx]]),
Json = [a=1, b= @(null), c=[qw, as, zx]],
C = [qw, as, zx].

Is this the idiomatic way of doing this?
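(For reference, rolling such helpers over the json/1 term is short; the names below are made up, not library predicates:)

```prolog
% Keys of a json/1 term, in document order.
json_keys(json(Pairs), Keys) :-
    findall(K, member(K=_, Pairs), Keys).

% Number of key-value pairs.
json_size(json(Pairs), N) :-
    length(Pairs, N).

% First value for Key (memberchk/2 commits to the first match).
json_get(json(Pairs), Key, Value) :-
    memberchk(Key=Value, Pairs).
```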

Python used to have an OrderedDict that preserved insertion order (the regular dicts had “random” order, as controlled by PYTHONHASHSEED). This proved to be sufficiently problematic(*) that eventually someone came up with a sufficiently efficient implementation of OrderedDict that the regular dict became the same as OrderedDict.
In Python, you can get dict d sorted by key: dict(sorted(d.items())).

So, if Prolog’s dict can by modified to preserve ordering, we could add dict_sorted_pairs/2 (same as the current dict_pairs/2) and dict_ordered_pairs/2.

(*) At Google, our group had to handle the company-wide breakage when the new Python version introduced randomized hashes … the documentation had always said that programmers couldn’t depend on the ordering of dict keys but people did anyway, and it took us several person-years to fix the 10s of thousands of test breakages (and that was with using a lot of automated tools).
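For reference, SWI-Prolog’s existing dict_pairs/3 already yields the pairs sorted by key, i.e., the “sorted pairs” behaviour:

```prolog
% dict_pairs/3 returns Key-Value pairs in the standard order of keys,
% regardless of the order they were written in.
?- dict_pairs(_{b:2, a:1}, _, Pairs).
Pairs = [a-1, b-2].
```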

Just one more thing: Why is this the case? Are the new jsons implemented as arrays under the hood?

I thought about that, but I think it is impossible without some drastic changes. In the end, a dict is a Prolog term that behaves the normal Prolog way for unification, ==/2, compare/3, etc. If we keep the order, _{a:1, b:2} must be a different term than _{b:2, a:1}. This implies we need to modify all these basic algorithms as well (and define how they should behave).

But yes, I agree it would be nice to have order preserving dicts. They improve the user experience and facilitate using e.g. JWT on them. On the other hand, the link I mentioned argues that it is JWT that is broken as a JSON object is normally considered an unordered set of key-value pairs. But, again, AFAIK, JSON is defined in terms of its syntax and not its semantics and valid JSON terms have ordering and may have duplicates. That surely was the case when I implemented JSON for Prolog. I really liked the one page description of the syntax. So cool when compared to XML :slight_smile: Since then some character escaping in strings has been added, but AFAIK, that is it. Pity.

Luckily, the current implementation has no stable ordering, so the bugs are there from day zero :slight_smile:

It is not a new JSON implementation. The Tag{Key:Value, …} are called dicts. They internally map to a Prolog compound term where the key-values form an ordered array, so we can do binary search, ordered merging, etc. The ordering is based on the atom handle, which is basically an index into a dynamic array that reflects roughly the order of creation. But, atoms are subject to GC and free places in the array are reused.

At the risk of belabouring the discussion past the point of usefulness, this is my understanding:

Consider the following example of legal JSON syntax for an JSON object:

{"a":1,"b":2,"c":3}

As it happens, this is also perfectly legal Prolog syntax at least for SWI-Prolog, and probably others, that support {..} as part of their grammar:

?- write_canonical({"a":1,"b":2,"c":3}).
{}(','(:("a",1),','(:("b",2),:("c",3))))
true.

Note that as far as Prolog is concerned, this has nothing to do with JSON. It’s largely a coincidence that any legal JSON is probably(?) also legal Prolog. The reverse is certainly not true, and don’t expect the native Prolog parser to catch JSON syntax errors.
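For example, the native parser happily reads things that are not JSON at all:

```prolog
% Unquoted keys and non-string keys are fine as Prolog, invalid as JSON.
?- term_string(T, "{foo: bar, 1: 2}").
T = {foo:bar, 1:2}.
```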

In addition, core Prolog will also support parsing JSON content from an atom (or string), as in:

?- term_string(T,'{"a":1,"b":2,"c":3}'), write_canonical(T).
{}(','(:("a",1),','(:("b",2),:("c",3))))
T = {"a":1, "b":2, "c":3}.

As a totally independent capability, library(http/json) can take the same string and map it to a different Prolog term structure (objects map to either a json/1 term or a dictionary), with the intent (I think) of simplifying the post processing (and generation) of the content:

?- atom_json_term('{"a":1,"b":2,"c":3}',T,[]).
T = json([a=1, b=2, c=3]).

?- atom_json_dict('{"a":1,"b":2,"c":3}',T,[]).
T = _{a:1, b:2, c:3}.

My conclusion is that currently there is no standard, or idiomatic, way of dealing with JSON content. Further, the options available when using SWI-Prolog may not be supported in other Prologs. The application programmer has choices and needs to pick the one that best suits their needs.


True. There are some corners like escapes in strings and details on floats, notably non-normal floats. Note that the {...} syntax is part of the ISO standard.

That is what we are trying to resolve in the PIP meetings. It is non-trivial though as the best representation depends on capabilities of the Prolog system at hand and the use-cases you imagine. It is also felt that we must “solve” how dict-like structures can be supported over multiple Prolog systems. We all have solutions for this, but all these originate from a different view on what dicts are.

Forgive me; I’ve never read the actual ISO standard because it’s behind a paywall.

Can I get help on one thing please? I’m trying to simulate python json.dumps() which converts a python object to a json literal. So I’m trying to write a simple limited grammar that converts a prolog json term to json literal.

Here’s what I have so far

jsonpl_jsonspec(Json) --> ob, jsonify(Json), cb.
jsonify([K=@(V)]) --> dq(K), cl, [V].
jsonify([K=V]) --> { atom(V) }, dq(K), cl, dq(V).
jsonify([K=V]) --> { number(V) }, dq(K), cl, [V].
jsonify([K=@(V)|Js]) --> dq(K) ,cl, [V], cm, jsonify(Js).
jsonify([K=V|Js]) --> { atom(V) }, dq(K), cl, dq(V), cm, jsonify(Js).
jsonify([K=V|Js]) --> { number(V) }, dq(K), cl, [V], cm, jsonify(Js).

dq(A) --> ['"'],[A],['"'].
ob --> ['{'].
cb --> ['}'].
cm --> [','].
cl --> [':'].

A couple of questions:

First, when I try to load this I’m getting Syntax error: Operator expected for the two lines with @(V). Why? What I’m trying to do is, if I have a prolog json term with foo=@(null) then I want to convert it to "foo":null.

Second, when I trace phrase(jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123]),X). I’m not clear on why it’s failing.

?- trace, phrase(jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123]),X).
^  Call: (13) phrase(jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123]), _17242) ? creep
   Call: (16) jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123], _17242, []) ? creep
   Call: (17) atom(foo) ? creep
   Exit: (17) atom(foo) ? creep
   Call: (17) _21334=_17242 ? creep
   Exit: (17) _17242=_17242 ? creep
   Call: (17) dq(a, _17242, _22956) ? creep
   Call: (18) _23776=[a|_23782] ? creep
   Exit: (18) [a|_23782]=[a|_23782] ? creep
   Call: (18) _23782=['"'|_22956] ? creep
   Exit: (18) ['"'|_22956]=['"'|_22956] ? creep
   Exit: (17) dq(a, ['"', a, '"'|_22956], _22956) ? creep
   Call: (17) cl(_22956, _27844) ? creep
   Exit: (17) cl([:|_27844], _27844) ? creep
   Call: (17) dq(foo, _27844, _29472) ? creep
   Call: (18) _30292=[foo|_30298] ? creep
   Exit: (18) [foo|_30298]=[foo|_30298] ? creep
   Call: (18) _30298=['"'|_29472] ? creep
   Exit: (18) ['"'|_29472]=['"'|_29472] ? creep
   Exit: (17) dq(foo, ['"', foo, '"'|_29472], _29472) ? creep
   Call: (17) cm(_29472, _34360) ? creep
   Exit: (17) cm([','|_34360], _34360) ? creep
   Call: (17) jsonify([b= @(null), c='Wow', d= @(true), e=123], _34360, []) ? creep
   Call: (18) atom(@(null)) ? creep
   Fail: (18) atom(@(null)) ? creep
   Redo: (17) jsonify([b= @(null), c='Wow', d= @(true), e=123], _34360, []) ? creep
   Call: (18) number(@(null)) ? creep
   Fail: (18) number(@(null)) ? creep
   Fail: (17) jsonify([b= @(null), c='Wow', d= @(true), e=123], _34360, []) ? creep
   Redo: (16) jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123], _17242, []) ? creep
   Call: (17) number(foo) ? creep
   Fail: (17) number(foo) ? creep
   Fail: (16) jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123], _17242, []) ? creep
^  Fail: (13) phrase(user:jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123]), _17242) ? creep
false.

So

Exit: (17) dq(a, ['"', a, '"'|_22956], _22956) ? creep
Exit: (17) dq(foo, ['"', foo, '"'|_29472], _29472) ? creep

a=foo → "a":"foo" works correctly.

Then we go to b= @(null)

  • Fail: (18) atom(@(null)) is correct.
  • Fail: (18) number(@(null)) is correct.

but we skipped my jsonify([K=@(V)|Js]) clause, and also right here I’m confused

Fail: (17) jsonify([b= @(null), c='Wow', d= @(true), e=123], _34360, []) ? creep
Redo: (16) jsonify([a=foo, b= @(null), c='Wow', d= @(true), e=123], _17242, []) ? creep

Why are we backtracking to the start of the list? Why didn’t we try jsonify([K=@(V)|Js])? I should be allowed to do that, no? We can do

|    J = [a=foo, b= @(null), c='Wow', d= @(true), e=123],
|    member(b=V,J),
|    V = @(V_).
J = [a=foo, b= @(null), c='Wow', d= @(true), e=123],
V = @(null),
V_ = null .

Apologies in advance if I’m missing something obvious. Thanks.

This is tokenized as “K”, “=@”, “(”, “V”, “)” because (most) symbol characters glue together to an atom. You need a space, as in K= @(V).

And, why not use the library converter? E.g., atom_json_term/3?
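For the serialization direction in particular, atom_json_term/3 also works in generation mode, so you may not need the grammar at all (the resulting atom is shown only as an illustration; exact whitespace may vary between versions):

```prolog
% With an unbound first argument, the term is serialized to JSON text.
?- atom_json_term(A, json([a=foo, b= @(null), e=123]), []).
% A is bound to an atom such as '{"a":"foo", "b":null, "e":123}'
```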