Del_dict/4 behaviour and dict iteration

Hey all,
I want to iterate over a possibly nested dict and apply some transformation on it’s atomic members, so I tried something along these lines:

transformed(D0, D) :-
    del_dict(Key, D0, Value0, D1),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D1, D2),
    put_dict(Key, D2, Value, D).

But, this approach doesn’t work as del_dict/4 throws an instantiation error:

Welcome to SWI-Prolog (threaded, 64 bits, version 8.3.26)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- D0 = dict{a:b}, del_dict(A, D0, B, D).
ERROR: Arguments are not sufficiently instantiated
ERROR: In:
ERROR:   [11] del_dict(_26520,dict{a:b},_26524,_26526)
ERROR:   [10] '<meta-call>'(user:user: ...) <foreign>
ERROR:    [9] toplevel_call(user:user: ...) at /opt/local/lib/swipl/boot/toplevel.pl:1115
?-

Is this the behavior of del_dict/4 intended?
Compare to get_dict/3, which doesn’t throw and instead unifies A with some key:

?- D = dict{a:b}, get_dict(A, D, B).
D = dict{a:b},
A = a,
B = b.

What would be the recommended way to perform such a dict transformation?

Thanks.

There is no need to delete first, is there? So you can use get_dict/3 and put_dict/3 together. This suggests however that you want to use some failure driven loop. That is not going to work. Dicts are subject to the normal Prolog backtracking. You can use nb_set_dict/3 for non-backtrackble destructive assignment. Not sure I’d recommend that :slight_smile:

Quite often dict transformation use dict_pairs/3 and than manipulate the pair list to call dict_pairs/3 again and create a new dict.

Thank you for the quick response.

I’ve neglected in the above snippet the termination condition, should’ve been:

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    ...

So my thinking was that deletion is needed in order to get to the termination condition, otherwise I get an infinite loop:

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    get_dict(Key, D0, Value0),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D0, D2),
    put_dict(Key, D2, Value, D).

?- transformed(_{a:b, c:_{d:e}}, T).
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 0.7Gb, global: 0.2Gb, trail: 39.3Mb
ERROR:   Stack depth: 2,572,409, last-call: 0%, Choice points: 2,572,402
ERROR:   Probable infinite recursion (cycle):
ERROR:     [2,572,408] transformed(<compound C'dict'/5>, _66885952)
ERROR:     [2,572,407] transformed(<compound C'dict'/5>, _66885978)

It works if I add the deletion:

foo(X, Y) :-
    atomic_list_concat(['bar', X, 'baz'], '/', Y).

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    get_dict(Key, D0, Value0),
    del_dict(Key, D0, _, D1),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D1, D2),
    put_dict(Key, D2, Value, D).

?- transformed(_{a:b, c:_{d:e}}, T).
T = _3574{a:'bar/b/baz', c:_3560{d:'bar/e/baz'}} .

But that get_dict, del_dict dance is not the peak of elegance.

So I guess dict_pairs/3 it is :+1:
Thanks @jan

True. The docs make is explicit not to make claims about the internal structure of dicts though. For some deep-down reason the keys and values have swapped position. Other changes may follow (there are no concrete plans).

Yes, I noticed the swapping.

Insert is still O(n), because you need to create/copy a new dict?
So that del_dict/4 and put_dict/4 based transform is O(n^2), right?

The search for the insert location is O(log(N)). Next you have to make a new dict and copy the data. Copying the data are two simple memcpy() calls (the part before the new key and the part that follow it). So, yes, insert in O(n), but the constant factor is pretty low.

Long time I did a comparison with rbtrees and If I recall correctly rbtrees started to win around 1,000 k-v pairs. May have changed a bit. In other words, used for representing compound data structures they are typically fine. They are not suitable to implement (for example) word-counting a document using for each word, update the count in the dict by one. The way to implement that is to make a flat list of words, use sort/4 (or msort/2) to sort while keeping duplicates and walk over the list to establish the counts. Next you can do dict_pairs/3 to create a dict for subsequent lookup if an on-stack data structure is desirable.

I didn’t consider the O(n) cost of copying the dict on each del_dict/4 and put_dict/4, thank you!
My actual per-atomic logic is quite expensive itself, so the traversal does not seem to be a significant bottleneck.
I’ve put together a quick benchmark that applies our actual transformation to the atomic elements and got a 2.8 percent speedup with the dict_pairs/2 approach and a 4.2 percent improvement with (=…)/2, both compared to my original get_dict/3, del_dict/4, put_dict/4 based implementation.

Usually in SWI-Prolog atom_concat/2 (or similarly atomic_list_concat/2) is a no go. Since atoms are expensive if they are dynamically created. SWI-Prolog has a string datatype for this purpose.

P.S.: Also seeing only a small improvement in small examples might be misleading. There is this famous story, which is also based on a linear primitive bottlenecks:

How I cut GTA Online loading times by 70%
https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times-by-70/