Del_dict/4 behaviour and dict iteration

oskardrums · July 3, 2021, 11:23am

Hey all,
I want to iterate over a possibly nested dict and apply some transformation on it’s atomic members, so I tried something along these lines:

transformed(D0, D) :-
    del_dict(Key, D0, Value0, D1),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D1, D2),
    put_dict(Key, D2, Value, D).

But, this approach doesn’t work as del_dict/4 throws an instantiation error:

Welcome to SWI-Prolog (threaded, 64 bits, version 8.3.26)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- D0 = dict{a:b}, del_dict(A, D0, B, D).
ERROR: Arguments are not sufficiently instantiated
ERROR: In:
ERROR:   [11] del_dict(_26520,dict{a:b},_26524,_26526)
ERROR:   [10] '<meta-call>'(user:user: ...) <foreign>
ERROR:    [9] toplevel_call(user:user: ...) at /opt/local/lib/swipl/boot/toplevel.pl:1115
?-

Is this the behavior of del_dict/4 intended?
Compare to get_dict/3, which doesn’t throw and instead unifies A with some key:

?- D = dict{a:b}, get_dict(A, D, B).
D = dict{a:b},
A = a,
B = b.

What would be the recommended way to perform such a dict transformation?

Thanks.

jan · July 3, 2021, 12:02pm

There is no need to delete first, is there? So you can use get_dict/3 and put_dict/3 together. This suggests however that you want to use some failure driven loop. That is not going to work. Dicts are subject to the normal Prolog backtracking. You can use nb_set_dict/3 for non-backtrackble destructive assignment. Not sure I’d recommend that

Quite often dict transformation use dict_pairs/3 and than manipulate the pair list to call dict_pairs/3 again and create a new dict.

oskardrums · July 3, 2021, 12:34pm

Thank you for the quick response.

I’ve neglected in the above snippet the termination condition, should’ve been:

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    ...

So my thinking was that deletion is needed in order to get to the termination condition, otherwise I get an infinite loop:

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    get_dict(Key, D0, Value0),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D0, D2),
    put_dict(Key, D2, Value, D).

?- transformed(_{a:b, c:_{d:e}}, T).
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 0.7Gb, global: 0.2Gb, trail: 39.3Mb
ERROR:   Stack depth: 2,572,409, last-call: 0%, Choice points: 2,572,402
ERROR:   Probable infinite recursion (cycle):
ERROR:     [2,572,408] transformed(<compound C'dict'/5>, _66885952)
ERROR:     [2,572,407] transformed(<compound C'dict'/5>, _66885978)

It works if I add the deletion:

foo(X, Y) :-
    atomic_list_concat(['bar', X, 'baz'], '/', Y).

transformed(Tag{}, Tag{}).
transformed(D0, D) :-
    get_dict(Key, D0, Value0),
    del_dict(Key, D0, _, D1),
    ( atomic(Value0) -> foo(Value0, Value)
    ; transformed(Value0, Value)
    ),
    transformed(D1, D2),
    put_dict(Key, D2, Value, D).

?- transformed(_{a:b, c:_{d:e}}, T).
T = _3574{a:'bar/b/baz', c:_3560{d:'bar/e/baz'}} .

But that get_dict, del_dict dance is not the peak of elegance.

So I guess dict_pairs/3 it is
Thanks @jan

jan · July 3, 2021, 3:54pm

True. The docs make is explicit not to make claims about the internal structure of dicts though. For some deep-down reason the keys and values have swapped position. Other changes may follow (there are no concrete plans).

jan · July 3, 2021, 4:30pm

The search for the insert location is O(log(N)). Next you have to make a new dict and copy the data. Copying the data are two simple memcpy() calls (the part before the new key and the part that follow it). So, yes, insert in O(n), but the constant factor is pretty low.

Long time I did a comparison with rbtrees and If I recall correctly rbtrees started to win around 1,000 k-v pairs. May have changed a bit. In other words, used for representing compound data structures they are typically fine. They are not suitable to implement (for example) word-counting a document using for each word, update the count in the dict by one. The way to implement that is to make a flat list of words, use sort/4 (or msort/2) to sort while keeping duplicates and walk over the list to establish the counts. Next you can do dict_pairs/3 to create a dict for subsequent lookup if an on-stack data structure is desirable.

oskardrums · July 3, 2021, 10:22pm

I didn’t consider the O(n) cost of copying the dict on each del_dict/4 and put_dict/4, thank you!
My actual per-atomic logic is quite expensive itself, so the traversal does not seem to be a significant bottleneck.
I’ve put together a quick benchmark that applies our actual transformation to the atomic elements and got a 2.8 percent speedup with the dict_pairs/2 approach and a 4.2 percent improvement with (=…)/2, both compared to my original get_dict/3, del_dict/4, put_dict/4 based implementation.

Topic		Replies	Views
Surprising Transformation for Dictionaries and Yall Help!	3	417	March 12, 2020
Question about dictionaries and their nuances Help!	10	2302	August 8, 2019
Dictionaries from foreign code - questions Help! ffi	3	597	January 22, 2022
Dicts in SWI-Prolog Data Structure	2	548	May 27, 2021
Strange dict error General	13	381	May 11, 2023

Del_dict/4 behaviour and dict iteration

Related topics