Dictionaries from foreign code - questions

I’ve been working on implementing dictionary support into swipl-rs. I really like dictionaries in SWI-Prolog, and we use them extensively in TerminusDB. However, it looks like the foreign function interface to dictionaries is somewhat restricted compared to what you can do from prolog itself. These are the differences I spotted.

  • Dicts can have both atoms and small integers as keys. However, the ffi only supports atoms, both in construction through PL_put_dict() and in querying through PL_get_dict_key().
  • A dict tag can be an arbitrary term (though it’s definitely ‘normal’ to have it be an atom). From the ffi, it can only be an atom.
  • There is no destructive assignment (b_set_dict and nb_set_dict in prolog). While a backtracking version would be tricky from the ffi, there is no reason why the non-backtracking version couldn’t exist.
  • There is no way to retrieve the tag of a dict.
  • There is no way to iterate over all key-value pairs of a dict.

Regarding the last 3 points, I can probably implement that myself easily by just destructuring the dict term. While that would mean relying on the implementation details of dicts, I’m probably able to spot any upcoming changes to the implementation before they hit stable. Similarly, unifying the tag with an arbitrary term could also be done by destructuring a dict, after creating it with a var as a tag.

Supporting small integers as keys is more tricky though. The way things are now, that would require me to manually build the dict term, which means I have to sort the keys in the same way SWI-Prolog does it.
I understand that key-value pairs are sorted by key in the dict term, but they are not sorted as they would if you were to sort in prolog. Instead, they are sorted by some internal representation, which from experiments seems dependent on the underlying atom_t value. However, I don’t really understand how this sort works when small integers are involved too. For any atom key, I’m able to devise, through trial and error, small integer keys that will appear before it, and small integer keys that will appear after it. So it would appear that the internal representation used by dictionary sorting has mapped both atoms and small integers into the same range. How exactly is this done, and is this something I could rely on myself to construct my own dict terms?

Its all not that bad :slight_smile: Dealing with integer keys means exposing the consInt() macro (not directly of course), the result of which you can use for the atom_t keys. Of course, this needs introducing some types and function abstractions but technically it is trivial. The dirty way is to expose functions that creates and extracts small integers with the atom_t type. Otherwise we’d need a key_t type and have some macro (?) that casts an atom_t to/from a key_t. In fact, all these are uintptr_t, so it doesn’t matter.

Getting the tag is just PL_get_arg(1, Dict, Tag). Of course that also needs abstracting as e.g., PL_get_dict_tag(Dict, Tag) such that we do not get into trouble should the representation change.

Internally there is an iterator, PL_for_dict(). This may needs some reviewing before exposing.

b_set_dict and nb_set_dict could be supported by first exposing setarg/3 and nb_setarg/3 and then layering these on top of them. There are internally functions that figure out how a key translates to the nth argument. I guess there are good use cases for exposing these functions.

I’d stay away from the dict key ordering. As said above, atom_t and small integers are both uintptr_t and cannot overlap. So, we just sort on the integral value. That is what makes the dict operations so fast

My guess is that with these hints you can get it all done :slight_smile:

2 Likes

Thanks! yeah, consInt was the hint I needed. I can now construct dictionaries with small ints as keys, without having to mess with direct construction and key ordering myself.

One last minor thing - the documentation for PL_get_dict_key() says it will throw an exception when given a term that is not a dict. This is not what I see. It seems to just fail when I give it a number instead. Are there conditions where it will throw an exception?

From the code it seems that PL_get_dict_key() instead does not raise an exception. Fixed the docs.

1 Like