Upstreaming `ord_lookup`, `ord_range` and `rb_visit_range`

Hello all,

While working on my egraph library, I have written some helper predicate to use with ordsets and rbtrees.
I am wondering if these could be useful for the community, so I am exposing them here.
Let me know if one of them could be useful to you, in which case, I’ll do a pull request on swipl-devel to upstream them.

  1. ord_lookup((+Key)-(-Value), +SortedPairs): efficiently retrieve a Value given a Key from a sorted list of pairs
  2. ord_range(+Set, +Min, +Max, -Range): efficiently retrieve a range of elements from a set given a Min and Max bounds
  3. rb_visit_range(+Tree, +Min, +Max, -Range): retrieve a range of Pairs from an rbtree given a Min and Max bounds on keys.

Let’s start with ord_lookup/2:

%!  ord_lookup(+Pair, +SortedPairs) is semidet.
%
%   Retrieves a value from a sorted list of pairs using standard term comparison.
%   The search is unrolled for performance. Adapted from ord_memberchk/2.
%
%   @arg Pair A `Key-Value` pair where `Key` is the target key to find, and `Value` is unified with the associated value.
%   @arg SortedPairs A list of Key-Value pairs sorted by Key.

ord_lookup(Item-V, [X1-V1, X2-V2, X3-V3, X4-V4|Xs]) :-
   !,
   compare(R4, Item, X4),
   (   R4=(>)
   ->  ord_lookup(Item-V, Xs)
   ;   R4=(<)
   ->  compare(R2, Item, X2),
      (   R2=(>)
      ->  Item==X3, V = V3
      ;   R2=(<)
      ->  Item==X1, V = V1
      ;   V = V2
      )
   ;   V = V4
   ).
ord_lookup(Item-V, [X1-V1, X2-V2|Xs]) :-
   !,
   compare(R2, Item, X2),
   (   R2=(>)
   ->  ord_lookup(Item-V, Xs)
   ;   R2=(<)
   ->  Item==X1, V = V1
   ;   V = V2
   ).
ord_lookup(Item-V, [X1-V1]) :-
   Item==X1, V = V1.
?- numlist(0, 8, L), random_permutation(L, V), pairs_keys_values(P, L, V),
lookup(4-Value, P).
L = [0, 1, 2, 3, 4, 5, 6, 7, 8],
V = [5, 7, 1, 3, 2, 8, 0, 4, 6],
P = [0-5, 1-7, 2-1, 3-3, 4-2, 5-8, 6-0, 7-4, … - …],
Value = 2.

ord_range/4 follows the same unrolled search logic to skip smaller elements:

%!  ord_range(+Set, +Min, +Max, -Range) is det.
%
%   Retrieves a range of elements between `Min` and `Max` (inclusive)
%   from a set using standard term comparison.
%
%   @arg Set the ordset
%   @arg Min minimum bound
%   @arg Max maximum bound
%   @arg Range Sorted list of elements between `Min` and `Max`
ord_range([_X1, _X2, _X3, X4|Xs], Min, Max, Range) :-
   X4 @< Min, !,
   ord_range(Xs, Min, Max, Range).
ord_range([X|Xs], Min, Max, Range) :-
   X @< Min, !,
   ord_range(Xs, Min, Max, Range).
ord_range([X|Xs], _Min, Max, Range) :-
   ord_take([X|Xs], Max, Range).
ord_range([], _Min, _Max, []).

ord_take([X|Xs], Max, Range) :-
   (   X @> Max
   ->  Range = []
   ;   Range = [X|Rest],
       ord_take(Xs, Max, Rest)
   ).
ord_take([], _Max, []).
?- numlist(0, 8, L), ord_range(L, 4, 6, R).
L = [0, 1, 2, 3, 4, 5, 6, 7, 8],
R = [4, 5, 6].

And finally, the same thing, but with rbtrees as rb_visit_range/4:

%!  rb_visit_range(+Tree, +Min, +Max, -Range) is det.
%
%   Retrieves a range of pairs with keys between `Min` and `Max` (inclusive)
%   from a Tree using standard term comparison.
%
%   @arg Tree the rbtree
%   @arg Min minimum bound
%   @arg Max maximum bound
%   @arg Range Sorted list of pairs between `Min` and `Max`
rb_visit_range(t(_,T), Min, Max, Pairs) =>
   visit_range(T, Min, Max, [], Pairs).

visit_range(black('',_,_,_), _Min, _Max, L0, Lf) => 
   L0 = Lf.
visit_range(red(L,K,V,R), Min, Max, L0, Lf) =>
   (   K @< Min
   ->  visit_range(R, Min, Max, L0, Lf)
   ;   K @> Max
   ->  visit_range(L, Min, Max, L0, Lf)
   ;   visit_range(L, Min, Max, [K-V|L1], Lf),
       visit_range(R, Min, Max, L0, L1)
   ).
visit_range(black(L,K,V,R), Min, Max, L0, Lf) =>
   (   K @< Min
   ->  visit_range(R, Min, Max, L0, Lf)
   ;   K @> Max
   ->  visit_range(L, Min, Max, L0, Lf)
   ;   visit_range(L, Min, Max, [K-V|L1], Lf),
       visit_range(R, Min, Max, L0, L1)
   ).
?- numlist(0, 8, L), pairs_keys_values(P, L, L), ord_list_to_rbtree(P, T),
|    rb_visit_range(T, 4, 6, Range).
L = [0, 1, 2, 3, 4, 5, 6, 7, 8],
P = [0-0, 1-1, 2-2, 3-3, 4-4, 5-5, 6-6, 7-7, … - …],
T = t(...),
Range = [4-4, 5-5, 6-6].

It is a little unclear where that should go. library(ordsets) is not about pairs. library(pairs) could be better. Problem is that this library both deal with ordered and not-ordered pairs, so that problem remains :frowning:

Did you check for current practice on any of these? If that exists we must consider aligning as first option.

I couldn’t find anything in other prolog implementation yet.
I found this earlier discussion: Selecting a range in a dictionary
which shows that at least rb_visit_range might be more broadly useful than just myself.

N.B.: Sicstus prolog has a predicate called avl_range but the equivalent is rb_values

Yes, I’m confused too as to where ord_lookup should go.
I would slightly favor putting in ordsets as the predicate only advantage over member(K-V, L), K == MyKey is that the list of pairs is sorted, meaning we can unroll the search.