Does SWI-Prolog have N+K-trees?

Personal notes (click triangle to expand) ---

dyn_new/2 - creates a new dynamic array, Dyn, of Max size. Size must be a positive integer. Dyn is a composite structure of two levels. The first level is a term of arity msb(Max)+1, the second levels are a term of arity 2**N, e.g.

?- dyn_new(7,Dyn).
Dyn = dyn(_5538, _5540, _5542).

This only creates the first level, the second level is created using dyn_access/3 on an as needed basis. The size calculation: N = 7, msb(7) = 3.

One of the nice things about this is setting a really large value up front does not use a lot of space up front, e.g.

?- dyn_new(1000000,Dyn).
Dyn = dyn(_1776, _1778, _1780, _1782, _1784, _1786, _1788, _1790, _1792, _1794, _1796, _1798, _1800, _1802, _1804, _1806, _1808, _1810, _1812, _1814).

msb(1000000) = 19.


dyn_access/3 - When Value is bound and at location I there is a free variable, updates Value at index I for structure Dyn otherwise fails. When Value is free, retrieves Value at index I for structure Dyn.

Example of structure populated at indexes 1,3,6,9,12,15.

?- dyn_new(16,Dyn),dyn_access(1,Dyn,1),dyn_access(3,Dyn,3),dyn_access(6,Dyn,6),dyn_access(9,Dyn,9),dyn_access(12,Dyn,12),dyn_access(15,Dyn,15).
% Located at index 1 in 1
% Created sub array of length 1 for 1
% Located at index 2 in 2
% Created sub array of length 2 for 2
% Located at index 3 in 3
% Created sub array of length 4 for 3
% Located at index 2 in 4
% Created sub array of length 8 for 4
% Located at index 5 in 4
% Located at index 8 in 4
Dyn = dyn(s(1), s(_6694, 3), s(_6832, _6834, 6, _6838), s(_6974, 9, _6978, _6980, 12, _6984, _6986, 15), _6422).

Example or reading an existing value.

?- dyn_new(8,Dyn),dyn_access(1,Dyn,1),dyn_access(3,Dyn,3),dyn_access(6,Dyn,6),dyn_access(3,Dyn,Three).
% Located at index 1 in 1
% Created sub array of length 1 for 1
% Located at index 2 in 2
% Created sub array of length 2 for 2
% Located at index 3 in 3
% Created sub array of length 4 for 3
% Located at index 2 in 2
Dyn = dyn(s(1), s(_1938, 3), s(_2076, _2078, 6, _2082), _1666),
Three = 3.

Using dyn_access/3 with an index larger than Max returns false, e.g.

?- dyn_new(7,Dyn),dyn_access(8,Dyn,8).
% Located at index 1 in 4
false.

so while not all of the variables are allocated when the dynamic array is created, it is not unlimited by memory size but by the initial value.


The most import thing I learned by doing this is that when using arg/3 you don’t have to use state variables.

In other words you might expect dyn_access to be written as dyn_access(I, Value, Dyn0, Dyn) where Dyn0 is the state being passed in and Dyn is the new state being passed out. After understanding this and looking at the Prolog code for SWI-Prolog (libraries) I see how often arg/3 is used and why.


Here was a result that I didn’t expect but makes sense.

?- dyn_new(8,Dyn),dyn_access(1,Dyn,1),dyn_access(3,Dyn,3),dyn_access(6,Dyn,6),dyn_access(4,Dyn,Four).
% Located at index 1 in 1
% Created sub array of length 1 for 1
% Located at index 2 in 2
% Created sub array of length 2 for 2
% Located at index 3 in 3
% Created sub array of length 4 for 3
% Located at index 1 in 3
Dyn = dyn(s(1), s(_6576, 3), s(Four, _6716, 6, _6720), _6304).

Created a dynamic array of size 8, added values at index 1,3,6 and then read value at index 4, e.g. dyn_access(4,Dyn,Four). Was expecting the variable Four to be bound to the a system generated variable, but was not expecting Dyn to be updated with the name of the system generated variable, e.g. s(Four, _6716, 6, _6720). In a way it makes sense, because the unification of the the variable Four and the system generated variable is successful, and the printing routine has a choice of which variable to print and chooses the one with the user generated name over the system generated name. What I was expecting was something like Four = _6701.

Using write_canonical/ to write out the value of variable Four does work as expected.

?- dyn_new(8,Dyn),dyn_access(1,Dyn,1),dyn_access(3,Dyn,3),dyn_access(6,Dyn,6),dyn_access(4,Dyn,Four),write_canonical(Four).
% Located at index 1 in 1
% Created sub array of length 1 for 1
% Located at index 2 in 2
% Created sub array of length 2 for 2
% Located at index 3 in 3
% Created sub array of length 4 for 3
% Located at index 1 in 3
_
Dyn = dyn(s(1), s(_2008, 3), s(Four, _2148, 6, _2152), _1736).

While arg/3 can set a free/unbound value, it can not set a bound value. To set a bound values requires the use of setarg/3. See: Non-logical operations on terms

Example:

  1. This demonstrates using functor/3 to create a term with name and arity. Note the term has free variables.
?- functor(S,s,5).
S = s(_6508, _6510, _6512, _6514, _6516).
  1. This demonstrates using arg/3 to bind a value to a variable for for one of the free variables.
?- functor(S,s,5),arg(3,S,a).
S = s(_6616, _6618, a, _6622, _6624).
  1. This shows that using arg/3 on a bound variable will not succeed (fail).
?- functor(S,s,5),arg(3,S,a),arg(3,S,b).
false.
  1. This shows that using setarg/3 on a bond variable will succeed.
?- functor(S,s,5),arg(3,S,a),setarg(3,S,b).
S = s(_6730, _6732, b, _6736, _6738).

When using must_be/2 with compound terms containing free variables, many of the types will fail with exceptions when using must_be/2 because of the free variables, even though the compound term with free variables is a valid term.

Note for my case: When an invalid term is used an exception is desired as opposed to failing or silently failing because during the development of the code any invalid term should stop the processing immediately so that the exact location of the problem is identified, e.g. don’t have to step through the code or look at a stack trace. Also when the creation of a predicate is done in step with the creation of the test cases, having the exception right away leads to better bug identification and better test cases as the test cases with exceptions are also created. Once the code is fully developed and tested, then a second version of the code can be created that removes most to all of the must_be/2 calls.

In writing deterministic code, one often thinks in terms of just input parameters with arguments (bound values) and output parameters (free values/variables). The input arguments are typically checked with must_be(ground,Input_1) and the output arguments checked with must_be(var,Output_1)

However these checks fail when using compound terms created that have one or more free variables.

Example

  1. A positive result of type ground when used with a compound term that has no free variable.
?- functor(S,s,2),arg(1,S,a),arg(2,S,b),must_be(ground,S).
S = s(a, b).
  1. An exception occurs if one of the variables is free. However this should be considered valid input even though it contains free variables.
?- functor(S,s,2),arg(1,S,a),arg(2,S,b),must_be(ground,S).
?- functor(S,s,2),arg(1,S,a),must_be(ground,S).
ERROR: Arguments are not sufficiently instantiated
ERROR: In:
ERROR:   [14] throw(error(instantiation_error,_9156))
ERROR:    [9] <user>
ERROR: 
ERROR: Note: some frames are missing due to last-call optimization.
ERROR: Re-run your program in debug mode (:- debug.) to get more detail.
  1. A positive result of type compound when used with a compound term that has free variable(s).
?- functor(S,s,2),arg(1,S,a),must_be(compound,S).
S = s(a, _11154).
  1. A positive result of type nonvar when uses with a compound term that has free variable(s).
?- functor(S,s,2),arg(1,S,a),must_be(nonvar,S).
S = s(a, _1680).

Looking beyond using must_be/2 for argument validation, these work for a single level compound term,

?- =(s(_,_),s(a,_)).
true.

?- subsumes_term(s(_,_),s(a,_)).
true.

and this works for a multi-level compound term

?- =(s(_,s(a,_)),s(_,_)).
true.

However here is a problem I still have.

The code was working correctly until a compound term B was added that used free variables to an existing compound term A that did not use free variables, in other words, the code worked with must_be(ground,A) but now that A contains B, must_be(ground,A) no longer works and the test has to be relaxed to must_be(nonvar,A) which is not desired.

1 Like