Is_ordset/1 with atoms and variables

EricGT · May 20, 2022, 9:22am

When using is_ordset/1 with atoms and variables, the variables need to come before the atoms.

?- is_ordset([a,A]).
false.

fails because the variable is in the list after the atom.

However,

?- is_ordset([A,a]).
true.

succeeds because the variable is in the list before the atom.

Another example.

?- L = [_,_,A,B,a,b],is_ordset(L),write_term(L,[]).
[_722,_728,_734,_740,a,b]
L = [_, _, A, B, a, b].
true.

peter.ludemann · May 20, 2022, 6:55pm

In general, it’s a bad idea to have variables in terms in an ordset because they can change their comparison when they become partially or fully instantiated. Taking your example, but modifying it slightly:

?- L = [_,_,A,B,a,b], A=B, is_ordset(L),write_term(L,[]).
false.

?- L = [_,_,A,B,a,b], A=_, is_ordset(L),write_term(L,[]).
[_8038,_8044,_8050,_8056,a,b]
L = [_, _, A, B, a, b].

jan · May 21, 2022, 7:40am

I’d not go that far. For many computations about term sets ordered sets are a good solution. For example, below is a perfectly sensible way to find out which variables are shared between T1 and T2. Vars is also an ordered set, but one should simply consider it a list unless it is certain nothing touches the variables.

shared_variables(T1, T2, Vars) :-
    term_variables(T1, V1), sort(V1, SV1),
    term_variables(T2, V2), sort(V2, SV2),
    ord_intersection(SV1, SV2, Vars).

For short, ordered sets, library(assoc) and library(rbtress), etc. are perfect with variables as long as one is certain the set of elements/keys is not instantiated during the relevant part of the computation.

Note that attributed variables can be a good alternative for reasoning about variables.

peter.ludemann · May 21, 2022, 4:59pm

But - is this ever needed? Many (most?) programming languages insist that the key for a lookup be immutable, and that doesn’t seem to bother anyone.

As for shared_variables/3, this does the job without any variables to ord_intersection/3 (but the code feels rather clunky), using numbervars/1 to replace all the variables with '$VAR'(_) terms and then looking up the intersection '$VAR'(_) terms, to get the original variables:

shared_variables(T1, T2, Vars) :-
    term_variables(T1-T2, T1T2Vars),
    term_variables(T1, V1),
    term_variables(T2, V2),
    copy_term(V1-V2, CV1-CV2),
    numbervars(CV1-CV2),
    sort(CV1, SCV1),
    sort(CV2, SCV2),
    ord_intersection(SCV1, SCV2, Vars0),
    maplist(lookup_var(T1T2Vars), Vars0, Vars).

lookup_var(T1T2Vars, VarN, Var) :-
    var_number(VarN, N),
    nth0(N, T1T2Vars, Var).

kuniaki.mukai · May 21, 2022, 7:09pm

Being interested in the peter.ludemann’s code, which looks persuasive to me, I have compared speed with the Jan’s code, which I am used to be more familiar with. Unfortunately, a result is that the Jan’s code is about 30 time faster, which is more than I expected.

% ?- length(L, 1000), time(repeat(10000, shared_variables(L, L, Vs))).
%@ % 918,490,001 inferences, 70.236 CPU in 70.341 seconds (100% CPU, 13077220 Lips)
%@ L = [_, _, _, _, _, _, _, _, _|...].

% ?- length(L, 1000), time(repeat(10000, jw_shared_variables(L, L, Vs))).
%@ % 40,090,001 inferences, 2.177 CPU in 2.179 seconds (100% CPU, 18413813 Lips)
%@ L = [_, _, _, _, _, _, _, _, _|...].

peter.ludemann · May 21, 2022, 8:46pm

That doesn’t surprise me – there’s probably a lot of overhead from copy_term/2, numbervars/1, maplist(lookup_var).

For meta-level manipulation of terms, there are two ways:

var/1, atom/1, number/1, and similar to detect variables vs terms.
transform the terms into a non-“defaulty” form (e.g., transform X=a into term('=', [var('X'), atom(a)]).

For option 2, there’s a one-time overhead (unless there’s a read_term/2 option that returns the non-“defaulty” form), but after that, the cost of the various operations should be very similar – that is testing for Term=var(_) would be similar to the cost of var(Term). The non-“defaulty” form has the advantage of being able to keep extra information information (e.g., variable name, source location) that otherwise would need separate parallel book-keeping (similar to read-term/2 subterm_positions(_)).

For meta-level programming, such as term expansion, the use of var/1 is enticing; but I suspect that a non-“defaulty” representation would be more robust and easier to use in the long term.

kuniaki.mukai · May 22, 2022, 6:21am

Thank you for explaining your deep background thought in your codes.
As I have poor knowledge about such related meta-level programming technology,
I have nothing to say here. BTW, I learned var_number/2 first from your codes, which I wanted for a long time. Thanks again.

Topic		Replies	Views
How are Queries submitted to Pengine in SWISH Discussion	19	1601	November 19, 2020
Ordering residual goals in query answers Help! discussion	10	682	June 28, 2020
On sorting non-ground terms and the standard order of terms General	26	1774	January 2, 2020
Careful with ordsets Help!	4	317	August 18, 2022
Read_term_from_atom requires the option variable_names to be sorted in order of use General bug	1	103	May 1, 2024

Is_ordset/1 with atoms and variables

Related topics