Need for data selection and data refinement best practices

EricGT · July 31, 2022, 11:27am

In another post Jan W. noted

I don’t really like the findall/setof/… solutions for two reasons. First of all, if something unexpectedly fails you typically lose part of the answer without any feedback and second because it involves copying. Copying ground terms is fine (just costs memory and time), but for non-ground terms it already gets harder and for terms with constraints it quickly leads to incorrect results. I’d use this instead:

group_by_arg3(Terms, Grouped) :-
    map_list_to_pairs(arg(3), Terms, Keyed),
    keysort(Keyed, Sorted),
    group_pairs_by_key(Sorted, KeyGrouped),
    pairs_values(KeyGrouped, Grouped).

See SWISH – SWI-Prolog for SHaring
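A small usage sketch of the predicate above. The `row/3` sample data and the helper names `sample_rows/1` and `shares_variable/0` are made up for illustration; the definition is repeated only so the snippet loads on its own. The second helper checks that variables in the input are still the same variables in the output, which is the point of avoiding a copy.

```prolog
:- use_module(library(pairs)).

% Definition from the post, repeated so this file loads by itself.
group_by_arg3(Terms, Grouped) :-
    map_list_to_pairs(arg(3), Terms, Keyed),
    keysort(Keyed, Sorted),
    group_pairs_by_key(Sorted, KeyGrouped),
    pairs_values(KeyGrouped, Grouped).

% Hypothetical sample data: row(Name, Quantity, Category).
sample_rows([row(apple, 3, fruit), row(leek, 1, veg), row(pear, 5, fruit)]).

% ?- sample_rows(Rows), group_by_arg3(Rows, G).
% G = [[row(apple,3,fruit), row(pear,5,fruit)], [row(leek,1,veg)]].

% Nothing is copied, so X in the input is still X in the grouped output.
shares_variable :-
    group_by_arg3([row(X, 1, k), row(X, 2, k)],
                  [[row(A, _, _), row(B, _, _)]]),
    A == X,
    B == X.
```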

That is the solution using library predicates. Alternatively, use sort/4 to sort the list on the third argument and do the splitting by hand. That is faster, but requires more writing.
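One way the sort/4 alternative might look is sketched below. This is my reading of "do the splitting by hand", not code from the quoted post; the names `group_by_arg3_sorted/2`, `split_runs/2` and `take_same/4` are invented here. Keys are compared with `==`, which matches what group_pairs_by_key/2 does.

```prolog
% Sort on argument 3, keeping duplicates (@=<), then cut the sorted
% list into runs of terms whose third argument is identical.
group_by_arg3_sorted(Terms, Groups) :-
    sort(3, @=<, Terms, Sorted),
    split_runs(Sorted, Groups).

split_runs([], []).
split_runs([T|Ts], [[T|Same]|Groups]) :-
    arg(3, T, Key),
    take_same(Ts, Key, Same, Rest),
    split_runs(Rest, Groups).

% take_same(+List, +Key, -Same, -Rest): Same is the longest prefix of
% List whose elements have Key as third argument; Rest is what remains.
take_same([T|Ts], Key, [T|Same], Rest) :-
    arg(3, T, K),
    K == Key,
    !,
    take_same(Ts, Key, Same, Rest).
take_same(Rest, _, [], Rest).

% ?- group_by_arg3_sorted([f(a,1,x), f(b,2,y), f(c,3,x)], G).
% G = [[f(a,1,x), f(c,3,x)], [f(b,2,y)]].
```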

For findall/setof/…, is the complete list the set of predicates listed at Finding all Solutions to a Goal?

findall/3
findall/4
findnsols/4
findnsols/5
bagof/3
setof/3

If something unexpectedly fails you typically lose part of the answer without any feedback.

I have no problem with that statement. I am noting it explicitly so that others don’t think it is unimportant just because I did not list it; it is important. (Others should ask questions if needed.)
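To make the "lose part of the answer" point concrete, here is a minimal sketch. The helper `double/2` is hypothetical and deliberately fails on non-integers; the two wrappers only differ in how they collect results.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% Hypothetical helper: defined for integers only, fails on anything else.
double(X, Y) :-
    integer(X),
    Y is X * 2.

% findall/3 collects whatever succeeded and says nothing about the rest.
doubles_findall(Xs, Ys) :-
    findall(Y, (member(X, Xs), double(X, Y)), Ys).

% maplist/3 fails as a whole when one element fails.
doubles_maplist(Xs, Ys) :-
    maplist(double, Xs, Ys).

% ?- doubles_findall([1, 2, oops, 4], Ys).
% Ys = [2, 4, 8].
% ?- doubles_maplist([1, 2, oops, 4], Ys).
% false.
```

With findall/3 the bad element simply vanishes from the result; with maplist/3 the failure is visible to the caller.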
Copying ground terms is fine (just costs memory and time)

Is there a way to identify if a copy is happening with a predicate?

I am thinking of something similar to the problem of knowing if a predicate leaves a choice point and how to find the predicate leaving the choice point, e.g.
  a. To check if a predicate is deterministic (leaves no choice point) there is det/1
  b. To see the predicate leaving the choice point
    1. When at a top level prompt use * (ref)
    2. When debugging use gtrace/0 (ref)
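A minimal sketch of option (a), assuming a SWI-Prolog version recent enough to provide det/1. The predicates `first_two/3`, `pick/2` and `det_violation/1` are made up for the example, and the shape of the error term (`determinism_error/4`) is taken from my reading of the manual, so treat it as an assumption.

```prolog
:- use_module(library(lists)).

% Declared det and really deterministic on a proper list.
:- det(first_two/3).
first_two([A, B|_], A, B).

% Declared det, but member/2 leaves a choice point on longer lists.
:- det(pick/2).
pick(Xs, X) :-
    member(X, Xs).

% Succeeds if running Goal raises a determinism error.
det_violation(Goal) :-
    catch(( call(Goal), fail ),
          error(determinism_error(_, _, _, _), _),
          true).

% ?- det_violation(pick([1, 2], _)).        % expected to succeed
% ?- det_violation(first_two([1, 2], _, _)). % expected to fail
```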

The impetus for these questions is that many programmers are familiar with SQL and, for this case specifically, the SQL select statement. When learning to select and refine data, many programmers will naturally try to leverage their knowledge of SQL in Prolog, and thus reach for findall/setof…, sort/2, Aggregation operators on backtrackable predicates, etc. However, as you note, Prolog is different, especially when terms are not ground.

For those seeking a more inclusive list of Prolog items related to SQL see this.

For the longest time I had to guess at which predicates to use when printing and then you noted

Best practices for printing

Best practices are needed for

Data selection.
Data refinement.

jan · July 31, 2022, 12:15pm

Not really. Guess it should be documented. It isn’t too hard to give the general rules though.

Explicit copy_term/2 (doesn’t copy ground (sub)terms), duplicate_term/2 (also copies ground (sub)terms).
Extern/intern pairs (write → read, assert → clause, record → recorded, etc.)
Preserving terms over backtracking. That covers the all solutions predicates (findall/3, etc.) as well as nb_setarg/3, nb_setval/2, etc.
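The rules above can be checked by hand with `=@=` (same shape) and `\==` (not the same term). The sketch below does that for one case from each of the three groups; `fresh_copy/2`, `copy_demo/0` and the dynamic predicate `stored/1` are names invented for this example.

```prolog
:- dynamic stored/1.

% Copy has the same shape as Orig but does not share its variables.
% Only meaningful when Orig is non-ground.
fresh_copy(Orig, Copy) :-
    Copy =@= Orig,
    Copy \== Orig.

copy_demo :-
    T = point(X, Y, label),
    % 1. Explicit copy.
    copy_term(T, C1),
    fresh_copy(T, C1),
    % 2. Extern/intern pair: assert, then read the clause back.
    retractall(stored(_)),
    assertz(stored(T)),
    stored(C2),
    fresh_copy(T, C2),
    % 3. All-solutions predicate.
    findall(T, true, [C3]),
    fresh_copy(T, C3),
    % Binding the original leaves every copy untouched.
    X = 1,
    Y = 2,
    C1 = point(A, B, label),
    var(A),
    var(B).
```

If `copy_demo` succeeds, each of the three routes produced a term whose variables are no longer connected to the original, which is where non-ground data starts to behave differently from what an SQL mindset expects.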

That covers (I think) most of it. You also get (partial) copies in rules of this type:

 p(a(X)) :- q(a(X)).

which creates a copy of the a/1 compound (but not of X). This can be avoided using the construct below. Some systems make such transformations automatically. There is no semantic impact of such a transformation except when using setarg/3 and friends.

 p(A) :- A = a(X), q(A).
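The setarg/3 exception can be sketched as follows, writing both forms as ordinary `:-` rules. The predicate names `p_copy/1`, `p_share/1` and the destructive callee `q/1` are hypothetical, and the behaviour shown is what I would expect from SWI-Prolog given the description above; other systems may differ.

```prolog
% Hypothetical callee that destructively changes its argument.
q(T) :-
    setarg(1, T, changed).

% The body builds a new a/1 term, so q/1 modifies that new term.
p_copy(a(X)) :-
    q(a(X)).

% The body passes on the caller's own term, so q/1 modifies it.
p_share(A) :-
    A = a(_),
    q(A).

% ?- T = a(1), p_copy(T).
% T = a(1).
% ?- T = a(1), p_share(T).
% T = a(changed).
```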
