Need for data selection and data refinement best practices

In another post Jan W. noted

I don’t really like the findall/setof/… solutions for two reasons. First of all, if something unexpectedly fails you typically lose part of the answer without any feedback and second because it involves copying. Copying ground terms is fine (just costs memory and time), but for non-ground terms it already gets harder and for terms with constraints it quickly leads to incorrect results. I’d use this instead:

group_by_arg3(Terms, Grouped) :-
    map_list_to_pairs(arg(3), Terms, Keyed),
    keysort(Keyed, Sorted),
    group_pairs_by_key(Sorted, KeyGrouped),
    pairs_values(KeyGrouped, Grouped).

See SWISH – SWI-Prolog for SHaring

That is the solution using library predicates. Alternatively, use sort/4 to sort the list on the third argument and do the splitting by hand. That is faster, but requires more writing.
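For anyone wondering what "splitting by hand" might look like, here is a minimal sketch assuming SWI-Prolog's sort/4. The predicate names group_by_arg3_sorted/2, split_groups/2 and take_same_key/4 are made up for illustration and are not from the quoted post.

```prolog
% group_by_arg3_sorted(+Terms, -Groups)
% Sort on the 3rd argument, keeping duplicates (@=<), then split the
% sorted list into runs whose 3rd arguments are identical (==).
group_by_arg3_sorted(Terms, Groups) :-
    sort(3, @=<, Terms, Sorted),
    split_groups(Sorted, Groups).

split_groups([], []).
split_groups([T|Ts], [[T|Same]|Groups]) :-
    arg(3, T, Key),
    take_same_key(Ts, Key, Same, Rest),
    split_groups(Rest, Groups).

% take_same_key(+Terms, +Key, -Same, -Rest)
% Same is the leading run of Terms whose 3rd argument is Key.
take_same_key([T|Ts], Key, [T|Same], Rest) :-
    arg(3, T, K),
    K == Key,
    !,
    take_same_key(Ts, Key, Same, Rest).
take_same_key(Rest, _, [], Rest).
```

With this sketch, the query `group_by_arg3_sorted([f(a,1,x), f(b,2,y), f(c,3,x)], G)` gives `G = [[f(a,1,x), f(c,3,x)], [f(b,2,y)]]`. No findall/3 is involved, so the terms in the groups are the original terms and not copies.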

  1. For findall/setof/…, is the complete list the set of predicates listed at Finding all Solutions to a Goal?
  • findall/3
  • findall/4
  • findnsols/4
  • findnsols/5
  • bagof/3
  • setof/3
  2. If something unexpectedly fails you typically lose part of the answer without any feedback.

    I have no problem with that statement. I am noting it explicitly so that others don’t assume it is unimportant just because I did not list it; it is important. (Others should ask questions if needed.)

  3. Copying ground terms is fine (just costs memory and time)

    Is there a way to identify if a copy is happening with a predicate?

    I am thinking of something similar to the problem of knowing if a predicate leaves a choice point and how to find the predicate leaving the choice point, e.g.
      a. To check if a predicate is deterministic (leaves no choice point) there is det/1
      b. To see the predicate leaving the choice point
        1. When at a top level prompt use * (ref)
        2. When debugging use gtrace/0 (ref)
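For checking this from code rather than interactively, a commonly used idiom builds on call_cleanup/2. Note this is a different technique from the three listed above, and call_det/2 is a helper name chosen here, not a built-in. A minimal sketch:

```prolog
% call_det(:Goal, -Det)
% Call Goal; Det is true if Goal exited without leaving a choice
% point and false otherwise. The cleanup handler runs as soon as
% Goal is finished for good, so Det0 is already bound on exit
% exactly when no choice point is left.
call_det(Goal, Det) :-
    call_cleanup(Goal, Det0 = true),
    (   var(Det0)
    ->  Det = false
    ;   Det = true
    ).
```

For example, `call_det(memberchk(1, [1,2]), D)` gives `D = true`, while the first answer of `call_det(member(X, [1,2]), D)` is `X = 1, D = false` because member/2 can still produce `X = 2`.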

The impetus for these questions is that many programmers are familiar with SQL, and for this case specifically the SQL SELECT statement. When learning to select and refine data in Prolog, many programmers will naturally try to leverage their knowledge of SQL, and thus reach for findall/setof/…, sort/2, Aggregation operators on backtrackable predicates, etc. However, as you note, Prolog is different, especially when terms are not ground.
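To make the SQL analogy concrete, here is a small sketch of the "lose part of the answer without any feedback" problem. The price/2 and stock/2 facts and the predicate names are invented for this example.

```prolog
% Hypothetical tables.
price(apple, 3).
price(pear,  4).
price(plum,  5).

stock(apple, 10).
stock(plum,  7).          % note: no stock/2 fact for pear

% SQL-style select with findall/3: behaves like an inner join.
% The pear row fails inside the goal and is dropped silently.
stock_value(Pairs) :-
    findall(Name-Value,
            ( price(Name, P),
              stock(Name, N),
              Value is P * N ),
            Pairs).

% Alternative without an all-solutions predicate: the whole call
% fails if any requested name cannot be computed.
stock_value_strict(Names, Pairs) :-
    maplist(name_value, Names, Pairs).

name_value(Name, Name-Value) :-
    price(Name, P),
    stock(Name, N),
    Value is P * N.
```

Here `stock_value(Ps)` succeeds with `Ps = [apple-30, plum-35]` and gives no hint that pear is missing, whereas `stock_value_strict([apple, pear, plum], Ps)` fails, so the problem is at least noticed. Whether failing is the right reaction depends on the application; this is only meant to show the difference in behaviour.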

For those seeking a more inclusive list of Prolog items related to SQL see this.

For the longest time I had to guess at which predicates to use when printing and then you noted

Best practices for printing

Best practices are needed for

  1. Data selection.
  2. Data refinement.


Not really. Guess it should be documented. It isn’t too hard to give the general rules though.

  • Explicit copy_term/2 (doesn’t copy ground (sub)terms), duplicate_term/2 (also copies ground (sub)terms).
  • Extern/intern pairs (write → read, assert → clause, record → recorded, etc.)
  • Preserving terms over backtracking. That covers the all solutions predicates (findall/3, etc.) as well as nb_setarg/3, nb_setval/2, etc.
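The three cases above can be observed with a small experiment. This is only a sketch: copy_demo/0 and the scratch predicate stored/1 are names made up here. It uses =@=/2 (variant check) and \==/2 to show that each result has the same shape as the original but does not share its variable.

```prolog
:- dynamic stored/1.          % hypothetical scratch predicate

copy_demo :-
    T = point(X, 1),
    % 1. Explicit copy: same shape, fresh variable.
    copy_term(T, C),
    C =@= T, C \== T,
    % 2. All-solutions predicates copy each answer.
    findall(T, true, [F]),
    F =@= T, F \== T,
    % 3. Extern/intern pair: assert stores a copy and calling the
    %    clause produces a fresh instance.
    assertz(stored(T)),
    stored(S),
    retractall(stored(_)),
    S =@= T, S \== T,
    % Binding the original variable does not affect the copy.
    X = 0,
    C = point(CX, _),
    var(CX).
```

The goal `copy_demo` succeeds, which confirms that after each of these operations the link to the original variable X is gone. This is harmless for ground terms and is where the trouble starts for non-ground terms and constraints.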

That covers (I think) most of it. You also get (partial) copies in rules of this type:

 p(a(X)) :- q(a(X)).

which creates a copy of the a/1 compound (but not of X). This can be avoided using the construct below. Some systems make such transformations automatically. There is no semantic impact of such a transformation except when using setarg/3 and friends.

 p(A) :- A = a(X), q(A).
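A sketch of how the setarg/3 difference could be observed, written as ordinary `:-` clauses. The names p_copy/1, p_share/1 and q_mark/1 are invented for this example, and the stated results assume a system such as SWI-Prolog that does not apply the transformation automatically.

```prolog
% q_mark(+Term): destructively replace the first argument.
q_mark(T) :-
    setarg(1, T, changed).

% The body builds a new a/1 compound, so q_mark/1 modifies the copy.
p_copy(a(X)) :-
    q_mark(a(X)).

% The body passes on the caller's own compound.
p_share(A) :-
    A = a(_),
    q_mark(A).
```

Under that assumption, `T = a(_), p_copy(T)` leaves T as `a(_)`, while `T = a(_), p_share(T)` yields `T = a(changed)`.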