Intersection/3 from library(lists) is incorrect

intersection/3 can be proven incorrect like so:

?- lists:intersection([1,2,3],X,[1]).
false.

?- lists:intersection([1,2,3],[1,4,5],[1]).
true.

Not sure if this is known but I thought I should bring it up.

2 Likes

I would not agree with that.

If you read the documentation it notes

True if Set3 unifies with the intersection of Set1 and Set2.

Which I take to mean that Set1 and Set2 should be bound.

Granted it be nice if the predicate had a mode declaration and some checks, but I have no problem with how it is written. As it exist you can always add a wrapper predicate that does the checks and throws an exception or uses subtract/3. Note: That was off the top of my head, but I have no reason to believe it would not work.

HTH

Is this a matter of opinion? Even though the mode declaration certainly does specify that args 1 and 2 should be bound, I still feel that a predicate that gives incorrect answers is incorrect.

I have been thinking a lot about modes lately and what they really mean. I have been struggling with this in my own code and made a StackOverflow post a couple days ago on this exact topic.

I have no problem with it being noted as my opinion. :slightly_smiling_face:

I would have to agree with that. Seems you might have found a case worthy of analysis or reflection. I don’t know if it could be simply a change to the documentation, having a new predicate that does what you seek by using both intersection or subtraction depending on mode but I have no idea what an intuitive name would be, or does it need further analysis.

My take for now is that I am curious to hear from Jan W. as he sometimes sees things that elude me.

Ah yes, I gave the first comment as GuyCoder.

With more detail in regards to your question at StackOverflow, my personal opinion is that Prolog is a modeling language used to create a model to solve a problem. In you case you used Prolog to solve a problem without creating a model then using the model to solve the problem. Not an answer many will like but that IMHO this is how I look at many things in Prolog.

HTH

1 Like

It is common practice in Prolog land to specify the instantiation and if you call it with another instantiation anything may happen (read, wrong result or exception). The intersection/3 predicate is only valid with instantiated lists as first two arguments and sufficiently instantiated elements in the list (such that unification either fails or doesn’t bind any values). Without the latter restriction you get one answer, while multiple may be logically correct.

Is that bad? Yes and no. A logically correct definition may look nice, but how useful is it? Consider intersection([1,2], X, [1]). That is true for any list X that contains one or more 1 and no 2. That is an infinite number of solution and this will typically result in either a stack overflow or, if the garbage collector can see X is not used, in a non-terminating computation. That doesn’t make much sense. Alternatively we could type-check the arguments. If we do that everywhere this is going to be rather expensive, notably on structures where type checking is not an O(1) operation. We also have to check the elements to be sufficiently instantiated, making the work even more expensive.

IMO, this work should be left to static analysis based on type and mode declarations. That allows flagging errors and, if you want full instantiation checking, static analysis could insert a minimal number of runtime checks.

Many of the data structure manipulation predicates have these limitations. You can nice logical properties if you stay within the Datalog restrictions and use tabling or you stay within the limits of one of the constraint solvers.

Another way to look at this is that sets can be represented in two ways: as a list (or some other data structure) or as the answers to a query. If you want to do logic over sets, the latter is attractive. Now intersection simply becomes a conjunction :slight_smile:

5 Likes

My very personal opinion is that having set operations defined for lists is the origin of the confusion. It’s simply messy :slight_smile:

What @Nicholas_Hubbard describes is not even the only problem.

Now that I look at the docs for intersection/3, I see I dropped a comment back in 2013. For example:

?- intersection([a,b,a], [a], R).
R = [a, a].

?- intersection([a], [a,b,a], R).
R = [a].

union/3 has the same problem.

Take home message: lists are not sets.

1 Like

Ok I see that the problems here are very messy and there is no clear answer. Problems in Prolog seem to most often revolve around the collision between the procedural world and the logic world.

I see that this implementation would be useless, but I think I would still would find it preferable to incorrect solutions.

I also understand that checking for types with say must_be/2 can quickly lead to inefficient code which is why I agree with Jan W that static analysis would be a nice solution. I have not come across any examples of static analysis in Prolog yet so I would appreciate a bit more explanation in this regard.

2 Likes

Prolog is not really a logic language IMO. If you want logic, use ASP (Answer Set Programming). That has its own limitations. As I see it, Prolog is a sweet spot on the continuum from imperative languages to purely declarative languages. It has some purely declarative islands and ways to deal with imperative programming.

The downside of this is that Prolog is relatively hard to teach and market. The upside is that, in the hands of someone who understands the strength and weaknesses, it is a highly productive language for a large class of problems.

6 Likes

Most of the work on static analysis is (AFAIK) done by the Ciao team, resulting in the Ciao assertion language. @edison has used this to generate runtime checks. I do not know whether he tried to minimize these tests. Tom Schrijvers did work on a type checker that could add runtime checks for stuff the checker could not figure out. The add-on collection for SWI-Prolog also contains some type checkers.

Eventually this should materialize. So far the proposals are (IMO) too restrictive, ask too much work from the programmer, do not scale well, etc.

1 Like

One item I was hoping to see you note that might have been done at some time or some place with SWI-Prolog or another Prolog is adding a Hindley–Milner type system. Is there any reference for such?

The reason I note this I have partially journeyed along the path you are noting for @Nicholas_Hubbard and found that the code by Tom is not being maintained AFAIK and notes many items left open. As for the Ciao checks, it seems one would have to change to using Ciao for the full experience, but as you note there are several packages that bring this to SWI-Prolog.

?- pack_list('ciao').
% Contacting server at https://www.swi-prolog.org/pack/query ... ok
p assertions@0.0.1          - Ciao Assertions Reader for SWI-Prolog
p ciao@0.0.1                - Ciao Prolog compatibility library
p dialect-ciao@0.0.1        - Ciao Prolog compatibility library
p rtchecks@0.0.1            - Run-Time Checker for Assertions
true.

So while I am trying not to redirect what Nicholas seeks, I do have a high interest in following this.

If you want something like that in a logic (and functional) language, you could have a look at Mercury: https://mercurylang.org/index.html

AFAIK, the oldest paper on inferring types for Prolog is A. Mycroft, R.A. O’Keefe: A polymorphic type system for Prolog. DAI Research Paper 211, Dept. of Artificial Intelligence, Edinburgh University (1983)
Also available in a nicer format.

There are quite a few papers that reference it.

1 Like

A couple of months ago I took that library (typecheck.pl) and packaged it again as a pack. I also managed to fix a small bug. https://github.com/meditans/type_check (I don’t know why it doesn’t seem to be the address in the addon page, I thought I had done that).

I also started exploring the various possibilities for a type system a couple months ago, effort that has to be interrupted due to external circumstances, but in which I’m still interested in. I guess my main suggestion from that exploration is that it would be nice to have a typechecker implemented as a meta interpreter, and it would be nicer not to rely as much on code generation as the type_check library does.

personal opinion: I think that a lack of a static type system (even optional, even as a static analysis layer like in ciao) is one of the factors that’s really holding prolog back, because it makes refactoring significantly more difficult than it could be.

1 Like

IMHO I would not like to see a type check implemented as a meta-interpreter. I see type checking as a phase of compiling that comes after syntax and semantic checks but before code generation. The reason being is that one of the main ways I think of Prolog is as a language you use to create a model to solve a problem and then use the model to solve the problem. So in creating the model, the details of the typing rules could be changed. Also I still have yet to master meta-interpreters so I tend to run away when I see them, but that will eventually change.

I would agree that it is one item in a long check list that people would use to evaluate Prolog for use as a production language and not having it puts an X instead of a check for the item.

While I don’t find refactoring Prolog code something I generally dislike, my view is that someday someone will make a refactoring tool for Prolog, then the tool gets better, and then at some point in the future it is just a common part of programming in Prolog. I don’t see me being the person to write the refactoring tool, but then again a few weeks ago I would not have seen me spending a lot of time with Unix Domain Sockets and Docker, but since that is on the path of what is needed to solve my problem, it is what I am currently doing.

I will definitely be looking at your code at sometime, but not at the moment. :slightly_smiling_face:

I’m not sure I follow here, would you care to do an example? I don’t see how you can change the typing rules in something like the type_check library. If anything, I think the meta-interpreter approach could offer you a first-class notion of type, that you can then use to implement additional features (think typeclasses). My main gripe with non-meta-interpreter solution is that the user doesn’t have the ability to add things on the top of the type-system. But I’d be very interesting in hearing what your point is here, I think I misunderstand it.

That would surely be wonderful, and yet I ask myself how can we as a community get consensus on such a tool (which has many degrees of freedom in the specification) when we can’t even agree on the priority of a hindley-milner-like type system, where all the details have been well researched before.
Bluntly put, I think that’s a dream that will never materialize (but I sure hope I’m wrong :slight_smile: )

To be precise here: that’s not my code: it’s described in the paper Towards Typed Prolog and written by the authors (among which there’s also @jan). I just repackaged it in the hope it would be easier to find for the community, since the original link didn’t work.

If the problem is to model how a DNA sequence is transcribed into a protein, then one could create a domain specific language (DSL) for the problem which would be implemented in Prolog. When I noted model earlier, I am thinking in a broader sense than just a DSL but for this example I will use a DSL as a specific of a model. So a user would interact with the DSL by writing expressions in that language. If the DSL is typed, then the Hindley–Milner type system would be created for the DSL and thus not for Prolog.

I use to think that Prolog should have a type system but then over time started to think that adding a type system to Prolog would limit how Prolog can be used. By moving the type system to a model, this leaves Prolog as it is and only constrains the model.

This is all just my opinion at present as I have not implemented an actual example, but from trying simpler things like adding lots of must_be/2 and is_of_type/2 checks, to me it seemed instead what should be checked is the input, think one big check up front before the input is submitted to even the first predicate of the model and then once that was done to let the code run faster because it was not loaded down by the checks during running.

I agree that Prolog is one of the most fragmented communities of programmers I know. What I am hoping will happen is that someone will make a killer app that needs Prolog. Once that happens other programmers will take the time to learn it and then it will become main stream. Then for it not to fade into the history books, more uses for Prolog would need to be found that everyday programmers would use.

Just today I ran into Podcast 286: If you could fix any software, what would you change? and after the first few minutes wanted to write a comment asking them why did they not look beyond regular expressions? Now that I am much better with DCGs, I would rather write a custom DCG before even considering to write a Regular Expression. The problem is that many tools using RE do not accept DCGs so I have to write a Regular Expression at times.

While I would suggest that using DCGs could be a kind of killer app, if one looks further they will realize that DCGs are based on difference list and that difference list are starting to show up in other popular languages. Most of those that program in those languages do not understand the value of difference list so difference list are still are an unknown to many programmers.

Thanks, I was thinking you took and updated the code and fixed many of the open items. :frowning_face:

In the meta-interpreter approach, you can decide which parts of your code that you want to type-check, because you’re the one invoking it, so that you can leave part of your code unchecked to leverage other designs that you have in mind. Even the type-check library does that (you can tell the type checker “trust me”)

I agree with this, I think typechecking should be done separately from running your program. But I want all the checks to be static.

I fixed some code that generated a warning that annoyed me, but that’s all. I have still problems understanding the nuances of the module system, which would be the other great improvement for that library.

1 Like

I am currently rewriting my own library(sets) and
library(ordsets). union/2 is in my library(sets), but in
SWI-Prolog it is autoloaded from library(lists).

It seems the predicates are all implemented in a certain
short way with the help of cut. So that they don’t need
to check for instantiation error a list type error.

And will give one result only. For example I find:

union([], L, L) :- !.
union([H|T], L, R) :-
    memberchk(H, L), !,
    union(T, L, R).
union([H|T], L, [H|R]) :-
    union(T, L, R).

But it could be also implemented as follows:

union(X, _, _) :- var(X),
   throw(error(instantiation_error,_)).
union([H|T], L, R) :-
    memberchk(H, L), !,
    union(T, L, R).
union([H|T], L, M) :- !,
    M = [H|R],
    union(T, L, R).
union([], L, R) :- !, R = L.
union(X, _, _) :- 
    throw(error(type_error(list,X),_)).

The later version would do instantiation checking and type checking.
Like this example shows. The current SWI-Prolog version:

?- union([1|X],[2],Y).
X = [],
Y = [1, 2].

?- union([1|3],[2],Y).
false.

The proposed safer new version would behave as follows:

?- union([1|X],[2],Y).
Error: Argument should not be a variable.

?- union([1|3],[2],Y).
Error: Argument should be a list ([] or [_|_]), found 3.

The type check is already found in memberchk/2. But for
example the instantiation check is missing from memberchk/2.
So memberchk/2 is already half safe, this is already there:

?- memberchk(2, [1|X]).
X = [2|_6804].

?- memberchk(2, [1|3]).
ERROR: Type error: `list' expected, found `3' (an integer)

Edit 20.11.2020:
ECLiPSe Prolog is totally unsafe, it even doesn’t have the
second type check of SWI-Prolog:

[eclipse 1]: memberchk(2, [1|X]).

X = [2|_175]
Yes (0.00s cpu)
[eclipse 2]: memberchk(2, [1|3]).

No (0.00s cpu)

Whats also funny, you can circumvent the first
rule, sind it is not steadfast. So the first rule is
currently:

union([], L, L) :- !.

So the output argument is matched before the
cut. So its not steadfast. You can send union
into an infinite loop:

?- union(X,[],[_|X]).
ERROR: Stack limit (1.0Gb) exceeded

Whereby its not 100% clear what happens to
me. This one doesn’t go to nirvana:

?- union(X,[],[_|_]).
X = [_230034254].

I guess its that the first rule is not steadfast together with
the last rule of union, which somehow produces a new
call pattern, which is again subject to a steadfastness problem.

Edit 20.11.2020:
Doesn’t happen in the new proposed version, since it has:

union([], L, R) :- !, R = L.

I get twice:

?- union(X,[],[_|X]).
Error: Argument should not be a variable.

?- union(X,[],[_|_]).
Error: Argument should not be a variable.

This kind of behavior applies to most list processing libraries and in fact most of the Prolog library. I don’t think the solution is to pollute all the code with type and mode checks. A type system doing some combination of static and runtime checking is, as I see it, a more promising route.

1 Like