Compress a list that contains segments of continuous numbers to a list which contains list of range of segments [start,end]

I’m new in prolog and I’m trying to solve this exercise but I can not get my desired output.Compressing a list of integers to a compressed list. List contains segments of continuous integers.A segment with length greater than 2 will be encoded to list [start, end], end is inclusive. If the length is not greater than 2 will be kept no change.

Sample:

compress([3,4,6,7,8,9,10,15,16,17,20,22,23,24,25], CompressedList).

CompressedList = [3, 4, [6,10],[15, 17],20, [22,25]].

My incorrect output:

CompressedList = [[3, [4]], [6,[10]],[15, [17]],20, [22,[25]]].

My code is:

compress([], []).
compress([X], [X]).

compress([X, Y | T], [X | CompressedTail]) :-
    X + 1 =\= Y,
    compress([Y | T], CompressedTail).
    
    
compress([X, Y | T], [[X, YLast] | CompressedTail]) :-
    compress_consecutive([Y | T], YLast, Next),
    compress(Next, CompressedTail).


compress_consecutive([], Last, []) :- Last = [].
compress_consecutive([X], [X], []) :- !.
compress_consecutive([X, Y | T], Last, Next) :-
    X + 1 =:= Y,
    compress_consecutive([Y | T], Last, Next).
    
    
compress_consecutive([X, Y | T], [X], [Y | T]) :- X + 1 =\= Y. ```

I wrote a similar codes on some purpose, which is related to
normalize a set of integers intervals to a ''normal" one. I got the following result for your sample. I forgot details, but I think google search on “interval constraint or calculation” would give you some hint. If you are a student,
I would like to advice that the problem is worth to solve by yourself.

% ?- i_normal([3,4,6,7,8,9,10,15,16,17,20,22,23,24,25], CompressedList).
%@ CompressedList = [3-4, 6-10, 15-17, 20-20, 22-25].

I copied and pasted your code, consulted the file, run your query, but as a result I get:

?- compress([3, 4, 6, 7, 8, 9, 10, 15, 16, 17, 20, 22, 23, 24, 25], Compressed).
false.

So if the code you shared is correct I’m wondering how do you get that (incorrect) output, because I don’t :man_shrugging:

You’re right. the first line was missed and I added it.

FYI

I know your question is for subsequence removal but a similar compression is for run length encoding.

Note:

In case you didn’t know SWI-Prolog has the predicate clumped/2 that does RLE. (source code)

Now I get your output :+1: it’s an interesting exercise. I can’t help right now, but it could be interesting to try and solve it

If you can do it please let me know.

I’ll try…which of course it doesn’t mean I come up with a solution :rofl:
BTW I don’t know if you know about the existence of trace/0 or trace/1 for inspecting what your code is actually doing. It’s super useful

I came up with this (which isn’t exactly what you were looking for, but it makes some steps in that direction at least…):

?- compress([3, 4, 6, 7, 8, 9, 10, 15, 16, 17, 20, 22, 23, 24, 25], Compressed).
Compressed = [[3, 4], [6, 10], [15, 17], 20, [22, 25]].

A couple of clues: 1) there are some clauses that you can safely remove (they aren’t doing anything), 2) if you want to prevent backtracking the cuts should be placed in the main predicate. These are my contributions

PS: you should remove also some brackets somewhere…it’s clear that your output has too many of them :slight_smile:

1 Like

Here’s my solution; I changed the result format to use range(X,Y) instead of [X,Y] because lists should only be used for sequences that can be variable length:

code
compress([], []).
compress([Prev|Rest], Out) :-
    phrase(compress(Rest, Prev, Prev), Out).

compress([], Prev, Start) --> !,
    range(Start, Prev).
compress([X|Xs], Prev, Start) -->
    (   { X =:= Prev + 1 }
    ->  compress(Xs, X, Start)
    ;   range(Start, Prev),
        compress(Xs, X, X)
    ).

range(Start, End) -->
    (   { Start = End }
    ->  [ Start ]
    ;   { End =:= Start + 1 }
    ->  [ Start, End ]
    ;   [ range(Start, End) ]
    ).

And a few test cases (there should be more):

test cases
:- use_module(library(plunit)).

:- begin_tests(compress).

test(range, Range == [1]) :-
    phrase(range(1, 1), Range).

test(range, Range == [1,2]) :-
    phrase(range(1, 2), Range).

test(range,Range == [range(1,3)]) :-
    phrase(range(1, 3), Range).

test(compress, Compressed == []) :-
    compress([], Compressed).

test(compress, Compressed == [1]) :-
    compress([1], Compressed).

test(compress, Compressed == [1,3]) :-
    compress([1,3], Compressed).

test(compress, Compressed == [1,2]) :-
    compress([1,2], Compressed).

test(compress, Compressed == [range(1,3)]) :-
    compress([1,2,3], Compressed).

test(compress, Compressed == [1,2, range(4,6)]) :-
    compress([1,2,4,5,6], Compressed).

test(compress, Compressed == [1, 3, 4, 6, range(8,10)]) :-
    compress([1,3,4,6,8,9,10], Compressed).

test(compress, Compressed == [3,4, range(6,10),range(15,17), 20, range(22,25)]) :-
    compress([3,4, 6,7,8,9,10, 15,16,17, 20,22,23,24,25], Compressed).

:- end_tests(compress).
1 Like

you definitely love the if then else construct :sweat_smile:

I prefer if-then-else to cut.

So, here’s another solution, using SSU (I wish there were a DCG equivalent to =>):

code and tests
compress([], []).
compress([Prev|Rest], Out) :-
    phrase(compress(Rest, Prev, Prev), Out).

compress([], Prev, Start) --> !,
    { range(Start, Prev, Range) },
    Range.
compress([X|Xs], Prev, Start) -->
    (   { X =:= Prev + 1 }
    ->  compress(Xs, X, Start)
    ;   { range(Start, Prev, Range) },
        Range,
        compress(Xs, X, X)
    ).

range(Start, End, Range), Start = End       => Range = [Start].
range(Start, End, Range), End =:= Start + 1 => Range = [Start,End].
range(Start, End, Range)                    => Range = [range(Start,End)].

:- use_module(library(plunit)).

:- begin_tests(compress).

test(range, Range == [1]) :-
    range(1, 1, Range).

test(range, Range == [1,2]) :-
    range(1, 2, Range).

test(range,Range == [range(1,3)]) :-
    range(1, 3, Range).

test(compress, Compressed == []) :-
    compress([], Compressed).

test(compress, Compressed == [1]) :-
    compress([1], Compressed).

test(compress, Compressed == [1,3]) :-
    compress([1,3], Compressed).

test(compress, Compressed == [1,2]) :-
    compress([1,2], Compressed).

test(compress, Compressed == [range(1,3)]) :-
    compress([1,2,3], Compressed).

test(compress, Compressed == [1,2, range(4,6)]) :-
    compress([1,2,4,5,6], Compressed).

test(compress, Compressed == [1, 3, 4, 6, range(8,10)]) :-
    compress([1,3,4,6,8,9,10], Compressed).

test(compress, Compressed == [3,4, range(6,10),range(15,17), 20, range(22,25)]) :-
    compress([3,4, 6,7,8,9,10, 15,16,17, 20,22,23,24,25], Compressed).

:- end_tests(compress).

BTW I didn’t rewrite your code, I just “readjusted” it so that it worked in that case or in similar cases, but retrying it now with other inputs I saw that it needs some further revisions :slight_smile: the multitest approach that @peter.ludemann adopted is surely more correct…I’ll give it a deeper look tomorrow (I hate doing things in a hurry :D). Anyways thanks for having distracted me from search algorithms for a while

I would recommend not copying my solution and handing it in – it uses a few slightly “advanced” techniques that would probably alert the teacher that you didn’t write the code. Also, for learning, it’s useful to struggle with the problem. (Your solution looks fairly good, but I think it’s missing something)

2 Likes

I solved it this way; it works, but it’s not perfect.

%compress/2 predicate
compress([], []).
compress([X], [X]).

compress([X, Y | T], [X | CompressedTail]) :-
    X + 1 =\= Y,
    compress([Y | T], CompressedTail).
    
compress([X, Y, Z| T], [X, Y| CompressedTail]) :-
    X + 1 =:= Y,
    Y + 1 =\= Z,
    compress([Z | T], CompressedTail).
    
    
compress([X, Y | T], [[X, YLast] | CompressedTail]) :-
    compress_consecutive([Y | T], YLast, Next),
    compress(Next, CompressedTail).

% Helper predicate to find the last element of a consecutive segment.
compress_consecutive([], Last, []) :- Last = [].
compress_consecutive([X], X, []) :- !.
compress_consecutive([X, Y | T], Last, Next) :-
    X + 1 =:= Y,
    compress_consecutive([Y | T], Last, Next).
    
    
compress_consecutive([X, Y | T], X, [Y | T]) :- 
X + 1 =\= Y.
1 Like

I solved it this way. Yeah, your code is advanced, and I shouldn’t use built-in predicates. I tried to solve it myself, and while it’s not perfect, it does work. Thank you for your assistance.

1 Like

I will retry tomorrow.

PS: I still think that the cut should be in the main predicate, namely compress/2, to prevent backtracking, not in the helper predicate though

1 Like

The “cut” (or if-then-else) should be closest to the source of non-determinism. If we look at

range(Start, Start) --> [Start].
range(Start, End) --> {End =:= Start + 1}, [Start, End].
range(Start, End) --> [range(Start,End)].

there must be cuts for the first two clauses because otherwise backtracking would result in

?- phrase(range(1, 1), R).
R = [1] ;
R = [range(1,1)].

which is incorrect. The correct code is:

range(Start, Start) --> !, [Start].
range(Start, End) --> {End =:= Start + 1}, !, [Start, End].
range(Start, End) --> [range(Start,End)].
2 Likes

I’ve posted an alternative method: Prolog: Compress a list that contains segments of continuous numbers to a list which contains list of range of segments [start,end] - Stack Overflow

… which uses Push-back lists on DCG rule heads

1 Like

Today I go back to search algorithms. Intellectual resources (as of other type) are always scarce and one has to focus on what matters the most. It was just a one-day trip and (in my case) not even that successful :sweat_smile:
Keep on posting such problems if you receive more of them in your classes. I am no programmer so I am not particularly focused on speed and performance (although I understand that it is an issue), but trying to solve the riddle might be entertaining

1 Like