Split list

To split a list: splitlist(ListToBeSplitted, Splititem, BeforeSplit, AfterSplit)

splitlist([H|R], H, [], R).
splitlist([H|R], X, [H|S2],E) :-
    splitlist(R, X, S2, E).

Or you can just use the built-in append/3:

append(BeforeSplit, [SplitItem|AfterSplit], ListToBeSplit)

what about splitting a list into N pieces, or pieces of size L

Here’s something similar, as an example:

split_list_at_nth1(LongLst, Nth1Index, UptoLst, AfterLst) :-
    must_be(nonneg, Nth1Index),
    split_list_at_nth1_(LongLst, Nth1Index, UptoLst, AfterLst).

% Close the Upto loop
split_list_at_nth1_(L, 0, [], L) :- !.
split_list_at_nth1_([H|T], N, [H|UptoTail], AfterLst) :-
    succ(N0, N),
    split_list_at_nth1_(T, N0, UptoTail, AfterLst).
?- split_list_at_nth1([a, b, c, d], 2, U, A).
U = [a,b],
A = [c,d].

What about this (putting the index in front is what I would do).

split_list_at_nth1(Nth, Long, Start, End) :-
    length(Start, Nth),
    append(Start, End, Long).
1 Like

I have been wondering why this seems to be the convention. Given that a helper predicate (as in my example above) for list processing usually needs the list as the first argument, to take advantage of the [] vs [H|T] indexing, isn’t it a better convention to reduce such argument reordering?

It is easier to use partial evaluation *meta predicates. So you can write for example with nth1/3:

?- maplist(nth1(2), [[a, b, c], [e, f, g]], R).
R = [b, f].

(“Partial evaluation” is something else apparently)

1 Like

I think partial evaluation should be high order predicates, no? Or maybe meta predicates?

And yes, this is typically a reason to put keys/selectors, etc. at the front, leading to the order

x(action params, object to act upon, result).

You are correct. I thought that you can use meta predicates and the call mechanism for partial evaluation in Prolog but now that I read the Wikipedia page it turns out I was mixed up. How about partial application? Can we say “in Prolog, you can use call and meta-predicates to implement partial application”?

You can use append/2 for splitting into N pieces, trivially.

?- N = 2, length(R, N), append(R, [a,b]).
N = 2,
R = [[], [a, b]] ;
N = 2,
R = [[a], [b]] ;
N = 2,
R = [[a, b], []] ;

For “pieces of size L” you might have to do the recursive step yourself.

Working code:

split_list_into_lens(Len, Lst, Lsts) :-
    must_be(positive_integer, Len),
    split_list_into_lens_(Lst, Len, Lsts).

split_list_into_lens_([], _, []).
split_list_into_lens_([H|T], Len, [LstSplit|Lsts]) :-
    (   length(LstSplit, Len),
        append(LstSplit, LstRemainder, [H|T]) -> true
    ;   LstSplit = [H|T], 
        length(LstSplit, LenSplitFinal),
        LenSplitFinal < Len,
        LstRemainder = [] ),
    split_list_into_lens_(LstRemainder, Len, Lsts).
?- split_list_into_lens(3, [a, b, c, d, e, f, g, h, i], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i]].

% Presumably desirable:
?- split_list_into_lens(3, [a, b, c, d, e, f, g, h, i, j, k], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i],[j,k]].
1 Like

I come up with this :

split(Lst,N,[FirstN|Res]) :-
 length(FirstN,N), append(FirstN,Rest,Lst), split(Rest,N,Res).

it almost works…

%%odd length i.e. the last split is smaller
?- L= [1,2,3,4,5],split(L,2,R).

%% it should stop at 5,6
?- L= [1,2,3,4,5,6],split(L,2,R).
L = [1, 2, 3, 4, 5, 6],
R = [[1, 2], [3, 4], [5, 6]|_10990] ;

What i’m missing

this works

split([],_,[]) :- !.
split(Lst,N,[Lst|Res]) :- length(Lst,X), X < N, split([],N,Res).
split(Lst,N,[FirstN|Res]) :- 
length(FirstN,N), append(FirstN,Rest,Lst), split(Rest,N,Res).

Note that your method is very inefficient, by checking the final situation (X < N) each time:

?- time((numlist(1, 100000, L), split(L, 3, S))).
% 466,673 inferences, 8.181 CPU in 8.126 seconds (101% CPU, 57046 Lips)
?- time((numlist(1, 100000, L), split_list_into_lens(3, L, S))).
% 366,685 inferences, 0.067 CPU in 0.066 seconds (101% CPU, 5511456 Lips)

Performance can be regained with a slight rewrite:

split2([],_,[]) :- !.
split2(Lst, N, [FirstN|Res]) :-
    length(FirstN, N),
    append(FirstN, Rest, Lst), !,
    split2(Rest, N, Res).
split2(Lst, N, Lst) :-
    length(Lst, Len),
    Len < N.
?- time((numlist(1, 100000, L), split2(L, 3, S))).
% 366,679 inferences, 0.074 CPU in 0.074 seconds (100% CPU, 4968454 Lips)
1 Like

nice… thanks
havent used time-ing in prolog

I see the third runs only once at the end …clever

Q: why from the ones below the 1st is ok but the second is not.

split(Lst, N, [FirstN|Res]) :- 
    split(Rest, N, Res).

split(Lst, N, Res) :- 
    split(Rest, N, [Res|FirstN]).

the head is a “match” … at the tail is a “call”… this is where concatenation should happen, right ?

That would be a concatenation in reverse order, though. A difference list can be used to “append” in the intended order.

Here is the same split implemented as a DCG, with basically the same performance as split_list_into_lens/3 (the cut increases performance):

% For string//1 at https://www.swi-prolog.org/pldoc/man?section=basics
:- use_module(library(dcg/basics)).

split_list(Len, Lst, LstSplit) :-
    must_be(positive_integer, Len),
    phrase(split(LstSplit, Len), Lst).

split([H|T], Len) --> split1(H, Len), !, split(T, Len).
% Terminate at end of list
split([], _) --> [].

% Grab list of intended length
split1(L, Len) --> { length(L, Len) }, string(L).
% ... or what remains at the end
split1(L, _Len) --> [L].
?- split_list(3, [a, b, c, d, e, f, g, h, i, j, k], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i],j,k].
1 Like

You could take advantage of the Rest argument of phrase/3; and there is also sequence//2 from library(dcg/high_order). It would be enough to define:

length_list(N, L) -->
    { length(L, N) },
    L. % yes don't need string//1


?- use_module(library(dcg/high_order)).

?- phrase(sequence(length_list(2), Sublists), [a,b,c,d], Rest), !.
Sublists = [[a, b], [c, d]],
Rest = [].

?- phrase(sequence(length_list(2), Sublists), [a,b,c,d,e], Rest), !.
Sublists = [[a, b], [c, d]],
Rest = [e].

But of course it all depends on the use case.

EDIT: since there are some “likes” on this, I would like to point out two issues.

  1. As it stands, using L instead of string(L) is indeed a bit slower.
  2. Since sequence//2 leaves choice points, it will eventually run out of memory if the list we are parsing is long enough. Cutting on every matched prefix will avoid this.

I understand it is a reverse.
my question was what is the interpretation/thinking of putting it in the head.
In algorithming languages you always do it at the end, because they cant do “action” in the head.

I have hard time of understanding the logic/mechanism of how it works.

If I use L instead of string(L), then this increases from 4 to 11 seconds:

?- numlist(1, 10000000, L), time(split_list(3, L, S)).

Slightly more elegant than previous:

:- use_module(library(dcg/basics)).

split_list(Len, Lst, LstSplit) :-
    must_be(positive_integer, Len),
    phrase(split(LstSplit, Len), Lst).

split([H|T], Len) --> list_length(H, Len), !, split(T, Len).
% Terminate at end of list
split([H], _) --> [H], !.
split([], _) --> [].

list_length(L, Len) --> { length(L, Len) }, string(L).
1 Like

There is something complicated, the DCG rewrite rules invoke phrase/3

?- listing(split_list:length_list).
length_list(N, L, A, B) :-
    length(L, N),
    phrase(L, C, B).

instead of the simpler pattern I would like to see.

length_list(3, [A,B,C]) --> [A,B,C].

that is

?- listing(split_list:length_list).
length_list(3, [A, B, C], [A, B, C|D], D).

Precomputing the non terminal list_chunk//1 we can get back the efficiency of basic pattern matching:

:- dynamic list_chunk/3.

split_list(ListToBeSplitted, N, Sublists, Rest) :-
    retractall(list_chunk(_, _, _)),
    length(T, N),
    bind_last(T, A, D),
    assertz(list_chunk(T, A, D)),
    phrase(sequence(list_chunk, Sublists), ListToBeSplitted, Rest), !.

bind_last([Last], [Last|D], D).
bind_last([H|Tail], [H|Rest], D) :- bind_last(Tail, Rest, D).