Split list

Here’s something similar, as an example:

split_list_at_nth1(LongLst, Nth1Index, UptoLst, AfterLst) :-
    must_be(nonneg, Nth1Index),
    split_list_at_nth1_(LongLst, Nth1Index, UptoLst, AfterLst).

% Close the Upto loop
split_list_at_nth1_(L, 0, [], L) :- !.
split_list_at_nth1_([H|T], N, [H|UptoTail], AfterLst) :-
    succ(N0, N),
    split_list_at_nth1_(T, N0, UptoTail, AfterLst).
?- split_list_at_nth1([a, b, c, d], 2, U, A).
U = [a,b],
A = [c,d].

What about this (putting the index in front is what I would do).

split_list_at_nth1(Nth, Long, Start, End) :-
    length(Start, Nth),
    append(Start, End, Long).
1 Like

I have been wondering why this seems to be the convention. Given that a helper predicate (as in my example above) for list processing usually needs the list as the first argument, to take advantage of the [] vs [H|T] indexing, isn’t it a better convention to reduce such argument reordering?

It is easier to use partial evaluation *meta predicates. So you can write for example with nth1/3:

?- maplist(nth1(2), [[a, b, c], [e, f, g]], R).
R = [b, f].

(“Partial evaluation” is something else apparently)

1 Like

I think partial evaluation should be high order predicates, no? Or maybe meta predicates?

And yes, this is typically a reason to put keys/selectors, etc. at the front, leading to the order

x(action params, object to act upon, result).
2 Likes

You are correct. I thought that you can use meta predicates and the call mechanism for partial evaluation in Prolog but now that I read the Wikipedia page it turns out I was mixed up. How about partial application? Can we say “in Prolog, you can use call and meta-predicates to implement partial application”?

You can use append/2 for splitting into N pieces, trivially.

?- N = 2, length(R, N), append(R, [a,b]).
N = 2,
R = [[], [a, b]] ;
N = 2,
R = [[a], [b]] ;
N = 2,
R = [[a, b], []] ;
false.

For “pieces of size L” you might have to do the recursive step yourself.

Working code:

split_list_into_lens(Len, Lst, Lsts) :-
    must_be(positive_integer, Len),
    split_list_into_lens_(Lst, Len, Lsts).

split_list_into_lens_([], _, []).
split_list_into_lens_([H|T], Len, [LstSplit|Lsts]) :-
    (   length(LstSplit, Len),
        append(LstSplit, LstRemainder, [H|T]) -> true
    ;   LstSplit = [H|T], 
        length(LstSplit, LenSplitFinal),
        LenSplitFinal < Len,
        LstRemainder = [] ),
    split_list_into_lens_(LstRemainder, Len, Lsts).
?- split_list_into_lens(3, [a, b, c, d, e, f, g, h, i], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i]].

% Presumably desirable:
?- split_list_into_lens(3, [a, b, c, d, e, f, g, h, i, j, k], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i],[j,k]].
1 Like

I come up with this :

split([],_,_).
split(Lst,N,[FirstN|Res]) :-
 length(FirstN,N), append(FirstN,Rest,Lst), split(Rest,N,Res).

it almost works…

%%odd length i.e. the last split is smaller
?- L= [1,2,3,4,5],split(L,2,R).
false.

%% it should stop at 5,6
?- L= [1,2,3,4,5,6],split(L,2,R).
L = [1, 2, 3, 4, 5, 6],
R = [[1, 2], [3, 4], [5, 6]|_10990] ;
false.

What i’m missing


this works

split([],_,[]) :- !.
split(Lst,N,[Lst|Res]) :- length(Lst,X), X < N, split([],N,Res).
split(Lst,N,[FirstN|Res]) :- 
length(FirstN,N), append(FirstN,Rest,Lst), split(Rest,N,Res).

Note that your method is very inefficient, by checking the final situation (X < N) each time:

?- time((numlist(1, 100000, L), split(L, 3, S))).
% 466,673 inferences, 8.181 CPU in 8.126 seconds (101% CPU, 57046 Lips)
?- time((numlist(1, 100000, L), split_list_into_lens(3, L, S))).
% 366,685 inferences, 0.067 CPU in 0.066 seconds (101% CPU, 5511456 Lips)

Performance can be regained with a slight rewrite:

split2([],_,[]) :- !.
split2(Lst, N, [FirstN|Res]) :-
    length(FirstN, N),
    append(FirstN, Rest, Lst), !,
    split2(Rest, N, Res).
split2(Lst, N, Lst) :-
    length(Lst, Len),
    Len < N.
?- time((numlist(1, 100000, L), split2(L, 3, S))).
% 366,679 inferences, 0.074 CPU in 0.074 seconds (100% CPU, 4968454 Lips)
1 Like

nice… thanks
havent used time-ing in prolog

I see the third runs only once at the end …clever

Q: why from the ones below the 1st is ok but the second is not.


split(Lst, N, [FirstN|Res]) :- 
    ...... 
    split(Rest, N, Res).

split(Lst, N, Res) :- 
    ......
    split(Rest, N, [Res|FirstN]).

the head is a “match” … at the tail is a “call”… this is where concatenation should happen, right ?

That would be a concatenation in reverse order, though. A difference list can be used to “append” in the intended order.

Here is the same split implemented as a DCG, with basically the same performance as split_list_into_lens/3 (the cut increases performance):

% For string//1 at https://www.swi-prolog.org/pldoc/man?section=basics
:- use_module(library(dcg/basics)).

split_list(Len, Lst, LstSplit) :-
    must_be(positive_integer, Len),
    phrase(split(LstSplit, Len), Lst).

split([H|T], Len) --> split1(H, Len), !, split(T, Len).
% Terminate at end of list
split([], _) --> [].

% Grab list of intended length
split1(L, Len) --> { length(L, Len) }, string(L).
% ... or what remains at the end
split1(L, _Len) --> [L].
?- split_list(3, [a, b, c, d, e, f, g, h, i, j, k], Lsts).
Lsts = [[a,b,c],[d,e,f],[g,h,i],j,k].
1 Like

You could take advantage of the Rest argument of phrase/3; and there is also sequence//2 from library(dcg/high_order). It would be enough to define:

length_list(N, L) -->
    { length(L, N) },
    L. % yes don't need string//1

Then:

?- use_module(library(dcg/high_order)).
true.

?- phrase(sequence(length_list(2), Sublists), [a,b,c,d], Rest), !.
Sublists = [[a, b], [c, d]],
Rest = [].

?- phrase(sequence(length_list(2), Sublists), [a,b,c,d,e], Rest), !.
Sublists = [[a, b], [c, d]],
Rest = [e].

But of course it all depends on the use case.

EDIT: since there are some “likes” on this, I would like to point out two issues.

  1. As it stands, using L instead of string(L) is indeed a bit slower.
  2. Since sequence//2 leaves choice points, it will eventually run out of memory if the list we are parsing is long enough. Cutting on every matched prefix will avoid this.
3 Likes

I understand it is a reverse.
my question was what is the interpretation/thinking of putting it in the head.
In algorithming languages you always do it at the end, because they cant do “action” in the head.

I have hard time of understanding the logic/mechanism of how it works.

If I use L instead of string(L), then this increases from 4 to 11 seconds:

?- numlist(1, 10000000, L), time(split_list(3, L, S)).

Slightly more elegant than previous:

:- use_module(library(dcg/basics)).

split_list(Len, Lst, LstSplit) :-
    must_be(positive_integer, Len),
    phrase(split(LstSplit, Len), Lst).

split([H|T], Len) --> list_length(H, Len), !, split(T, Len).
% Terminate at end of list
split([H], _) --> [H], !.
split([], _) --> [].

list_length(L, Len) --> { length(L, Len) }, string(L).
1 Like

There is something complicated, the DCG rewrite rules invoke phrase/3

?- listing(split_list:length_list).
length_list(N, L, A, B) :-
    length(L, N),
    C=A,
    phrase(L, C, B).

instead of the simpler pattern I would like to see.

length_list(3, [A,B,C]) --> [A,B,C].

that is

?- listing(split_list:length_list).
length_list(3, [A, B, C], [A, B, C|D], D).

Precomputing the non terminal list_chunk//1 we can get back the efficiency of basic pattern matching:

:- dynamic list_chunk/3.

split_list(ListToBeSplitted, N, Sublists, Rest) :-
    retractall(list_chunk(_, _, _)),
    length(T, N),
    bind_last(T, A, D),
    assertz(list_chunk(T, A, D)),
    phrase(sequence(list_chunk, Sublists), ListToBeSplitted, Rest), !.

bind_last([Last], [Last|D], D).
bind_last([H|Tail], [H|Rest], D) :- bind_last(Tail, Rest, D).

I tried to use @jan’s split_list_at_nth1/4 solution above to split strings as follows:

split_string_at_nth1(Nth,S,Start,End):-
    string_codes(S,L),
    split_list_at_nth1(Nth,L,SL,EL),
    string_codes(Start,SL),
    string_codes(End,EL).

This worked in the forward direction but not in the backward. What is the right way to get it to work in both directions?
Thanks in advance.

This will do it, although it’s a bit ugly – by reordering the goals depending on whether S is a variable or not. (This can probably be done more elegantly using freeze/2 or when/2):

split_string_at_nth1(Nth,S,Start,End):-
    (   var(S)
    ->  string_codes(Start,SL),
        string_codes(End,EL),
        split_list_at_nth1(Nth,L,SL,EL),
        string_codes(S,L)
    ;   string_codes(S,L),
        split_list_at_nth1(Nth,L,SL,EL),
        string_codes(Start,SL),
        string_codes(End,EL)
    ).

split_list_at_nth1(Nth, Long, Start, End) :-
    length(Start, Nth),
    append(Start, End, Long).
?- split_string_at_nth1(3, "abcde", Start, End).
Start = "abc",
End = "de".

?- split_string_at_nth1(I, X, "abc", "de").
I = 3,
X = "abcde".
1 Like

Thank you very much. I am still figuring out the impact of goal ordering. This helped me a lot.

This goes off into infinity with:

?- split_list_at_nth1(N, [a,b,c], [a,b,c|T], End).
N = 3,
T = End, End = [] ;
ERROR: Stack limit (1.0Gb) exceeded

Here is a version which seems reasonable:

split_list_at_nth1(Nth1, Long, Start, End) :-
    (   nonvar(Nth1) -> must_be(nonneg, Nth1), Once = true
    ;   is_list(Long), (is_list(Start) ; is_list(End)) -> Once = true
    ;   is_list(End), is_list(Start) -> Once = true
    ;   true
    ),
    split_list_at_nth1_(Long, 0, Nth1, Once, Start, End).

split_list_at_nth1_(L, N, N, Once, [], L) :-
    (Once == true -> ! ; true).
split_list_at_nth1_([H|T], N, Nth1, Once, [H|Upto], End) :-
    N1 is N + 1,
    split_list_at_nth1_(T, N1, Nth1, Once, Upto, End).

… which produces:

?- split_list_at_nth1(N, [a,b,c], [a,b,c|T], End).
N = 3,
T = End, End = [].

This produces an unwanted choicepoint, which doesn’t seem worth the expected additional processing to guard against:

?- split_list_at_nth1(N, [a,b,c], St, [a,b,c|T]).
N = 0,
St = T, T = [] ;
false.

Edit: Removed unnecessary once.

1 Like