In Quasi-quotations, again, @Jan announced a new library(strings), which defines dedent_string/3, interpolate_string/4 and string/4. I decided to port the Python test cases (because there are a lot of corner cases) and in the process discovered some subtle issues.
(Prologue: I’m proposing we follow Python’s semantics for some things because there’s a lot of practical experience there, and because any new Python feature gets a lot of discussion.)
Let’s start with splitlines, which isn’t yet in the library, but probably should be (and should be spelled split_lines). The related Python function is str.splitlines([keepends]), which allows a much larger list of line separators than just "\n". In addition, with the keepends argument, the caller can control whether or not to keep the end-of-line characters.
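To make those two points concrete, here is a small Python example; the sample text is made up for illustration. It shows that splitlines also breaks on "\r\n", vertical tab and U+2028, and that keepends=True leaves each terminator attached to its line.

```python
# Sample text (made up for illustration) mixing several line terminators:
# "\n", "\r\n", vertical tab ("\x0b") and LINE SEPARATOR ("\u2028").
text = "one\ntwo\r\nthree\x0bfour\u2028five"

# Default: terminators are dropped.
without_ends = text.splitlines()

# keepends=True: each line keeps whatever terminator ended it.
with_ends = text.splitlines(keepends=True)

print(without_ends)
print(with_ends)

# With the terminators kept, joining the pieces gives back the original text.
print("".join(with_ends) == text)
```

Because the terminators are preserved, the keepends form is the one to use when the text has to be reassembled unchanged after per-line processing.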
The Python splitlines method returns an empty list for the empty string, and a terminal line break does not result in an extra line:
>>> "".splitlines()
[]
>>> "\n".splitlines()
['']
>>> "One line\n".splitlines()
['One line']
On the other hand, split("\n") – which is essentially the same as SWI-Prolog’s split_string(Str, "\n", "", Lines) – gives:
>>> "".split("\n")
['']
>>> "\n".split("\n")
['', '']
>>> "Two lines\n".split("\n")
['Two lines', '']
Because dedent_string and indent_string split the string into lines, it’s important that we decide how to do this splitting. I propose that we add split_lines/2 and split_lines_with_ends/2 predicates and use those in dedent_string and indent_string.
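As a rough model of what I have in mind, here is a Python sketch of the two proposed predicates. The function names mirror the proposed predicate names, but the bodies are only my assumption of the intended behaviour, and for simplicity they treat "\n" as the only line separator (which separators the real predicates should accept is a separate question).

```python
import re

def split_lines(text):
    """Hypothetical model of split_lines/2: split on "\n" only, and do
    not report an extra empty line after a terminal line break."""
    parts = text.split("\n")
    # split() always yields a final element after the last "\n";
    # drop it when it is empty, so "" gives [] and "a\n" gives ["a"].
    if parts[-1] == "":
        parts.pop()
    return parts

def split_lines_with_ends(text):
    """Hypothetical model of split_lines_with_ends/2: same lines, but each
    keeps its "\n" (the last line has none if the text does not end in one)."""
    # Either a (possibly empty) run of non-newlines followed by "\n",
    # or a non-empty final run with no terminator.
    return re.findall(r"[^\n]*\n|[^\n]+", text)

for sample in ["", "\n", "One line\n", "a\n\nb"]:
    print(repr(sample), split_lines(sample), split_lines_with_ends(sample))
```

For text that only contains "\n" line breaks, both functions agree with Python's splitlines() and splitlines(keepends=True) respectively.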
There is an additional issue with dedent/indent – how to handle empty lines (that is, lines with only whitespace). I propose following the Python semantics: ignore empty lines when determining the dedent, and do not indent them (the latter can be changed by specifying a predicate that controls which lines are indented). For dedent, my experience is that this works nicely with multi-line strings in source code.
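For reference, this is how the Python semantics I'm referring to look with textwrap.dedent and textwrap.indent. The sample text and the "> " prefix are made up for illustration; the second and fourth lines of the sample are an empty line and a whitespace-only line.

```python
import textwrap

# Sample text (made up): line 2 is empty, line 4 contains only spaces.
source = "    first\n\n      second\n    \n    third\n"

# Whitespace-only lines are ignored when computing the common margin
# (4 spaces here) and are normalized to a bare "\n" in the result.
dedented = textwrap.dedent(source)

# By default, indent() leaves whitespace-only lines without the prefix.
indented_default = textwrap.indent(dedented, "> ")

# A predicate decides per line; this one prefixes every line.
indented_all = textwrap.indent(dedented, "> ", predicate=lambda line: True)

print(repr(dedented))
print(repr(indented_default))
print(repr(indented_all))
```

The relative indentation of "second" survives the dedent, which is what makes this convenient for multi-line strings embedded in indented source code.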
SWI-Prolog follows the Unix perspective: internally, every line ending is a single \n. The ISO syntax allows creating an atom/string containing a \r, but you should normally not do so. Only if you exchange text as byte sequences must you start thinking about that, and that is normally only done when talking to some external device.
… someone else can write the additional code and test cases, if it’s important to them.