In Quasi-quotations, again, @Jan announced a new library(strings), defining dedent_string/3, interpolate_string/4, and string/4. I decided to port the Python test cases (because there are a lot of corner cases) and in the process discovered some subtle issues.
(Prologue: I’m proposing we follow Python’s semantics for some things because there’s a lot of practical experience there, and any new Python feature gets discussed at length before it is adopted.)
Let’s start with splitlines, which isn’t yet in the library but probably should be (and should be spelled split_lines). The related Python function is str.splitlines([keepends]), which recognizes a much larger set of line separators than just "\n". In addition, the keepends argument lets the caller control whether or not the end-of-line characters are kept.
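As a quick illustration of both points, here is a small Python snippet. The sample string is made up for this post; it mixes "\n", "\r\n" and a vertical tab ("\x0b"), all of which str.splitlines treats as line boundaries.

```python
# Made-up sample text mixing three different line boundaries.
text = "first\nsecond\r\nthird\x0bfourth"

# Default: the line-break characters are stripped from each piece.
without_ends = text.splitlines()

# keepends=True: each piece keeps its own line-break characters,
# so joining the pieces gives back the original string.
with_ends = text.splitlines(keepends=True)

print(without_ends)
print(with_ends)
print("".join(with_ends) == text)
```

Whether the Prolog predicates should accept the same large separator set is a separate question; the snippet only shows what Python does.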
The Python splitlines method returns an empty list for the empty string, and a terminal line break does not result in an extra line:
>>> "".splitlines()
[]
>>> "\n".splitlines()
['']
>>> "One line\n".splitlines()
['One line']
On the other hand, split("\n"), which is essentially the same as SWI-Prolog’s split_string(Str, "\n", "", Lines), gives:
>>> "".split("\n")
['']
>>> "\n".split("\n")
['', '']
>>> "Two lines\n".split("\n")
['Two lines', '']
Because dedent_string and indent_string split the string into lines, it’s important that we decide how to do this splitting. I propose that we add split_lines/2 and split_lines_with_ends/2 predicates and use those in dedent_string and indent_string.
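To make the intended behaviour concrete, here is a rough Python sketch of what the two proposed predicates might compute. The function names simply mirror the proposed predicate names and are hypothetical; for brevity the sketch only treats "\n" as a separator, which is an assumption and not part of the proposal.

```python
def split_lines_with_ends(text):
    # Hypothetical stand-in for the proposed split_lines_with_ends/2.
    # Only "\n" is treated as a line break here (a simplification).
    lines = []
    start = 0
    while start < len(text):
        end = text.find("\n", start)
        if end == -1:
            # Last line has no terminating newline.
            lines.append(text[start:])
            break
        lines.append(text[start:end + 1])
        start = end + 1
    return lines


def split_lines(text):
    # Hypothetical stand-in for the proposed split_lines/2:
    # the same pieces, with the trailing "\n" removed from each.
    return [line[:-1] if line.endswith("\n") else line
            for line in split_lines_with_ends(text)]


for sample in ["", "\n", "One line\n", "a\nb"]:
    print(repr(sample), split_lines(sample), split_lines_with_ends(sample))
```

With this definition the empty string gives an empty list and a terminal line break does not produce an extra line, matching str.splitlines on these inputs.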
There is an additional issue with dedent/indent: how to handle empty lines (that is, lines with only whitespace). I propose following the Python semantics: ignore empty lines when determining the dedent, and do not indent them (the latter can be changed by specifying a predicate that controls which lines are indented). For dedent, my experience is that this works nicely with multi-line strings in source code.
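For reference, this is how Python’s textwrap module behaves on a made-up example with one truly empty line and one whitespace-only line. It only shows the Python side; how dedent_string and indent_string should expose the predicate is left open.

```python
import textwrap

# Made-up sample: line 2 is empty, line 4 holds only two spaces.
source = "    first\n\n      second\n  \n    third\n"

# Whitespace-only lines are ignored when computing the common margin
# (4 spaces here) and come out as bare newlines.
dedented = textwrap.dedent(source)

# By default, whitespace-only lines are not indented.
indented = textwrap.indent(dedented, "> ")

# A predicate can override that; this one indents every line.
indented_all = textwrap.indent(dedented, "> ", predicate=lambda line: True)

print(repr(dedented))
print(repr(indented))
print(repr(indented_all))
```

Note that the two-space line does not reduce the dedent to two columns, which is what makes this convenient for multi-line strings embedded in indented source code.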