Careful with sub_atom/5 and feature request last_sub_atom/5

Would your regular expression library then provide a new backward
search option? Only a few regex engines can do that. Right now,
Java cannot do it :crazy_face:. But .NET, for example, can:

Gets a value that indicates whether the regular expression searches from right to left.
https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.righttoleft?view=net-6.0
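For engines without a right-to-left option, the last match can be emulated by scanning left to right and keeping the final hit. The following is only a minimal sketch in Python's `re` module (not .NET, and not any Prolog library); the helper name `last_match` is hypothetical. It returns the last non-overlapping match, which agrees with a true right-to-left search for single-character patterns like the separator class here, but not necessarily for arbitrary patterns.

```python
import re

def last_match(pattern, text):
    """Return the last non-overlapping match of pattern in text, or None."""
    last = None
    # The engine still scans left to right; we just remember the final hit.
    for m in re.finditer(pattern, text):
        last = m
    return last

path = "foo\\bar/baz.p"
m = last_match(r"[\\/]", path)   # either separator
print(m.start())                 # 7
print(path[m.end():])            # baz.p
```

This visits every match, so it is linear in the input rather than starting from the right end, which is exactly the cost a native backward search would avoid.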

Peter Ludemann has already shown how to solve the problem with
left to right search. But right to left seems not to be available:

But this is rather clumsy and not so fast. Another funny idea is to use
atomic_list_concat/3. I get the following, which is also not extremely efficient
and puts a burden on the atom table:

?- atomic_list_concat(L, '/', 'foo/bar/baz.p'), last(L, X).
X = 'baz.p'.

What's also lacking in the above version is searching for '/' and '\\'
at the same time. OK, one could do the following:

dos2unix(A, B) :-
    atomic_list_concat(L, '\\', A),
    atomic_list_concat(L, '/', B).

?- dos2unix('foo\\bar\\baz.p', B), atomic_list_concat(L, '/', B), last(L, X).
X = 'baz.p'.

Originally I was doing the dos2unix/2 thingy in my new system, before
I had last_sub_atom/5. Now I am doing it more directly. I don’t have issues
with the atom table, but the issues remain the same with or without an atom
table: the dos2unix/2 thingy creates a lot of objects, putting pressure on the
garbage collector of the host programming language, which handles them
in my new system. Maybe a Prolog system allocates the strings in its
environment, but that then puts pressure on environment trimming. But
what ultimately made me dismiss dos2unix and go for OS polyglot is
the problem of retrieving the directory part without normalizing the file
path into either DOS or Unix form. For this, last_sub_atom/5 is ultra handy.
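To illustrate the idea of splitting at the last separator without normalizing first, here is a rough Python sketch. It is not the implementation of last_sub_atom/5; the function name `split_polyglot` is made up for this example, and it ignores drive letters and root directories.

```python
def split_polyglot(path):
    """Split path at the last '/' or '\\' without rewriting any separator."""
    # rfind returns -1 when a separator is absent, so max() picks the
    # rightmost separator of either kind, or -1 if there is none.
    i = max(path.rfind("/"), path.rfind("\\"))
    if i < 0:
        return "", path
    return path[:i], path[i + 1:]

print(split_polyglot("foo\\bar\\baz.p"))  # ('foo\\bar', 'baz.p')
print(split_polyglot("foo/bar/baz.p"))    # ('foo/bar', 'baz.p')
```

Only the two result slices are allocated, and the directory part keeps whatever separators the input had.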

It is not clear to me what we are solving here. If it is just path manipulation, SWI-Prolog has high level predicates for that. Just combine prolog_to_os_filename/2 and file_base_name/2. The typical application quickly maps file names to the POSIX convention and only back to the OS convention when passing as argument to external programs or presenting them to the user. That way you do not need to worry about these issues everywhere.

sub_atom/5 can search, but indeed in a rather limited way. I have my doubts whether blowing it up into a much more flexible search is such a great idea.

Another hack is this. It does 1M of them in 0.893 sec; file_base_name/2 does it in 0.568 sec (without mapping DOS to POSIX first).

split_string(Input, "\\/", "", Parts),
last(Parts, FileS),
atom_string(File, FileS).

Now node.js tells me that it's a win32 thingy?
The path.posix repertoire works like SWI-Prolog:

> path = require('path');
> path.posix.dirname('foo/bar/baz.p')
'foo/bar'
> path.posix.dirname('foo\\bar\\baz.p')
'.'

The path.win32 repertoire does the OS polyglot thingy:

> path.win32.dirname('foo/bar/baz.p')
'foo/bar'
> path.win32.dirname('foo\\bar\\baz.p')
'foo\\bar'

Edit 01.06.2022:
There is quite some OO and circular data structure magic going on. But
we don’t need Prolog rational term unification; the JavaScript (===)/2
just compares the address? On Windows I get:

> path === path.win32
true

And on WSL I get:

> path === path.win32
false

If node.js does it the way Python does it, there are both win32 (or NT) and posix versions of the path manipulation functions, and on a particular platform, the defaults are assigned appropriately. This is what I get with a Linux version of Python:

$ python3
Python 3.12.0a0 (heads/main:7e46ae33bd, May 13 2022, 12:25:27) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.sep
'/'
>>> os.path
<module 'posixpath' (frozen)>

>>> import ntpath
>>> ntpath.sep
'\\'

>>> import posixpath
>>> posixpath.sep
'/'
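The same comparison as the node.js one above can be repeated in Python, since both modules can be imported on any platform. This is just a small sketch using the standard `posixpath` and `ntpath` modules. One difference from node.js: `posixpath.dirname` returns an empty string rather than `'.'` when it finds no separator.

```python
import ntpath
import posixpath

for p in ("foo/bar/baz.p", "foo\\bar\\baz.p"):
    # posixpath only knows '/', ntpath accepts both '/' and '\\'
    print(repr(posixpath.dirname(p)), repr(ntpath.dirname(p)))

# 'foo/bar' 'foo/bar'
# '' 'foo\\bar'
```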

It should be fairly easy to add a “reversed” option to library(pcre) and probably pretty efficient (in C). But it could be a bit tedious to make this change, sprinkling “if-reversed” tests throughout the code.

There is a “lookbehind” pattern that might do some of what you want; see the pcre2pattern specification.
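As a rough illustration of the lookbehind idea, here is a sketch in Python's `re` module, which is not PCRE but supports the same fixed-width `(?<=...)` syntax. The names `BASENAME` and `base_name` are hypothetical. The engine still scans from left to right; the lookbehind only requires a separator immediately before the final component.

```python
import re

# A run of non-separators that ends the string and is preceded by a separator.
BASENAME = re.compile(r"(?<=[\\/])[^\\/]*$")

def base_name(path):
    """Return the part after the last '/' or '\\', or path itself if none."""
    m = BASENAME.search(path)
    return m.group() if m else path

print(base_name("foo\\bar/baz.p"))  # baz.p
print(base_name("baz.p"))           # baz.p
```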