Integer wrap around semantics GNU [file system compat library]

The integer wrap-around semantics of GNU Prolog are probably gone.
I don’t know exactly, but the change log for 1.6.0 has something:

2023-07-06 daniel.diaz@univ-paris1.fr
improve other arithmetic error detection (in both integer and float functions
fix issue #47: add integer overflow detection
https://github.com/didoudiaz/gprolog/blob/8ce915bcb85c27c84645d6cff4108973e4fb2acd/ChangeLog
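
To make the difference concrete, here is a minimal sketch of wrap-around addition versus addition with overflow detection. The width of 61 bits is an assumption made for illustration (a tagged integer on a 64-bit build), and the function names are hypothetical; nothing here is taken from the GNU Prolog source.

```python
BITS = 61  # assumption: width of a tagged integer; adjust for the system at hand
MIN_INT = -(1 << (BITS - 1))
MAX_INT = (1 << (BITS - 1)) - 1


def add_wrapping(a, b):
    """Add with two's-complement wrap-around in a BITS-wide signed integer."""
    s = (a + b) & ((1 << BITS) - 1)          # keep only the low BITS bits
    return s - (1 << BITS) if s > MAX_INT else s


def add_checked(a, b):
    """Add, but raise instead of silently wrapping around."""
    s = a + b
    if not MIN_INT <= s <= MAX_INT:
        raise OverflowError("integer overflow")
    return s
```

With wrap-around, add_wrapping(MAX_INT, 1) gives MIN_INT; with detection, add_checked(MAX_INT, 1) raises an error instead.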

So I guess you don’t need to emulate wrap-around integers
if you want to provide more GNU support. And the analogue of
a library(file_systems) for SICStus Prolog compatibility, like here:

file_systems.pl – SICStus 4 library(file_systems)
https://www.swi-prolog.org/pldoc/doc/SWI/library/dialect/sicstus4/file_systems.pl

Would possibly also not be a full emulation. Only the file
system predicates from GNU would land in some SWI-Prolog library.
But I didn’t yet manage to compile GNU 1.6.0 so that it has the
same speed as GNU 1.5.0 or GNU 1.4.5. So I am still using
GNU 1.5.0 or GNU 1.4.5 whenever I am referring to GNU, not
yet the new GNU 1.6.0, which I haven’t managed to compile correctly.

Also I wonder whether this, from the SWI-Prolog source, is a
good idea, since many operating systems have mkdirs() natively.
But here it’s bootstrapped:

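make_directory_path_2(Dir) :-
    Dir \== (/),                     % only recurse while not at the root
    !,
    file_directory_name(Dir, Parent),
    make_directory_path_2(Parent),   % ensure the parent first
    E = error(existence_error(directory, _), _),
    % Create Dir; on an existence error, accept it if Dir is a directory by now.
    catch(make_directory(Dir), E,
          (   exists_directory(Dir)
          ->  true
          ;   throw(E)
          )).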

https://www.swi-prolog.org/pldoc/doc/SWI/library/filesex.pl?show=src#make_directory_path/1

It is possibly not an atomic operation. I am not sure what operating
system implementations of mkdirs() say, for example the
recursive=true parameter in JavaScript. And catch/3 might be slow.

In Java, if you use the File class and write something
with its exists() method, it’s not that slow, since the File class
has, behind the scenes, theoretically the opportunity to do some
caching. But if you have string/atom paths, there is the dilemma
that this string/atom is not an object that could do some caching.
You would need some separate cache, which then might lack
garbage collection, so that the scenario of creating millions of
directories can really become a problem.
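
One way around the missing garbage collection is to bound the separate cache. The following is only a sketch of that idea: the class DirCache, its limit and its checks counter are hypothetical names, and the cache can go stale if a directory is removed behind its back.

```python
import os
import tempfile
from collections import OrderedDict


class DirCache:
    """Remembers directories already seen to exist, up to a fixed limit."""

    def __init__(self, limit=1024):
        self.limit = limit
        self.known = OrderedDict()   # path -> True, in least-recently-used order
        self.checks = 0              # number of real file system checks

    def exists(self, path):
        if path in self.known:
            self.known.move_to_end(path)         # mark as recently used
            return True
        self.checks += 1
        if os.path.isdir(path):
            self.known[path] = True
            if len(self.known) > self.limit:
                self.known.popitem(last=False)   # evict the least recently used
            return True
        return False
```

Because old entries are evicted once the limit is reached, creating millions of directories does not grow the cache without bound; the price is an extra file system check for evicted paths.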

Edit 20.11.2023:
make_directory_path/1 is slow not only because of catch/3, but
also because of the recursive call to make_directory_path_2/1.
It’s left recursive! So ensure_directory/1 is faster if you have
somewhere an invariant that the parent directory already exists.
make_directory_path/1 as implemented above is always slower
than ensure_directory/1, provided file_exists/1 isn’t that slow,
since make_directory_path/1 implemented with left recursion
has effort O(n), where n is the number of directory segments, while
ensure_directory/1 has only effort O(1). You also don’t pollute
the Prolog system with new atoms from file_directory_name/2.
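
The O(n) versus O(1) argument can be illustrated by counting mkdir attempts. This is a rough sketch mirroring the shape of the quoted clause, not the SWI-Prolog implementation; the Python function names merely echo the Prolog predicates.

```python
import os
import tempfile


def ensure_directory(path):
    """One mkdir attempt; an error is fine if the directory exists afterwards."""
    try:
        os.mkdir(path)
    except OSError:
        if not os.path.isdir(path):
            raise


def make_directory_path(path):
    """Recurse on the parent first, then create path; return mkdir attempts."""
    parent = os.path.dirname(path)
    if parent == path:          # root (or empty relative prefix): nothing to do
        return 0
    attempts = make_directory_path(parent)
    ensure_directory(path)
    return attempts + 1
```

For a path with n segments, make_directory_path makes n attempts even when all parents already exist, whereas ensure_directory always makes exactly one.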

Why do I even look at make_directory_path/1, the source
in SWI-Prolog? Well, it could be a thingy for the other direction:
not a flow of predicate ideas from GNU to SWI, but vice
versa, from SWI to GNU. Namely, deliver a SWI-Prolog compatibility
library for use inside GNU Prolog. As of this writing it seems GNU Prolog
doesn’t have any make_directory_path/1 predicate.

Neither do my Prolog systems have such a predicate. But it
could have some use case. I also like the suggestion by brebs:

If you give a parent directory, you could also give this
parent directory to a make_directory_path/2, not only to a
make_directory/2. It would say when to stop. For example,
I wonder if mkdir -p, when issued as follows with a relative
path instead of an absolute path, tries to ensure the current directory?

# mkdir -p foo/bar

A fast implementation of mkdir -p would presumably stop at the current working
directory, which is assumed to exist anyway. A left recursive implementation
as done in SWI-Prolog does not stop at the current working directory.
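
A two-argument variant that knows where to stop could look roughly like this. The function make_directory_path_under is hypothetical, and it assumes the first argument already exists and that the second is a relative path with '/' as separator.

```python
import os
import tempfile


def make_directory_path_under(parent, relative):
    """Create `relative` below `parent`, which is assumed to exist already.

    Returns the number of mkdir attempts, one per segment of `relative`.
    """
    if os.path.isabs(relative):
        raise ValueError("expected a relative path")
    current = parent
    attempts = 0
    for segment in relative.split("/"):
        if segment in ("", "."):
            continue                      # skip empty segments and '.'
        current = os.path.join(current, segment)
        attempts += 1
        try:
            os.mkdir(current)
        except FileExistsError:
            if not os.path.isdir(current):
                raise                     # something else is in the way
    return attempts
```

Calling make_directory_path_under(os.getcwd(), "foo/bar") would then touch only foo and foo/bar and never walk up from the current working directory to the root.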

AFAIK, mkdirs() is not part of POSIX. The POSIX primitive is mkdir(), which takes a path name (and mode) and adds a single directory to the filesystem. It fails with EEXIST if there is something there (either a directory or some other entry), with ENOENT if the parent inferred from the given pathname does not exist, and with ENOTDIR if the inferred parent is not a directory (plus a bunch of permission, representation, resource, etc. errors).

So, at least on POSIX, the way to ensure a directory is

  • [optionally] check it exists. If so, done
  • Create it. If no error, done. If error, check that it exists. If so, done.

The first step could be omitted. The error-checking create is required because, even if the directory did not exist at the time of the check, it may be created between the two calls. The second step only needs to check for existence on an EEXIST error. That is precisely what the SWI-Prolog library does.
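
The two steps above can be sketched as follows. This is a minimal illustration in Python rather than the SWI-Prolog code; the function name ensure_directory is borrowed from the discussion, and FileExistsError is how Python reports EEXIST.

```python
import os
import tempfile


def ensure_directory(path, mode=0o777):
    """Make sure `path` is a directory, tolerating concurrent creation."""
    if os.path.isdir(path):          # optional fast path: already there
        return
    try:
        os.mkdir(path, mode)         # single-directory primitive
    except FileExistsError:          # EEXIST: something appeared at `path`
        if not os.path.isdir(path):
            raise                    # a file or other entry, not a directory
```

The existence check inside the handler is what makes the operation safe when another process creates the directory between the first check and the mkdir call.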

The more interesting question is what a high-level language that attempts to be as OS neutral as possible should provide. It would be a great topic for the PIP initiative started at the ICLP in London. It might turn out to be hard. Luckily, the number of fundamentally different file systems that are still relevant has decreased a bit, and we can learn from many other languages.

I’d suggest simply doing whatever Python does with os.makedirs and os.mkdir.

That is surely a good starting point. Would be nice to have agreement between at least a number of important Prolog systems … Many systems seem to have something that is pretty usable, but no two systems agree on even the basic names, basic semantics and error handling. As acting on errors is not uncommon, some agreement would be desirable. Possibly even better is to define more high level primitives that do not require acting on errors (such as ensuring a directory exists).

The reason I suggested this is that Python has a pretty good process for proposing designs and reviewing them (PEPs, etc.), so typically a lot of thought has gone into the various APIs. (os.makedirs and os.mkdir are fairly old, so they might not have had as much review.)

Agreed that higher-level APIs would be good, such as ensure_directory/1, but even there, some kind of error handling needs to be defined (e.g. ensure_directory('/foo/bar') would typically be an error), and details of the access modes would also need to be defined. (AFAICT, ensure_directory(foo, 0o777) would be the same as Python’s os.makedirs('foo', mode=0o777, exist_ok=True), so we could use the usual SWI-Prolog conventions for options, e.g. makedirs("foo", [mode(0o777), exist_ok(true)]).)
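
For reference, this is roughly how the Python calls mentioned here behave. It is a small sketch run in a temporary directory; the variable names are only for illustration, and the effective permissions also depend on the process umask.

```python
import os
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "foo", "bar")

# Creates foo and foo/bar; missing parents are made along the way.
os.makedirs(target, mode=0o777, exist_ok=True)
# With exist_ok=True a second call on an existing directory is not an error.
os.makedirs(target, mode=0o777, exist_ok=True)

# os.mkdir creates a single directory and does not tolerate an existing one.
try:
    os.mkdir(target)
    mkdir_error = None
except FileExistsError as e:
    mkdir_error = type(e).__name__

# exist_ok=True still reports an error if the path is a regular file.
file_path = os.path.join(base, "plain_file")
open(file_path, "w").close()
try:
    os.makedirs(file_path, exist_ok=True)
    file_error = None
except FileExistsError as e:
    file_error = type(e).__name__
```

So even the "ensure" flavour has one error case that a Prolog counterpart would need to specify: the path exists but is not a directory.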