Symbolic links and real file names [mainly Windows Platform]

Don’t you register sources by their real path? Maybe then the problem
goes away. For example, GNU Prolog has a predicate file_property/2
which can give you the real path:

file_property(+atom, ?os_file_property)
real_file_name(File): File is the real file name of PathName (follows symbolic links).
http://www.gprolog.org/manual/gprolog.html#file-property%2F2

Some weeks ago I was looking for a file_property/2 predicate in SWI-Prolog,
but couldn’t find one. There was only SICStus 4 library(file_systems), which
doesn’t provide the real file name in its file_property/2 predicate.

The real_file_name/1 property is a little nasty: to follow
links, the operating system or the library that provides the predicate
needs to read the links, and may therefore raise an I/O exception
as a result of its execution.

absolute_file_name/3 seems to be the equivalent. There are various options, including checking whether you have appropriate access to the file, whether to check that the file exists (failing or raising an error if it doesn’t), and whether to return the first solution or all solutions. See also access_file/2. Other file access predicates are in the file section of the manual and library(filesex).

Not necessarily, absolute_file_name/3 might only cover this property:

  • absolute_file_name(File): File is the absolute file name of PathName (section 8.26.1).

And not this property of GNU Prolog’s file_property/2:

  • real_file_name(File): File is the real file name of PathName (follows symbolic links).

These are two different concepts in operating systems.

Edit 13.10.2023
The documentation of SWI-Prolog’s absolute_file_name/3 also has this remark:

Various
Note that this predicate does not resolve symlinks to their actual canonical path.

For example, on WSL2, SWI-Prolog’s absolute_file_name/2 doesn’t follow
any links. You can try this simple scenario yourself:

$ ls -la foo
f4.pl -> ../bar/f3.pl
$ ls -la bar
f3.pl

And now SWI-Prolog gives me:

$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.4.2)
?- absolute_file_name('foo/f4.pl', X).
X = '/home/user/foo/f4.pl'.
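
To see the two concepts side by side outside Prolog, here is a rough Python sketch that rebuilds the same foo/bar layout in a scratch directory. Python is only used for illustration; absolute_and_real and demo are made-up helper names, and the sketch assumes a POSIX-like system where creating symbolic links needs no special privileges.

```python
import os
import tempfile

def absolute_and_real(path):
    """Return (absolute, real): abspath only normalizes, realpath follows links."""
    return os.path.abspath(path), os.path.realpath(path)

def demo():
    # Rebuild foo/f4.pl -> ../bar/f3.pl in a scratch directory (assumption:
    # the platform lets us create symbolic links).
    root = os.path.realpath(tempfile.mkdtemp())
    os.mkdir(os.path.join(root, "foo"))
    os.mkdir(os.path.join(root, "bar"))
    target = os.path.join(root, "bar", "f3.pl")
    open(target, "w").close()
    link = os.path.join(root, "foo", "f4.pl")
    os.symlink(os.path.join("..", "bar", "f3.pl"), link)
    absolute, real = absolute_and_real(link)
    return link, target, absolute, real
```

In this sketch the absolute name stays at foo/f4.pl, like the absolute_file_name/2 answer above, while the real name ends up at bar/f3.pl.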

The notion of “realpath” is rather complicated on Unix systems. Ok, we can distinguish symbolic links from real files, but real files may appear in multiple places in the filesystem due to hard links as well as mounts. Now less popular, but in the old days automounting (home) directories was quite popular, which made your home directory appear on different “real” paths on different machines while the home was always accessible as /home/user.

SWI-Prolog uses realpath on Windows. On other systems it uses a table of canonical directories. Once it has decided that a directory has a certain path, it returns this same path for any directory that is the same physical directory (identified by the POSIX device and inode). At least, that is the idea. I’m not 100% sure the logic is always correct, but it seems to work well in practice.
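
A minimal sketch of the canonical table idea, written in Python for illustration only. This is not SWI-Prolog's actual code: the CanonicalDirs class and its first-path-wins policy are assumptions based on the description above, and it assumes a POSIX-like system.

```python
import os
import tempfile

class CanonicalDirs:
    """Hypothetical table: the first path seen for a physical directory wins."""

    def __init__(self):
        self._table = {}  # (st_dev, st_ino) -> first registered path

    def canonical(self, directory):
        st = os.stat(directory)  # follows links, so aliases share one key
        key = (st.st_dev, st.st_ino)
        return self._table.setdefault(key, os.path.abspath(directory))

def demo():
    root = os.path.realpath(tempfile.mkdtemp())
    real_dir = os.path.join(root, "bar")
    alias = os.path.join(root, "alias")
    os.mkdir(real_dir)
    os.symlink(real_dir, alias, target_is_directory=True)
    table = CanonicalDirs()
    first = table.canonical(alias)      # registered first, becomes canonical
    second = table.canonical(real_dir)  # same physical directory, same answer
    return alias, first, second
```

Note that with this policy the canonical path is whichever name was seen first, which need not be the path that realpath would give.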

GNU Prolog fails to follow a file symlink on Windows. For this scenario:

>dir foo
<SYMLINK>      f4.pl [..\bar\f3.pl]

GNU Prolog only returns:

/* GNU Prolog 1.5.0 */
?- file_property('C:\\<dir>\\foo\\f4.pl', X).
X = real_file_name('C:/<dir>/foo/f4.pl') ? 

Same with SWI-Prolog, which only returns:

/* SWI-Prolog 9.1.16 */
?- absolute_file_name('C:\\<dir>\\foo\\f4.pl', X, [access(read)]).
X = 'c:/<dir>/foo/f4.pl'.

I guess this also implies that SWI-Prolog uses lstat? Or does it use stat?
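
For readers unsure about the difference the question hinges on, here is a small Python sketch of lstat versus stat. It says nothing about which call SWI-Prolog uses internally; link_view is a made-up helper, and the inode comparison assumes a POSIX file system where a symbolic link has its own inode.

```python
import os
import stat
import tempfile

def link_view(path):
    """Return (lstat sees a link, stat sees a link, inodes differ)."""
    l = os.lstat(path)  # describes the directory entry itself
    s = os.stat(path)   # follows symbolic links to the final target
    return stat.S_ISLNK(l.st_mode), stat.S_ISLNK(s.st_mode), l.st_ino != s.st_ino

def demo():
    root = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(root, "f3.pl")
    open(target, "w").close()
    link = os.path.join(root, "f4.pl")
    os.symlink(target, link)
    return link_view(link), link_view(target)
```

Only lstat can tell that f4.pl is a link; stat silently reports on f3.pl instead.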

Edit 14.10.2023
For example, Node.js does a better job on Windows than GNU Prolog and
SWI-Prolog: using the file system function realpathSync(), I get:

/* Node.js v20.7.0 */
> fs.realpathSync("C:\\<dir>\\foo\\f4.pl");
'C:\\<dir>\\bar\\f3.pl'

But Node.js is also a little weak. For example, unlike Java, it cannot
return the on-disk spelling of a name that was matched ignoring case. So if a file
is named F5.pl in the file system and we look up f5.pl, Java can produce F5.pl whereas Node.js cannot.
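
One way to recover the on-disk spelling by hand is to scan the parent directory and compare names ignoring case. The Python sketch below shows that idea only; it is not how Java or Node.js do it, and on_disk_name is a made-up helper.

```python
import os
import tempfile

def on_disk_name(directory, name):
    """Return the directory entry matching name case-insensitively, or None."""
    wanted = name.casefold()
    for entry in os.listdir(directory):
        if entry.casefold() == wanted:
            return entry  # spelling exactly as stored in the file system
    return None

def demo():
    root = tempfile.mkdtemp()
    open(os.path.join(root, "F5.pl"), "w").close()
    return on_disk_name(root, "f5.pl"), on_disk_name(root, "missing.pl")
```

On a case-sensitive file system several entries may match; this sketch simply returns the first one it encounters.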

Moved as this is barely related to the original topic. For what it is worth, SWI-Prolog

  • Uses GetFullPathName() on Windows. I cannot test symbolic links as my VM runs Windows 11 Home and it seems you cannot enable symbolic linking on Windows Home.
  • On POSIX systems, it leaves the file name alone and walks up the directory hierarchy until it finds a directory (based on device and inode) that is considered “canonical”. It then registers all directories below the known canonical one into the canonical table. This does avoid confusion in a number of scenarios, but surely not all.
  • Provides same_file/2 to check that two file entries are the same physical thing. That uses dev/inode on POSIX and something comparable on more recent Windows versions (and GetFullPathName() otherwise).
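
The dev/inode comparison from the last bullet can be sketched in a few lines of Python. This illustrates the POSIX idea, not the implementation of same_file/2; the same_file function below is a made-up stand-in (Python's own os.path.samefile does the same comparison), and the sketch assumes a file system that supports both hard and symbolic links.

```python
import os
import tempfile

def same_file(a, b):
    """True if both paths name the same physical file (device and inode match)."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)

def demo():
    root = os.path.realpath(tempfile.mkdtemp())
    original = os.path.join(root, "f3.pl")
    open(original, "w").close()
    hard = os.path.join(root, "hard.pl")
    soft = os.path.join(root, "soft.pl")
    other = os.path.join(root, "other.pl")
    os.link(original, hard)     # hard link: a second name for the same inode
    os.symlink(original, soft)  # symbolic link: os.stat follows it
    open(other, "w").close()
    return (same_file(original, hard), same_file(original, soft),
            same_file(original, other), os.path.realpath(hard) == original)
```

The last value in the result is the interesting one: a hard link is the same physical file, yet its real path is still its own name, so comparing real paths cannot replace a dev/inode check.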