Use_module sometimes loads from current working directory

Thanks Jan. If I understand the above change correctly I think it will probably complicate things further. I cast my vote earlier for a warning instead, possibly with a flag as you suggest, although I thought that would be too much hassle.

An option would be to let the warning inform the user that the current behaviour is likely to raise errors, rather than warnings, in the future. That would give some time to users to prepare.

I personally would prefer that because I have a few repositories that may raise an error with the new changes and it will take me some time to go through all of them and update them. A few of those are repos of papers’ experiments and I know that academics don’t like it when they have to debug code shared for reproducibility purposes (I don’t mind). I am pretty sure I lose users because they find an error and never report it, and just give up.

Edit: btw, I think this is affecting me because I’m using modules extensively. It’s not unlikely that I’m in a minority in this. In discussions with others, I do seem to be a bit of an exception in how often I put my code in modules (i.e. always, without exception).

Using modules or not is unrelated. This is just about finding a file from a file being loaded, be it a module file or not. I would hope you are not an exception here. I barely ever write files that are not modules. Only for small test and demo files do I make an exception and not bother with the module header.

Pushed 6f68a3b7917659c2543440d6645f4c3adad4cec1 to address this. It now prints a warning. Do not consider this the final answer, but do expect the old behaviour to be deprecated with a warning and to eventually disappear.

Thank you Jan. This should give me time to fix my projects. Hopefully, others too.

Every system of “include” that I’ve used has had its problems. Simpler is better (I’ve long ago given up understanding Python’s “import”, packages vs modules, and __init__ and I suspect that most Python programmers are the same).
SWI-Prolog (from Quintus) does “include” (use_module) better than most, in that it usually hasn’t surprised me and also hasn’t required using incantations that I don’t understand.

The notion of cwd is somewhat meaningless when a program is started by clicking on an icon, so I think it’s best to not have cwd in file searches (e.g., should cwd be the user’s “home”, where the program is, or where the clicked file is?)
Maybe we need something like Python’s os.path.realpath(__file__), which gets the source file (but should the source file be the .pl or .qlf or .sh?)

Some of the built-in packages have this in their tests: user:file_search_path(library, .); but that depends on the testing system setting cwd, so it’s fragile – something like Python’s __file__ is probably better (e.g. some additional Prolog flags beyond resource_database or home). FWIW, Google’s “bazel” build/test system typically requires that the user specifies the equivalent of cwd as a parameter to the test program (e.g.: $(BINDIR), $(GENDIR), $(location label )), so that there’s no ambiguity.

Let’s not conflate opening a generic data file (which is what happens here with the python script IMO) versus loading a unit of code. Python looks in directories on sys.path for code, and . is definitely not a member by default. Perl traditionally did have . on its list (whose name I forget now) but I think that changed recently.


Ian

Note that Prolog include/1 is fundamentally different as it does a textual include of the file, whereas use_module/1 creates a new module and imports the exports of this module into the current compilation unit. This, as said, is unrelated to finding the file though; that works the same.

Agree. That is what @maren raised and I agreed with. It just turns out there is code around that uses cwd :( My mistake from when I was young and naive :( On the other hand, using cwd was common practice in many systems and has since been abandoned by most, in part for security reasons, as it can be used to trick people into running code they did not expect.

That is absolute_file_name/3, no?

That can all be removed. They are leftovers from before CMake. It is testing only, so it is not dramatic.

That is Quintus/SWI prolog_load_context(file, File), no? You can only use this during compilation though, not at runtime. If you still want to know you either need to assert it during compilation or you can use source_file/2 of a predicate in the file. I tend to use the latter these days. It is fast and easy and I do not like asserta/1 from directives, also because you get more and more clauses when reloading.
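
A minimal sketch of the source_file/2 approach, assuming the code is loaded from a file rather than typed at the toplevel. The name my_source_dir/1 is made up for illustration and is not part of SWI-Prolog.

```prolog
% my_source_dir(-Dir) is semidet.
%
% Dir is the directory of the file that defines my_source_dir/1.
% This works at runtime and asserts nothing, so reloading the file
% does not pile up clauses.
% (my_source_dir/1 is a hypothetical name used for this example.)
my_source_dir(Dir) :-
    source_file(my_source_dir(_), File),
    file_directory_name(File, Dir).
```

A data file that lives next to the source could then be located with something like `my_source_dir(Dir), directory_file_path(Dir, 'data.txt', Path)`, without consulting the cwd.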

Yes, I think this is how I understand it also. Maybe it’s idiosyncratic but if I’m working on the command line I never have any doubts about what the cwd is. For example, if I cd to /usr and do an ls . I expect to see the contents of /usr. If I’m in doubt where I am there’s pwd. So in that case the cwd is clearly the path we could say the current terminal process is running from (even if that’s not what is really happening internally; at the user level, that’s what it looks like).

The problems begin if a process is started from a file called from another process. In that case there is no clear cwd: is it the file’s path, or the path of the process calling the file? I think that’s what you’re pointing out. But I don’t see how there can be an absolute, in the sense of objective, answer to that. @jan says above that most systems used to go one way, now they go the other way, so I guess it’s a kind of trend, like it’s a trend nowadays to “prefer composition over inheritance” in OOP circles (or has the trend changed yet?).

Btw, my original query, in the other post I linked, was about the different behaviour between ls and use_module/1 in SWI. It seems getting it right and consistent is going to be a big headache for @jan.

You are probably right, since I don’t use Python all that much and I’m not sure how it works in that sense. I left the Python example second in my comment because of my uncertainty about its default behaviour.

On that, it would be nice to have a clean_assert[az]/1 predicate or something like that, which asserts a term to the dynamic database but either first makes sure to retract any matching term, or fails with a (suppressible) warning if one is already present. As things stand now most of the time I wrap assert/1 and friends in a handler to do that, otherwise, like you say, it’s easy to end up with multiple copies of the same clause in the dynamic database.

More precisely, most of the time now when I manipulate the dynamic database I do it in a setup_call_cleanup/3 “frame” again just to be sure. That predicate is a godsend. The standard dynamic db manipulation predicates are too “raw” for safe use.

You might be interested in transactions. Notably, snapshot/1 ensures you can assert some things and have them automatically go away at the end.

?- asserta(foo(1)).
true.

?- findall(X, foo(X), Xs).
Xs = [1].

?- snapshot((
|    asserta(foo(2)),
|    findall(X, foo(X), Xs))).
Xs = [2, 1].

?- findall(X, foo(X), Xs).
Xs = [1].

It isn’t obvious what needs to be deleted though. All clauses of the same predicate? Clauses with a fully matching head? Clauses that produce the same answer? Something along these lines has crossed my mind many times, but I failed to find a good definition and I’m not aware of any established current practice. So, in the end you typically get a combination of retractall/1 and assert/1.
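
For reference, the retractall/1 plus assert/1 combination typically looks something like the sketch below. It picks just one of the possible definitions (replace every clause with the same key), and config/2 and set_config/2 are names invented for the example.

```prolog
:- dynamic config/2.

% set_config(+Key, +Value) is det.
%
% Remove any existing config(Key, _) clause before asserting the new
% one, so there is at most one value per key however often this runs.
% (config/2 and set_config/2 are hypothetical names.)
set_config(Key, Value) :-
    retractall(config(Key, _)),
    assertz(config(Key, Value)).
```

Here "matching" means "same first argument"; a different notion of matching needs a different retractall/1 pattern, which is exactly why a single general-purpose predicate is hard to define.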

Thanks for the tip. I’ve considered transactions before, but I was worried about portability. It seems though that I’m becoming more and more committed to SWI-Prolog as time goes by, so maybe I should take a second look.

That’s true, it’s dangerous to leave that to a default decision. Options could be added but that starts to look like an over-complicated predicate for something that should be simple. Still, I don’t like to ad-hoc it all the time.

So my current alternative is to check whether a more general version of a term is already in the database, and avoid writing the term in that case. But that, too, can cause unexpected and hard-to-debug failures. Anyway, here’s the code I use. The name comes from the fact that in Louise I mainly use it to write a learned program to the database for testing:

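%!	assert_program(+Module,+Program,-Clause_References) is det.
%
%	Assert each clause of Program into Module, skipping any clause
%	for which an equal or more general clause is already present.
%	Clause_References holds the references of the newly asserted
%	clauses only, so that they can be passed to erase/1 later.
%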
assert_program(M,Ps,Rs):-
	assert_program(M,Ps,[],Rs).

assert_program(_,[],Rs,Rs):-
	!.
assert_program(M,[A|P],Acc,Bind):-
	copy_term(A,A_)
	,numbervars(A_)
	,clause(M:A_,true)
	,!
	,assert_program(M,P,Acc,Bind).
assert_program(M,[C|P],Acc,Bind):-
	copy_term(C,H:-B)
	,numbervars(H:-B)
	,clause(M:H,B)
	,!
	,assert_program(M,P,Acc,Bind).
assert_program(M,[C|P],Acc,Bind):-
	assert(M:C,Ref)
	,assert_program(M,P,[Ref|Acc],Bind).

Then I erase the clause references with a loop around erase/1 when I’m done. I call both predicates in a setup_call_cleanup/3 call, with assert_program/3 in the setup and erase/1 in the cleanup.
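
The setup/cleanup pattern described here can be sketched roughly as follows. This is not the actual Louise code: with_clauses/3 and assert_all/3 are names invented for the example, and assert_all/3 is a simplified stand-in for assert_program/3 that asserts every clause without checking for duplicates.

```prolog
:- meta_predicate with_clauses(+, +, 0).

% with_clauses(+Module, +Clauses, :Goal)
%
% Assert Clauses into Module, run Goal once, then erase exactly the
% clauses that were added, even if Goal fails or throws.
% (Hypothetical helper; once/1 is used so that cleanup runs as soon
% as Goal has produced its first answer.)
with_clauses(M, Clauses, Goal) :-
    setup_call_cleanup(
        assert_all(M, Clauses, Refs),
        once(Goal),
        maplist(erase, Refs)).

% assert_all(+Module, +Clauses, -Refs)
%
% Simplified stand-in for assert_program/3: asserts every clause and
% collects the clause references in order.
assert_all(M, Clauses, Refs) :-
    findall(Ref,
            ( member(C, Clauses),
              assert(M:C, Ref)
            ),
            Refs).
```

A call such as `with_clauses(user, [p(1), p(2)], findall(X, p(X), Xs))` would see both clauses during the call and leave none of them behind afterwards.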

If you merely use them for simple cleanup, you can always go back to the classical way. If you use them to achieve isolation between threads, it becomes a different story. I guess there are three levels wrt. portability:

  • Is my code portable (ISO + common enough extensions)?
  • Can I quite easily make my code portable should this be necessary?
  • Is this feature hard to replace? Unfortunately, that holds for many features of SWI-Prolog, both functional and non-functional. Several others can be replaced though. You can implement =>/2 (SSU) on top of ISO Prolog and even for dicts you can achieve something that is fairly compatible.

Yep. Like I say, the concern of portability has become less and less easy to justify as I use SWI-Prolog more and more. Also, I’m lazy. I’ve considered porting Louise over to XSB to see if the tabling implementation is any different, but I can’t be bothered. I guess I will have to give transactions a go, once I have new code that doesn’t require me to rewrite and test everything again.

I think it’s always a good idea to have a portability layer for any kind of database manipulation in Prolog anyway.

I was hoping the warnings could be disabled with:
:- set_prolog_flag(source_search_working_directory, true).

That would have been an option. My long term target is still to delete this completely though. Simply silencing the warning may make people believe this will stay. Unless, of course, someone has a compelling reason to keep this flag (as in not compatibility, but some other setup that is hard to realise without searching CWD and that makes perfect sense).

If you want to get rid of the warning, use user:message_hook/3.
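
A rough sketch of what such a hook could look like. The message term below, cwd_load_warning(_), is a placeholder invented for this example; it is not the real term the loader prints, which would have to be looked up first (for instance by inspecting the first argument in a temporary hook).

```prolog
:- multifile user:message_hook/3.
:- dynamic   user:message_hook/3.

% user:message_hook(+Term, +Kind, +Lines)
%
% Succeeding here tells print_message/2 that the message has been
% handled, so nothing is printed.
% ASSUMPTION: cwd_load_warning(_) is a hypothetical placeholder and
% must be replaced by the actual message term of the warning.
user:message_hook(cwd_load_warning(_), warning, _Lines).
```

Matching on one specific term keeps every other warning visible.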

Right on, I agree.

What was going on is that I am scanning for files that use my fake version of initialize/1, now named now_and_later/1. It captures the result of absolute_file_name/3 when SWI-Prolog was started with a relative pathname, to figure out how to account for a possible directory change, so that when now_and_later/1 is run during restore I can compute how to resolve the directory and support the use of those rotten relative pathnames! Hah, yes, I was trying to design a workaround for the very problem you are creating warnings about.
