Autoload and qsave (swipl -c) don't go together

I’m using: SWI-Prolog version 8.1.21 for Linux

When I add :- set_prolog_flag(autoload, false). I get this:

$ /usr/bin/swipl --goal=pykythe_main --stand_alone=true --undefined=error --verbose=false \
    --foreign=save \
    -o /tmp/pykythe_test/pykythe.qlf -c pykythe/pykythe.pl
ERROR: qsave:qsave_toplevel/0: Unknown procedure: qsave_program/2

Without the autoload directive, it compiles fine. (It also runs fine with autoload when I don’t try to use a .qlf file.)
I think that this used to work, so has something changed recently?

Also, it appears that the autoload flag is not restricted to a module but has global effect. Is this behaviour as intended?

As we have seen before, disabling autoloading without loading everything explicitly most cripples the system. Don’t. I don’t think anything has changed, but it may have. There are no tests to verify behavior without autoloading.

That is intentional. Only the unknown flag is module specific. You can use that to make undefined predicates fail (and not try to autoload) in some module. I don’t think I ever used it …

Is there any reason to have the autoload flag, then? :wink:

(I volunteer to go through the library and add missing use_module directives; but that does put a burden on Jan to review the changes.)

I also do not want that. It causes everything to be loaded upfront, which slows down startup and makes programs unnecessarily big. What bothers you using the auto loader? The cross-referencer, PceEmacs and library(check) can tell you what will be autoloaded from where.

Eventually I would like to see something like to see use_module/1,2 not actually loading the file but instead use dynamic loading to load the code when needed. The main problem with this is that modules not only provide predicates, but also modify the environment by adding hooks. Some systems give up this flexibility but that comes with a lot of extra boiler plate. One of the options might be to have autoload directive with similar syntax than use_module/1,2 as a directive that tells the system “If you ever need this predicate, load if from this file”. That gives full control by the user, doesn’t harm image size and startup time and makes creating stand-alone programs easier as that would simply imply handling autoload/1,2 with the same semantics as use_module/1,2.

Been there :stuck_out_tongue: Jan at the time also didn’t want those changes for the reasons he explained in his reply.

Personally, I strongly dislike the autoloading mechanism and the single reason I don’t turn it off is that SWI-Prolog doesn’t even start without it enabled. Try e.g.

$ swipl -g "set_prolog_flag(autoload, false)"

Why the dislike? Although the autoloading mechanism saves typing and makes the source code a bit smaller, it strongly reinforces the idea of a single namespace for all predicates, independently of those predicates being encapsulated in modules. In fact and in practice, it imposes that idea. It works with the semantics of the use_module/1 directive. But this directive is flawed in that a simply addition of a new predicate to a module, that an existing application necessarily don’t use, can break the application.

These potential issues are avoided by programmers attempting to ensure that their exported predicates don’t clash with the exported predicates of other modules. This is inherently non-scalable. It also forces programmers to come out with alternative names instead of using the best possible names. E.g. the lists module has the exclusive of the member/2 predicate name. Everyone else, back off. Or complain loudly that an existing library module exports a predicate that a new library should have the right of use. That happens. E.g. it’s why ugraphs:transpose/2 become ugraphs:transpose_ugraph/2 so that we could have clpfd:transpose/2 :man_facepalming:t2:

It also results in programmers, newbies and not so newbies, having no idea of the dependencies of their applications (despite of the availability of tools, as Jan remarked, that can list those dependencies; but tools are only useful when programmers use them…). They often have no idea if they are using a built-in predicate, a library predicate, a deprecated predicate, a core library predicate, a third-party predicate, a foreign predicate (that may not behave and even be available on a different operating-system), …

Again, tools and documentation are available but are also too frequently ignored. More conscious programmers will of course use them. Still, as good as tools and documentation may be, the practical consequence is that of a single predicate namespace. To be fair, the root of these issues is actually the module system. But the autoloading mechanism strongly reinforces those negative traits.

Admittedly, these issues are more of a concern in large applications (and by large I mean applications where a single person would have trouble, if able at all, to keep all details in his/her head to avoid these issues). But that only leads to question the validity of mechanisms presented as solutions for programming in the large in the first place.

P.S. Take this post as reflection points, not as an attack on SWI-Prolog design choices.

2 Likes

The tools don’t work very well for me, partly because of my use of EDCGs and rdet (and Emacs).

Is there a good tool for xrefs in Emacs? When I use etags -l prolog, it doesn’t know about module declarations; so, for example append shows up with both library(lists) and library(rdb_diagram). This is, I suppose, a minor annoyance; but there’s a larger problem in that etags doesn’t do term expansion, so if there are any generated predicates, they don’t show in the xref.

As an example: maplist([S,A]>>atom_string(A,S), SubStrings, SubAtoms) would be puzzling to someone who doesn’t know about library(yapf).

1 Like

The original decision was made by Quintus; De mortuis nihil nisi bonum.

At the time, it seemed like a reasonable choice; but I’ve come to dislike it.

100% agree.

As another example, Python gives a choice between imports that modify the name space without any control (using from foo import *) and keeping the name spaces separate and explicit (import foo and from foo import a,b,c). The Google coding standards disallow modifying the name space (from foo import *) because it has proved to be too difficult to manage in a large codebase, even with sophisticated tools. By specifying use_module with an explicit import list, I can be sure that every unqualified name is defined somewhere in the source file (use_module is a kind of definition).

Anyway, I don’t want to impose my views on anyone; I merely want to be able to use my preferred style, which is to be explicit rather than implicit.

3 Likes

A possible option could be to use existing xref tools to write the tags file. As the tools act on the code after term-expansion, we would get a complete view.

2 Likes

Having had the pleasure to work with a Prolog system where you need to import everything explicitly I enjoy every minute I use SWI-Prolog twice as much :slight_smile: Apart from the fact that Prolog system tend to have different library structures and therefore you need a lot of :-if() to import your predicates from the right place depending on the Prolog system.

The debate is everywhere, even to the level of dynamic linking where the Windows/MacOS XCOFF binary demands a symbol to come from a specific library and most of the rest of the Unix/Linux world uses ELF which uses a global symbol table. Not that many people claim you can’t write large programs for Linux while any compiled language that uses the native binary format eventually is caught by the linker semantics.

IMO, you have tools for that. If you have a project big enough you create coding guidelines and setup CI. SWI-Prolog ships with plenty tools to help you write a program to verify these coding guidelines, by examining the source and/or compiled program.

@pmoura’s idea to write a tag file from the project compiled representation could be quite neat. Alternatively there is the discussion about the language server protocol. Can’t find it right now, but that could be a good solution as well.

Self reply is bad, but I would like to bring this to the attention again. The proposal would be:

  • Add :- autoload(File) and :- autoload(File, Imports). A different name can be discussed. Syntax and semantics are fully equivalent to :- use_module, except no loading happens (see below for exceptions). The directive adds (for example) a $autoload'(File, [Imports]) fact to the module.
  • Have an autoload flag with three values
    • true
      does what happens now, but with a small change. If a predicate is undefined, first see whether there is an autoload declaration in the target module. If so, load that file. If not, do the current autoload library search. This gives the same experience as the current system with some additional control to reduce (IMO mostly theoretical) conflicts.
    • explicit (better name?)
      only checks the explicit autoload declarations. This gives the full control that some people want with the small footprint and fast startup that autoloading provides.
    • false
      Handle :- autoload directives as :- use_module at compile time and disable all autoloading at runtime. This is probably mostly suitable for creating stand-alone programs.

I think the implementation of this should be nearly trivial, as in only somewhat longer than this post :slight_smile:

1 Like

Except when you use portable libraries. Those do exists (notably, in Logtalk). But indeed the Prolog world is a poster child of balkanization in programming languages. Porting code, in detriment of writing portable code, is, in the long run, just threading water.

That’s misleading and have little to do with what I wrote. The debate is not (and was never) about being able to write large programs using either philosophy but about the costs (upfront and long term; i.e. development, maintenance, resilience, reliability, testing, …) of what you construct with each approach. Coding guidelines (such as the ones Peter mentioned) come from the battlefield. My personal preference and thus bias comes form my experience with non-trivial Prolog codebases (and from being Prolog system agnostic, which makes me an outlier here and elsewhere, I guess).

I can live with these observations (without implying I agree). The question remains on how to support the various views as good as we can. So, does my proposal solve this? Is there something better?

I need to reflect on it a bit more. That said, I would much prefer (1) no new directives and (2) SWI-Prolog (and its tools) to being able to load without issues independently of the autoload flag.

1 Like

This directive could be used throughout the system libraries. Users preferring a coffee before their application is started can ignore it and use use_module :slight_smile:

That can only happen if we introduce an alternative to autoloading that pleases the people who care where predicates come from. Without we either have to preload all the usual goodies or write a wrapper for each of them that is preloaded, loads the real thing and calls it. That seems a horrible waste to me.

Controlled autoloading used throughout the system can probably reduce the footprint considerably as way to much of the current system uses explicit use_module directives, loading stuff that is never used.

Caring where predicates come from originates in practical software engineering concerns. For example, being able to work effectively with multiple implementations of the same protocol/interface instead of being stuck with whatever single implementation is provided or being forced to deal with multiple implementations with incompatible interfaces. Good examples are random and dictionaries libraries. There are several random distributions that are useful in practice. The random library implements a single one. The assoc and rbtrees libraries both implement dictionaries but with different interfaces making it painful to evaluate and switch or to use concurrently. Is not caring about from where predicates come from more than a “one size fits all” perspective on software development?

I wrote a language server protocol implementation for Prolog that I use mainly for xref in Emacs. With that installed, I also have (define-key lsp-ui-mode-map [remap xref-find-definitions] #'lsp-ui-peek-find-definitions) and (define-key lsp-ui-mode-map [remap xref-find-references] #'lsp-ui-peek-find-references) and it works fairly well for me.

3 Likes

People like magic until magic bites them in the ass. E.g.

(What “it” is being ignored here?)

If load time is important, why aren’t the SWI-Prolog libraries installed with their .qlf code? (At least, when I looked for *.qlf files in /usr/lib/swi-prolog, I only found only 7; but there are 182 library/*.pl files (729 including subdirectories.)
I’ve noticed 1-2 order-of-magnitude performance improvements when loading large .qlf files, so presumably .qlf would reduce the time spent drinking coffee?

1 Like

In the end because it helps only significantly for larger files that are loaded often. For short files the difference is not that big as the time to find, open the file and do all the related admin gets dominant. I never tried to find the optimum. This is probably also strongly OS and disk technology dependent.

And, this discussion is not only about time, but also about memory. I have a prototype in a branch providing controlled lazy loading. That seems to work really well. My guess is that that will reduce startup time and memory footprint and allow lazy loading in applications.

1 Like