Identity and facts

I am building something that can, like many other things, basically be reduced to the idea of a todo list :slight_smile: I’d like items, however, to contain not only fixed fields like text and a done flag, but for each item to actually be a collection of Prolog facts (and potentially rules).

What is the best way to model the identity of the items in a Prolog system? I am thinking the easiest way would be to have each item identified by a UUID, and then, when the user asserts e.g. urgent. for that item, rewrite it to urgent(uuid_of_the_item, blue).
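As a sketch of the UUID idea: SWI-Prolog ships library(uuid), which provides uuid/1. The helper names below (new_item/2, mark_urgent/2, item_text/2) are made up for illustration only:

```prolog
:- use_module(library(uuid)).        % SWI-Prolog's UUID generator

:- dynamic item_text/2, urgent/2.

% new_item(+Text, -Id): mint a fresh UUID and record the item's text.
new_item(Text, Id) :-
    uuid(Id),
    assertz(item_text(Id, Text)).

% The user-level assertion `urgent.` for an item is rewritten to a
% fact tagged with that item's UUID.
mark_urgent(Id, Colour) :-
    assertz(urgent(Id, Colour)).
```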

I was wondering if there are libraries that deal with something like this, or alternative systems that explore the concept?

Could engines be used for something like this, with each engine representing one item hence having global identity?

If an item is a collection of facts and rules, a module seems to be the first thing to consider. Engines capture an execution state (stacks and goals) while a module captures a program (facts and rules). You can represent your facts and rules as thread-local predicates, which means they are also “engine local”. Lots of options :slight_smile:
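A minimal sketch of the per-item module idea (the helper names are invented here): asserting into a module-qualified goal creates the module on first use, so each item can own its facts:

```prolog
% Assert a fact into the item's own module; the module springs into
% existence on first use.
item_assert(Item, Fact) :-
    assertz(Item:Fact).

% Query a fact in an item's module, failing silently when the item
% never defined the predicate at all.
item_holds(Item, Fact) :-
    catch(Item:Fact,
          error(existence_error(procedure, _), _),
          fail).

% ?- item_assert(item_42, urgent(blue)).
% ?- item_holds(item_42, urgent(C)).
```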

Note that modules are not expensive in terms of memory (or time) and you can have millions of them. They live forever though. You can make them empty, but you cannot delete them. That also holds for predicates: you can remove their clauses but not the predicate itself.

Can a query be run across modules, e.g. ?- M:urgent.?

Not directly. I’d normally populate a fact that holds the modules that represent a “world” and then call world(W), W:urgent. That doesn’t scale too well though, as it will go through all the worlds. Probably still fine with thousands, but with millions it gets unusable.
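That pattern might be sketched like this (world/1 and urgent_world/1 are illustrative names, not an existing API):

```prolog
:- dynamic world/1.

% Register each module that represents a world.
add_world(W) :-
    assertz(world(W)).

% Enumerate the worlds in which urgent/0 holds.  This is linear in the
% number of worlds: fine for thousands, unusable for millions.
urgent_world(W) :-
    world(W),
    catch(W:urgent,
          error(existence_error(procedure, _), _),
          fail).
```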


I still haven’t figured out how to do this in SWI-Prolog without raising errors, unless urgent/0 is private to each world-module. Which always seems unsatisfying. If we have “different worlds”, surely some should have the same kinds of entities, without there being some conflict?

Very interesting! The concept of a module as a program comes very close to explaining the existence of Prolog modules in terms of a formal concept from logic programming - a logic program is, after all, a set of clauses. But it is harder to explain in formal terms what a module interface is, exactly.

Perhaps this is one reason I’m having trouble with it? Maybe the definition of a module is left a bit ad hoc and it’s therefore not very easy to reason about?

In any case I know that every time I’ve tried to realise the idea of modules as “multiple worlds”, I’ve run into trouble and obscure errors. And I keep trying…

When using modules as worlds, one typically does not export the interface. The interface is simply the predicates that are defined inside the module. You can make the interface more explicit either by exporting as usual but loading the module using :- use_module(File, [])., i.e., not importing the interface, or by using public/1 to define the interface. Except for allowing you to reason about and document the interface, neither has any impact on how the interface functions, i.e., you can still call any predicate defined in the module directly.

If all worlds define a fixed set of entry points this should work fine. If the worlds are more or less arbitrary logic programs that may not define the entire interface, the Prolog flag unknown may be used to make non-existing predicates fail silently. This flag is module-specific, so you typically set it in each of the worlds.
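For example, a world module’s source file might set the flag with a directive. This is a sketch, relying on the unknown flag being local to the module in which it is set:

```prolog
:- module(world_1, []).

% Undefined predicates in this module now fail silently instead of
% raising an existence_error.
:- set_prolog_flag(unknown, fail).

% This world happens to define urgent/0; a world that omits it will
% simply fail on ?- world_1:urgent.
urgent.
```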

Also useful may be add_import_module/3 and friends that define from where it imports undefined predicates.
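For instance, a shared defaults module could back all worlds (the module names here are hypothetical):

```prolog
% A module providing fallback definitions shared by every world.
:- module(world_defaults, []).

urgent :- fail.      % by default, nothing is urgent

% Elsewhere, make world_1 resolve predicates it does not define
% itself through world_defaults:
%
% ?- add_import_module(world_1, world_defaults, start).
```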

There are lots of options :slight_smile:

Thanks, this hadn’t occurred to me.

I don’t know if what I’m trying to do is a very niche thing or not. Maybe it is, and that’s why it’s hard to find a reasonably standard way to do it. In the past I tried a couple of different approaches I could think of, but they turned out to raise errors in later versions of SWI-Prolog. They were likely a bit hackish.

My latest plan is to define a “world-interface” module that imports world modules via load_files/2 with the options module(world) and redefine_module(true). So the world module is a dynamic module named “world”; the actual world modules are module files declaring modules named “world_1”, “world_2”, etc. (or something more meaningful!); and each world-module is, if I understand how redefine_module(true) works, more or less destroyed when a new world-module is loaded. All the while, code can refer to predicates defined in the current world-module with the prefix world:, even if those are not directly exported and regardless of the source file those predicates are actually loaded from. In the past I tried to do this by hand with unload_file/1, but that was one thing that caused errors.

I also recently found out about in_temporary_module/3, which might be of help.

When I’ve got something that I think works the way I want it to, this time I’ll post here and ask to make sure that it won’t change in the future. That can be a little traumatic when it happens :slight_smile:

Maybe this is a stupid suggestion, but why not just add the “world” to the facts? E.g.:

urgent(world1, id, blue).

and in another module:

urgent(world2, id, blue).

If you specify multifile, you can load multiple “worlds”. And you can explicitly set the module name in your facts, so that multiple files can export to the same module. (Although this is probably a misuse of the module system …):

file1.pl:

:- module(file1, []).
:- multifile worlds:urgent/3.
worlds:urgent(world1, id, blue).

file2.pl:

:- module(file2, []).
:- multifile worlds:urgent/3.
worlds:urgent(world2, id, blue).

I think multifile was one of the first things I tried way back when I first grappled with this. I think the reason I rejected it then is that I don’t want the multifile definitions living side-by-side in the database. I want to have interchangeable worlds, but only one world loaded at a time. So I wouldn’t want all the urgent/3 definitions in the database, but only one at a time.

Sorry, I know this is not clear at all. I kind of hijacked the Original Post here with my own problem and didn’t even explain my problem very well.

OK, so here’s my use case. I’m working on this ILP system called Louise. The
datasets to train Louise are organised in “experiment files” that are Prolog
modules with a common public interface exporting four “interface predicates”:
background_knowledge/2, metarules/2, positive_example/2 and
negative_example/2 (the last two are generators). Each experiment file may
include background knowledge, metarules and examples for more than one learning
problem.

Which experiment file is currently loaded is controlled with an entry in a
configuration file. In this configuration file there are clauses of the
predicate experiment_file/2. The first argument is the path to the Prolog
source file holding the experiment file module; and the second argument is the
name of the module. The module name is a separate argument to allow for modules
whose name doesn’t match their file name, which is sometimes useful. Here’s an example:

experiment_file('path/to/my/file.pl', module_name).

Now, the way it worked until Swi 8.2.1 was that when I wanted to load a new
training dataset, I would change experiment_file/2 to point to the new file,
then reload the configuration module. That would call a directive :-reload. at
the end of the configuration module that a) unloaded the previously loaded
experiment file module, b) abolished all its definitions of the four interface
predicates, c) loaded the new module and d) asserted to the dynamic database a
term holding the path and name of the newly loaded experiment file module (so
that reload/0 would know where to find the definitions of the interface
predicate to abolish, upon the next call).

The upshot of all this was that, once reload/0 was done, the definitions of
the four interface predicates in the dynamic database were now the ones in the
experiment file listed in the latest experiment_file/2 term. The definitions
in previous experiment files were abolished. The point was to know which dataset
I was currently training with, so a kind of manual garbage collection if you
will. Unfortunately, this seems to have been a bit of a hack and it stopped
working after 8.2.1. “It stopped working” in the sense that if I performed the
set of steps described above, after the first two or three times, I’d start
seeing errors about redefining a module.

My original hack worked well for my needs, but I haven’t been able to do what I
want as well ever since.

Thank you for your earlier helpful reply and apologies for the confusion!


I don’t know how things worked pre-8.2.1, but my guess is that the method of figuring out which clauses are being reloaded changed. See predicate_property/2, nth_clause/3, clause_property/2.

One possible solution is to define your predicates as dynamic (see dynamic/2), then delete them all before reloading (using either abolish/1 or retractall/1).
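A sketch of that approach, using the interface predicates from this thread (module and predicate names as above):

```prolog
% Declare the interface predicates dynamic in the experiment_file
% module, so their clauses can be wiped cleanly before a reload.
:- dynamic experiment_file:background_knowledge/2,
           experiment_file:metarules/2,
           experiment_file:positive_example/2,
           experiment_file:negative_example/2.

% Remove all clauses of the interface predicates; safe to call even
% when no experiment file has been loaded yet.
clear_experiment_file :-
    retractall(experiment_file:background_knowledge(_, _)),
    retractall(experiment_file:metarules(_, _)),
    retractall(experiment_file:positive_example(_, _)),
    retractall(experiment_file:negative_example(_, _)).
```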

The design is getting a little clearer. A small self-contained example would help even more. I understand the experiment module setup. I do not really understand why experiment_file/2 needs a module name. Also, as you say in another post, you have now simply fixed this to world.

It is not clear to me where the ILP system generates its programs. Is this also in the world module? If all is contained in a single module, the temporary module API gets interesting. It is intended to create a temporary isolated world that can use (import) from the rest of the system, but nothing is supposed to rely on the temporary module. Notably, you should not import from it.

I do think that (temporary) modules are the right way to deal with worlds for ILP. It helps a lot if you can come up with something concrete that we can load and run (it might of course not “work”).

Hi Jan,

It’s likely that including the name of the module as an argument to experiment_file/2 points to some sloppiness in the code.

Fixing the experiment file module name is something I contemplated as a way to achieve what I want. I gave it the name world to keep in line with the original post in this thread (which I hijacked - sorry!). In Louise, I’d fix that name to experiment_file.

I haven’t actually implemented that scheme yet, the one where the experiment file name is fixed. But I do have a proof-of-concept example, that I share below. Note that I tried this outside Louise, in a fresh project, because the experiment file concept is entangled with everything in Louise and it will take some time to actually implement it and see if it really works. So I’m not 100% sure it’s exactly what I want. But, it seems to be close.

Below is the example of the scheme I plan to use in Louise. In the example, each module is a separate Prolog source file with the same base name as its module name.

% Responsible for loading the experiment file listed in the configuration.
:-module(load_experiment_file, [load_experiment_file/1
                               ]).

load_experiment_file(F):-
        load_files(F, [module(experiment_file)
                      ,redefine_module(true)
                      ]).


% An experiment file module
:-module(experiment_file_1, [background_knowledge/2
                            ,metarules/2
                            ,positive_example/2
                            ,negative_example/2
                            ]).

% Experiment file interface predicates.
background_knowledge(t/2, []).
metarules(t/2, []).
positive_example(t/2, []).
negative_example(t/2, []).


% Another experiment file module
:-module(experiment_file_2, [background_knowledge/2
                            ,metarules/2
                            ,positive_example/2
                            ,negative_example/2
                            ]).

background_knowledge(p/2, []).
metarules(p/2, []).
positive_example(p/2, []).
negative_example(p/2, []).

Below is a session showing the intended use of the scheme above on the command line. In practice, load_experiment_file/1 would be called by the :-reload. directive in the configuration file. This is not shown here.

% Load the named experiment file.
?- load_experiment_file(experiment_file_1).
true.

% Inspect the definitions of experiment file interface predicates in the database.
?- listing([background_knowledge/2,metarules/2,positive_example/2,negative_example/2]).
experiment_file:background_knowledge(t/2, []).

experiment_file:metarules(t/2, []).

experiment_file:positive_example(t/2, []).

experiment_file:negative_example(t/2, []).

true.

% The database has the definitions in experiment_file_1, the loaded experiment file.
% Now, load a new experiment file:
?- load_experiment_file(experiment_file_2).
true.

% Inspect the definitions of interface predicates in the database.
?- listing([background_knowledge/2,metarules/2,positive_example/2,negative_example/2]).
experiment_file:background_knowledge(p/2, []).

experiment_file:metarules(p/2, []).

experiment_file:positive_example(p/2, []).

experiment_file:negative_example(p/2, []).

true.

% The definitions in the new experiment file are in the database.
% Load the first experiment file again, to ensure there are no errors after multiple reloads/unloads:
?- load_experiment_file(experiment_file_1).
true.

?- listing([background_knowledge/2,metarules/2,positive_example/2,negative_example/2]).
experiment_file:background_knowledge(t/2, []).

experiment_file:metarules(t/2, []).

experiment_file:positive_example(t/2, []).

experiment_file:negative_example(t/2, []).

true.

% And again
?- load_experiment_file(experiment_file_2).
true.

?- listing([background_knowledge/2,metarules/2,positive_example/2,negative_example/2]).
experiment_file:background_knowledge(p/2, []).

experiment_file:metarules(p/2, []).

experiment_file:positive_example(p/2, []).

experiment_file:negative_example(p/2, []).

true.

What I like in this scheme is that every query to the interface predicates can go through the prefix experiment_file: rather than the actual name of the module, so it’s no longer necessary to have the name of the module as an argument to the experiment_file/2 configuration option.

It’s also good to see that changing experiment files does not leave behind “garbage”, i.e. the definitions of interface predicates from a previously loaded experiment file. The following query assumes experiment_file_1 is currently loaded [edit: actually, experiment_file_2 is currently loaded!]:

?- listing(experiment_file_1:P).
true.

?- listing(experiment_file_2:P).
true.

?- listing(experiment_file:P).

negative_example(p/2, []).

positive_example(p/2, []).

metarules(p/2, []).

background_knowledge(p/2, []).
true.

Like I say, I don’t know 100% that this will work with the rest of the code in Louise and it will certainly take some wrangling to fit it in. That’s fine. But do you think the above can work and be stable across versions?

More to the point, is the example clear and simple, or would you like me to try and make a better one? I have a great talent for confusing everyone I talk to… :slight_smile:

Edit: I’ve uploaded the three module files described above, for your convenience.

load_experiment_file.pl (235 Bytes)
experiment_file_1.pl (330 Bytes)
experiment_file_2.pl (330 Bytes)

Oh, about this. Louise has a number of different sub-systems that learn programs in different ways. The learned programs are local to the modules that implement the different sub-systems. When learning is completed, they are output to the command line (or passed to another sub-system for further processing; this is usually done by stringing together calls to sub-systems in a query at the command line, or in an experiment script).

In some cases, an already-learned program is loaded into a temporary module, usually called program:. For example, this is done when a learned program needs to be evaluated for predictive accuracy etc. I was thinking of making this the standard in the rest of the project, so that a learned program is always added to the program: module. I have not implemented this yet.
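For what it’s worth, loading a learned program into a throw-away module might look like the sketch below, using in_temporary_module/3. This assumes Setup and Goal run with the temporary module as context module; Clauses and Goal stand in for the learned program and the evaluation query:

```prolog
% Run Goal against a learned program inside a temporary module, so
% nothing else in the system can come to depend on the program.
evaluate_learned(Clauses, Goal) :-
    in_temporary_module(
        _Module,
        maplist(assertz, Clauses),   % setup: assert the learned clauses
        Goal).                       % Goal runs inside the module
```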

If I understand correctly, you are saying that nothing should import from a temporary module for example with the use_module/2 or import/1 mechanisms? I think nothing needs to do that. Other modules only need to access the clauses in the experiment_file module, or the program module, for example to call them or transform them in various ways etc, but I don’t think they’re needed as actual imports.

In fact, I think that importing the experiment file interface predicates would break the separation of concerns I’m trying to achieve with keeping the training data (represented by those predicates) in its own module. I want the training data to be as isolated as possible from the rest of the system, otherwise I can’t guarantee the integrity of experiments that use it.

It could be. I didn’t have the courage to look carefully into the code defining the module system. What a coward, eh? :slight_smile:

It was possible to abolish those predicates without declaring them dynamic. I’m always a little worried about using retractall/1 or retract/1, because if something goes wrong and the process stops mid-way, the dynamic database is left in an unknown state. More recently, I’ve made it a rule to wrap every call to database manipulation predicates in setup_call_cleanup/3, which makes it a little better. Actually, a lot better!
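The setup_call_cleanup/3 pattern might look like this (with_asserted/2 is a made-up helper name):

```prolog
% Run Goal with Fact temporarily asserted; the clause is erased again
% even if Goal throws, so the database cannot be left half-updated.
with_asserted(Fact, Goal) :-
    setup_call_cleanup(
        assertz(Fact, Ref),   % assertz/2 returns a clause reference
        Goal,
        erase(Ref)).
```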

To be honest, I don’t remember why the interface predicates are not declared dynamic in Louise. There must have been a good reason, I’m sure <_< Oh, but I think I did try it at some point and it didn’t seem to help me fix the errors I was getting after 8.2.1.

It should. You are using official interfaces close enough to their intent that we should call this “supported”. If you redefine a module, all static code from the old module is removed. As far as I recall, dynamic clauses that are loaded from the module are also removed. Asserted clauses are not.

I doubt I’d go for fixed module names. Keeping these dynamic might allow you to do things concurrently, which can make a huge difference. I don’t know enough of the overall design to give definite answers though …

Thanks! I’ll try to implement this scheme then and see how it works. I’ll let you know how I get on :slight_smile:

To be honest, I’m not sure whether that may be needed or not. There are some parts of the code that may well benefit from concurrent (as in parallel?) execution, but Louise is still at an early enough stage that such optimisation is not absolutely necessary. I’ll try to think about it, though, to avoid doing things in a way that makes it harder in the future. Thanks for pointing this out!


Just for the record, Louise is at GitHub - stassa/louise: Polynomial-time Meta-Interpretive Learning. Do you have a comparison to what is probably the most well-known ILP system Aleph?

Hi,

Just trying to get my head to kickstart thinking about ILP …

What is actually the need that such a program-learning system covers?

Why is it important to convert facts into rules?

Dan

Yes, that’s where Louise is. No comparison with Aleph has been attempted yet, as
far as I know.

Why not? The short answer is that a) Louise is sufficiently different to Aleph
that a direct comparison is not meaningful and b) I don’t like the idea of an
author comparing their own system to someone else’s. The latter practice is
common in machine learning, but I am never convinced when someone reports their
system beat every other system on some dataset. I hope it’s obvious why I’m
skeptical of such results: I can fine-tune Louise to death and “accidentally”
run Aleph with the worst possible defaults. What have we learned? That I know my
own system better than Aleph. Where is the interesting scientific result in
that? On the other hand, I would be happy to collaborate with Ashwin Srinivasan
(the author of Aleph) in a comparison where we would each be free to tweak our
own system as we thought best. It didn’t make sense to do that in the first
paper on Louise though.

On the other hand, in the Louise paper, Louise is compared to Metagol. That
makes more sense, because the creator of Metagol is the co-author of the Louise
paper. Besides, Metagol is the direct ancestor of Louise and Louise was proposed
specifically as an improvement of Metagol, so it really didn’t make sense not to
compare to Metagol. I was still not happy to have a comparison, though - I don’t
like it when machine learning research turns to a race to beat benchmarks, as it
usually does. The only thing that comes out of that is that promising research
directions are buried.

As to how Louise (and Metagol!) is different to Aleph: they belong to a new
approach to ILP, called Meta-Interpretive Learning (MIL), which is for the first
time capable of learning recursive theories and performing predicate invention
without any restrictions, such as the limitations of earlier systems with
similar capabilities.

(Predicate invention is the ability to learn definitions of predicates “missing”
from the background knowledge: predicates for which examples are not given and
that are necessary to complete a learning attempt).

As an example, Aleph is based on the Inverse Entailment principle that was shown
to be incapable of learning a recursive theory without an example of a base
case. MIL overcomes this limitation. Earlier systems other than Aleph were also
limited in the structure of recursive programs they could learn and also limited
in the kind of predicate invention they could perform. MIL really pulls out
all the stops in this respect.

On the other hand, Louise is still largely untested and barely out of prototype
stage. Aleph is battle-tested. And of course, despite its differences from
Louise and Metagol, it is still a very powerful approach.

Short answer: Louise is the new kid on the block and must still prove itself :slight_smile:
