Hi @kwon-young . First I want to apologise to @mathmaster13 for hijacking this thread. The moderators may want to split this whole conversation to a new topic, but that would already be a lot of work to do, so my apologies to them too.
The tl;dr
Now, there is a lot to say here, so I’ll cut to the chase. You ask whether state-of-the-art ILP systems are able to induce the correct rules for parallelism. The answer is yes.
The way to do this is to add to the background knowledge (remember, in ILP we have two kinds of data: examples and background knowledge) suitable predicates that operate on floating point numbers. For example, you could add predicates wrapping the arithmetic functions evaluated by is/2, or hand-code your own arithmetic predicates; whatever best suits your needs.
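For instance, for the parallelism problem itself, a background predicate along these lines could do. This is my own hypothetical sketch, not code from any system: the representation of a line segment as a pair of points X/Y-X/Y and the tolerance value are arbitrary choices, the latter there to absorb floating point rounding error:

```prolog
%! parallel(+Segment1,+Segment2) is semidet.
%
% True when two line segments, each given as a pair of points
% X1/Y1-X2/Y2, are parallel: the cross product of their direction
% vectors is (near) zero. The 1.0e-9 tolerance is an arbitrary
% choice to absorb floating point rounding error.
%
parallel(A1/B1-A2/B2, C1/D1-C2/D2):-
	Cross is (A2 - A1) * (D2 - D1) - (B2 - B1) * (C2 - C1)
	,abs(Cross) < 1.0e-9.
```

With that in the background knowledge, a goal like `parallel(0.0/0.0-1.0/1.0, 2.0/0.0-3.0/1.0)` succeeds, while segments with different slopes fail the check.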
The longer example
To give you an example of how to do that from my own work, below is part of the background knowledge I used in a recent project.
The project’s goal was to create an AI agent to guide a robot boat navigating on the surface of a body of water. I wanted a way to constrain the agent’s movements to only those changes of direction that take it closer to its destination (a heuristic to avoid trying out unnecessary move actions). The agent is given a map of the area, and its position on the map is represented as a term position(X,Y,Theta), where all three of X, Y and Theta are floating point numbers. (Theta turned out not to be needed, so you’ll see it’s ignored in the code.) Because coordinates are floating point numbers, predicates that operate on them must be able to manipulate floating point numbers.
Here’s how this works. First we have the predicate moving_forward/2 that checks that the agent’s next move reduces the (Euclidean) distance between the agent’s current position and its destination:
%! moving_forward(+State1,+State2) is det.
%
% Ensure the agent is moving towards its destination.
%
% True when the distance of the agent's position in State2 from
% its destination is less than the distance of the agent's
% position in State1 from its destination.
%
% Used as move constraint. Enabled by setting option
% move_constraint(moving_forward/2) in model configuration.
%
moving_forward(S1,S2):-
	inspect(position,position(X1,Y1,_),S1)
	,inspect(position,position(X2,Y2,_),S2)
	,inspect(destination,destination(Xd,Yd,_),S1)
	,maplist(term_value,[X1,Y1,X2,Y2,Xd,Yd],[X1_v,Y1_v,X2_v,Y2_v,Xd_v,Yd_v])
	,magnitude(Xd_v/Yd_v,X1_v/Y1_v,M1)
	,magnitude(Xd_v/Yd_v,X2_v/Y2_v,M2)
	,M2 < M1.
The distance between the two positions is calculated by magnitude/3 as the magnitude of the vector between the two pairs of coordinates, as follows:
%! magnitude(+C1,+C2,-Magnitude) is det.
%
% Magnitude of a vector between two sets of coordinates.
%
% Example:
% ==
% ?- actions:magnitude(0/0,3/4,M).
% M = 5.0.
% ==
%
magnitude(X1/Y1,X2/Y2,M):-
	Dx is X2 - X1
	,Dy is Y2 - Y1
	,M is sqrt(Dx^2 + Dy^2).
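As a quick sanity check of the M2 < M1 test in moving_forward/2, with made-up coordinates: a one-step move from (30,60) to (31,61), with the destination at (70,80), reduces the distance to the destination, so the following goal succeeds at the top level:

```prolog
% Distance to destination before the move is about 44.72 and after
% the move about 43.38, so the move passes the constraint:
?- magnitude(70.0/80.0, 30.0/60.0, M1),
   magnitude(70.0/80.0, 31.0/61.0, M2),
   M2 < M1.
```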
moving_forward/2 is one of several constraints that can be imposed on the agent’s moves. These are selected before learning, with a configuration option, move_constraint/1:
%! move_constraint(?Predicate) is semidet.
%
% Predicate indicator, Symbol/arity of a move constraint.
%
% Used to choose the constraints that will apply to move actions.
%
% Known constraints:
%
% * within_action_budget/2: constrains the length of the list of
% actions in the state plan term in the state vector to the value
% of B in list_budget(action,B).
%
% * in_scan_area/2: rejects moves that end with the boat outside
% the limits defined in the scan area term in the state vector.
%
% * passable/2: rejects moves that end in unpassable map terrain
% as read off the current occlusion grid map.
%
% * moving_forward/2: ensures the boat keeps moving towards the
% coordinates of the destination term in the state vector.
%
move_constraint(within_action_budget/2).
move_constraint(in_scan_area/2).
move_constraint(passable/2).
move_constraint(moving_forward/2).
Each of those constraints is checked by the predicate move_constraints/2 before a move is taken:
%! move_constraints(+State1,+State2) is det.
%
% Check move constraints in current State1 and next State2.
%
% State1, State2 are state vectors. State1 is the state vector of
% the state at the start of a move and State2 is the state vector
% at the end of a move.
%
% This predicate goes through all constraints defined in the model
% configuration option move_constraint/1 and calls each of them.
%
% Constraints must be defined as dyadic predicates. State1 and
% State2 will be passed to each such predicate.
%
% If any of those constraints fails, a move action transitioning
% from State1 to State2 will not be taken.
%
move_constraints(S1,S2):-
	forall(model_configuration:move_constraint(C/2)
	      ,(T =.. [C,S1,S2]
	       ,call(T)
	       )
	      ).
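Because move_constraints/2 simply calls every registered constraint in turn, plugging in a new one only takes a dyadic predicate over two state vectors, plus a move_constraint/1 clause. Here is a hypothetical example; not_stalled/2 is made up for illustration and assumes the inspect/3 accessor used in moving_forward/2 above:

```prolog
%! not_stalled(+State1,+State2) is semidet.
%
% Hypothetical constraint: reject "moves" that leave the agent's
% position unchanged between State1 and State2.
%
not_stalled(S1,S2):-
	inspect(position,position(X1,Y1,_),S1)
	,inspect(position,position(X2,Y2,_),S2)
	,(   X1 \== X2
	 ;   Y1 \== Y2
	 ).

% Register it so move_constraints/2 picks it up:
move_constraint(not_stalled/2).
```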
move_constraints/2 is then used to define background predicates that represent the agent’s movement from one point on the map to another, in one of eight compass directions. You can see the calls to move_constraints/2 in the listing of the first-order background knowledge, below:
?- list_mil_problem(move/2).
Positive examples
-----------------
move([destination(x(70.0),y(80.0),theta(0.0)),plan(actions([]),count(0)),scan_area(x_min(0),x_max(249),y_min(46),y_max(136)),position(x(30.0),y(60.0),theta(0.0))],[destination(x(70.0),y(80.0),theta(0.0)),plan(A,B),scan_area(x_min(0),x_max(249),y_min(46),y_max(136)),position(x(70.0),y(80.0),theta(0.0))]).
Negative examples
-----------------
Background knowledge (First Order)
----------------------------------
move_north/2:
move_north(A,B):-modify(position,increment,y,A,C),move_constraints(A,C),inspect(position,position(D,E,F),C),register_action(C,move,move_north,[D,E],B),debug(actions,'Moved North ~w',[position(D,E,F)]).
move_north_east/2:
move_north_east(A,B):-modify(position,set,theta(0.0),A,C),modify(position,increment,x,C,D),modify(position,increment,y,D,E),move_constraints(A,E),inspect(position,position(F,G,H),E),register_action(E,move,move_north_east,[F,G],B),debug(actions,'Moved North East ~w',[position(F,G,H)]).
move_east/2:
move_east(A,B):-modify(position,set,theta(0.0),A,C),modify(position,increment,x,C,D),move_constraints(A,D),inspect(position,position(E,F,G),D),register_action(D,move,move_east,[E,F],B),debug(actions,'Moved East ~w',[position(E,F,G)]).
move_south_east/2:
move_south_east(A,B):-modify(position,set,theta(0.0),A,C),modify(position,decrement,y,C,D),modify(position,increment,x,D,E),move_constraints(A,E),inspect(position,position(F,G,H),E),register_action(E,move,move_south_east,[F,G],B),debug(actions,'Moved South East ~w',[position(F,G,H)]).
move_south/2:
move_south(A,B):-modify(position,set,theta(0.0),A,C),modify(position,decrement,y,C,D),move_constraints(A,D),inspect(position,position(E,F,G),D),register_action(D,move,move_south,[E,F],B),debug(actions,'Moved South ~w',[position(E,F,G)]).
move_south_west/2:
move_south_west(A,B):-modify(position,set,theta(0.0),A,C),modify(position,decrement,y,C,D),modify(position,decrement,x,D,E),move_constraints(A,E),inspect(position,position(F,G,H),E),register_action(E,move,move_south_west,[F,G],B),debug(actions,'Moved South West ~w',[position(F,G,H)]).
move_west/2:
move_west(A,B):-modify(position,set,theta(0.0),A,C),modify(position,decrement,x,C,D),move_constraints(A,D),inspect(position,position(E,F,G),D),register_action(D,move,move_west,[E,F],B),debug(actions,'Moved West ~w',[position(E,F,G)]).
move_north_west/2:
move_north_west(A,B):-modify(position,set,theta(0.0),A,C),modify(position,increment,y,C,D),modify(position,decrement,x,D,E),move_constraints(A,E),inspect(position,position(F,G,H),E),register_action(E,move,move_north_west,[F,G],B),debug(actions,'Moved North West ~w',[position(F,G,H)]).
Background knowledge(Second Order)
----------------------------------
(Chain) ∃.P,Q,R ∀.x,y,z: P(x,y)← Q(x,z),R(z,y)
(Identity) ∃.P,Q ∀.x,y: P(x,y)← Q(x,y)
Metasubstitution constraints
----------------------------
:- dynamic configuration:metarule_constraints/2.
:- multifile configuration:metarule_constraints/2.
configuration:metarule_constraints(m(identity, P, P), fail).
configuration:metarule_constraints(m(chain, P, P, _), fail).
configuration:metarule_constraints(m(chain, _, P, P), fail).
true.
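A quick note on the metasubstitution constraints in that listing, in case they look cryptic: each one forces the learner to fail on a metasubstitution that would produce a degenerate clause. Roughly, with hypothetical clauses for illustration:

```prolog
% m(identity,P,P) blocks tautologies:
%   move(A,B):- move(A,B).
%
% m(chain,P,P,_) blocks left-recursive chains:
%   move(A,B):- move(A,C), move_east(C,B).
%
% m(chain,_,P,P) blocks chains of a predicate with itself:
%   move(A,B):- move_east(A,C), move_east(C,B).
```

You can verify in the learned program further down that no clause of any of these three shapes appears.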
The single, positive training example is the start and end state of a move from a starting position to a destination position. The learned program is a strategy to get from the starting position to the destination. Below I list one example of such a learned program. While it looks like the agent is moving all over the place, if you squint a bit you’ll see that its movement has a bias towards the north-east. That’s because the destination in the training example is to the north-east of the starting position:
?- time( learn(move/2,_Ps) ), print_clauses(_Ps), length(_Ps,N).
% 101,396,427 inferences, 9.250 CPU in 35.975 seconds (26% CPU, 10961776 Lips)
move(A,B):-move_east(A,B).
move(A,B):-move_north(A,B).
move(A,B):-move_north_east(A,B).
move(A,B):-move_north_west(A,B).
move(A,B):-move_south(A,B).
move(A,B):-move_south_east(A,B).
move(A,B):-move_east(A,C),move(C,B).
move(A,B):-move_east(A,C),move_north(C,B).
move(A,B):-move_east(A,C),move_north_east(C,B).
move(A,B):-move_east(A,C),move_south(C,B).
move(A,B):-move_east(A,C),move_south_east(C,B).
move(A,B):-move_north(A,C),move(C,B).
move(A,B):-move_north(A,C),move_east(C,B).
move(A,B):-move_north(A,C),move_north_east(C,B).
move(A,B):-move_north(A,C),move_west(C,B).
move(A,B):-move_north_east(A,C),move(C,B).
move(A,B):-move_north_east(A,C),move_east(C,B).
move(A,B):-move_north_east(A,C),move_north(C,B).
move(A,B):-move_north_east(A,C),move_north_west(C,B).
move(A,B):-move_north_east(A,C),move_south_east(C,B).
move(A,B):-move_north_west(A,C),move(C,B).
move(A,B):-move_north_west(A,C),move_north_east(C,B).
move(A,B):-move_south(A,C),move(C,B).
move(A,B):-move_south(A,C),move_east(C,B).
move(A,B):-move_south_east(A,C),move(C,B).
move(A,B):-move_south_east(A,C),move_east(C,B).
move(A,B):-move_south_east(A,C),move_north_east(C,B).
move(A,B):-move_west(A,C),move_north(C,B).
N = 28.
So the way this all comes together is that, during learning, the agent tries different moves and puts together a strategy to arrive at its goal, but it’s constrained to only try moves that reduce the distance to its goal. That’s a classic distance-based navigation heuristic.
I hope this gives you an idea of the kind of background knowledge you would have to define if you wanted to deal with floating point numbers, particularly in the context of geometry.
All of the output above is from Louise, the MIL system I used in that project, but the same approach would apply with any recent ILP system; for example, I believe you could do the same with Popper.
Further discussion
Now, I would like to add something, because all this might look like too much work. And it is! Especially if all you’re trying to do is to re-derive a well-known result, like the rules of parallelism.
Unfortunately, most of ILP presupposes a great deal of knowledge engineering to construct a suitable background knowledge base. In the worst case it looks like we are hand-coding the solution, giving the learning system exactly the background knowledge it needs to solve a problem, no more and no less. That’s a fair criticism, and a lot of work in ILP is subject to it.
The challenge then is to find ways to learn without having to manually define all this background knowledge. In my latest work, which I linked above, I show how to cast the ILP learning problem as a grammar learning problem, which basically means that we are trying to learn an automaton of some sort.
So, nowadays, if I were trying to learn the rules for parallelism, I would not think of geometry and floating point arithmetic as such, but would instead try to answer the question:
What kind of automaton accepts or generates only parallel lines?
Then I would try to come up with examples of the output of such an automaton and give them to my system, to re-construct that automaton from the examples. In that case, the First-Order background knowledge would only need the pre-terminals in the target language; essentially, those would be the states of the target automaton. As for the Second-Order background knowledge, it now becomes a Second-Order Definite Normal Form, which is exactly what it sounds like: a normal form expressed as a set of second-order definite (datalog) clauses. So, for instance, we could use a second-order normal form for context-free grammars, like a second-order version of Chomsky Normal Form; or a second-order normal form for context-sensitive grammars, like a second-order version of Kuroda Normal Form.
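To make the normal-form idea a bit more concrete, here is my own informal sketch; the exact formulation is in the linked work. In a DCG-style encoding, a Chomsky Normal Form production becomes a definite clause over difference lists, and existentially quantifying over the predicate symbols gives the second-order version, which is essentially the Chain metarule plus a pre-terminal metarule:

```prolog
% First-order CNF productions as definite clauses over difference
% lists Xs, Zs, Ys:
%   A -> B C   becomes   a(Xs,Ys):- b(Xs,Zs), c(Zs,Ys).
%   A -> t     becomes   a([t|Xs],Xs).
%
% Quantifying existentially over the predicate symbols gives a
% second-order CNF, in the notation of the metarule listing above:
%   (Chain)        ∃.P,Q,R ∀.x,y,z: P(x,y)← Q(x,z),R(z,y)
%   (Pre-terminal) ∃.P,Q   ∀.x,y:   P(x,y)← Q(x,y)
```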
This is a more advanced method that is still under construction, so to speak, but I am much more excited about it than the traditional, manual way of hand-crafting background knowledge I showed above. However, if this is your first time working with ILP, I think you’d find the traditional way easier to reason about.