After some times optimizing the algorithm, I manage to run the algorithm on graphs with 1M nodes and edges in a few seconds.
I thought the speed would be somewhat comparable to other implementations in dynamic languages like networkx in python… Well, not even close
NetworkX manage to find the path instantly ~100ms vs 4 or 5 seconds in my prolog implementation.
If anyone has advice on how to improve the code, I would be very interested to hear your thoughts !
dijkstra
%! update_distances(+Start, +End, +NewDistance, +StateIn, -StateOut) is det.
%
% Update the tree and heap with the new distance from Start to End if the new
% distance is smaller than the current distance.
%
% Non existing node in the tree means that the node has currently an infinite
% distance to the start node.
%
% We only consider unvisited nodes since visited nodes have already found their
% smallest distance to the start node.
%
% Since the minimum priority heap implementation does not implement a decrease
% priority method, we just unconditionally add the new priority to the heap.
% This quirk is handled in `get_unvisited_min_node/5` by only retrieving unvisited
% nodes.
%
update_distances(Start, End, NewDistance, TreeIn-HeapIn, TreeOut-HeapOut) :-
( rb_lookup(End, Val, TreeIn)
-> ( Val = unvisited(CurrentDistance, _),
NewDistance < CurrentDistance
-> rb_update(TreeIn, End, unvisited(NewDistance, Start), TreeOut),
add_to_heap(HeapIn, NewDistance, End, HeapOut)
; TreeIn = TreeOut,
HeapIn = HeapOut
)
; rb_insert(TreeIn, End, unvisited(NewDistance, Start), TreeOut),
add_to_heap(HeapIn, NewDistance, End, HeapOut)
).
%! get_unvisited_min_node(+Tree, +HeapIn, +Priority, +Key, -HeapOut) is det.
%
% Retrieve the unvisited node with the minimum distance to the start node.
%
% Since nodes can be present multiple times with different priorities, we need to
% check that the minimum node we obtain is unvisited.
% An unvisited node is either node in the tree or present with a value
% `unvisited(_, _)`.
% If it was visited, we unconditionally remove it from the heap and get the next
% minimum node.
%
get_unvisited_min_node(Tree, HeapIn, P, K, HeapOut) :-
get_from_heap(HeapIn, PTmp, KTmp, HeapTmp),
( rb_lookup(KTmp, Val, Tree)
-> ( Val = unvisited(_, _)
-> HeapTmp = HeapOut,
P = PTmp, K = KTmp
; get_unvisited_min_node(Tree, HeapTmp, P, K, HeapOut)
)
; HeapTmp = HeapOut,
P = PTmp, K = KTmp
).
%! dijkstra(+Graph, +Edge, +Start, +End, -Path) is det.
%
% Compute the shortest Path from Start to End using the dijkstra algorithm.
% Graph should be the S-representation of a graph as given by the `library(ugraph)`.
%
% Implementation details:
%
% This implementation uses a red black tree for accessing and updating distances
% to the start node, the previous node and the fact that the node was already
% visited or not by using the following ideas:
%
% * non-existing nodes are unvisited nodes with infinite distances to the start node
% * existing nodes with the `unvisited(Distance, Previous)` values are unvisited nodes
% * existing nodes with the `visited(Previous)` values are visited nodes
%
% A minimum heap from `library(heaps)` is used to efficiently retrieve the node with
% the current minimum distance to the start node.
%
% The S-representation of the graph is converted to a red black tree to efficiently get
% the neighbours of a node.
%
dijkstra(Graph, Edge, Start, End, Path) :-
ord_list_to_rbtree(Graph, GraphTree),
empty_heap(Heap),
list_to_rbtree([Start-unvisited(0, no)], TreeIn),
dijkstra_(GraphTree, Edge, Start, End, 0, TreeIn, TreeOut, Heap),
once(foldl(build_path(TreeOut), ReversePath, End, Start)),
reverse([End | ReversePath], Path).
dijkstra_(Graph, Edge, Start, End, CurrentDistance, TreeIn, TreeOut, Heap) :-
rb_lookup(Start, Neighbours, Graph),
maplist(call(Edge, Start), Neighbours, NeighboursWeights),
maplist(add(CurrentDistance), NeighboursWeights, NeighboursDistances),
foldl(update_distances(Start),
Neighbours, NeighboursDistances,
TreeIn-Heap, Tree2-Heap2),
rb_update(Tree2, Start, unvisited(_, StartPrev), visited(StartPrev), Tree3),
get_unvisited_min_node(Tree3, Heap2, NextDistance, NextNode, Heap3),
( NextNode == End
-> rb_update(Tree3, End, unvisited(_, EndPrev), visited(EndPrev), TreeOut)
; dijkstra_(Graph, Edge, NextNode, End, NextDistance, Tree3, TreeOut, Heap3)
).