Advice on implementing Dijkstra's algorithm

I’d like to try to create a reference implementation of Dijkstra’s algorithm – something that could be used generically by anyone – and I wondered if I could ask for some implementation advice.

I’ve already implemented it once – badly – here. It seems to me that I need something like a heap that can be updated very efficiently, because the crux of the algorithm is to repeatedly pick and expand the lowest-cost node in a graph walk. Yet as the graph is walked the same nodes may be revisited, so the heap somehow needs to be updated with the lowest path lengths. The current implementation in the heaps package does not support efficient updating.

Does anyone have a sense of how this might be done very efficiently in Prolog, or perhaps of an approach that is similarly efficient but more natural in Prolog?

I don’t mean this question to be specifically about AoC 2023 – it just made me realise that Dijkstra’s algorithm would be very useful to have as a standard tool, a bit like findall or aggregate_all, that can be used as a meta-predicate to find shortest paths. It could be backtrackable like setof, etc.

Dijkstra’s algorithm is usually explained in terms of a “priority queue” which supports three operations very efficiently: add_with_priority, decrease_priority and extract_min. The Fibonacci heap offers amortised constant time for add_with_priority and decrease_priority, and amortised O(log n) for extract_min. If we had a Fibonacci heap structure in SWI-Prolog, the Dijkstra algorithm implementation would be easy.
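To make the gap concrete, here is a rough sketch of that abstract API on top of SWI-Prolog’s library(heaps); the wrapper predicate names are just illustrative, not an existing library:

:- use_module(library(heaps)).

% Hypothetical wrappers mapping the textbook priority-queue API onto library(heaps).
add_with_priority(Q0, Key, Priority, Q) :-
    add_to_heap(Q0, Priority, Key, Q).

extract_min(Q0, Key, Priority, Q) :-
    get_from_heap(Q0, Priority, Key, Q).

% There is no efficient counterpart for decrease_priority: delete_from_heap/4
% exists, but it is not a constant-time decrease-key, which is exactly the gap.
decrease_priority(Q0, Key, NewPriority, Q) :-
    delete_from_heap(Q0, _OldPriority, Key, Q1),
    add_to_heap(Q1, NewPriority, Key, Q).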

In reference to the AoC code, your min-based tabling still has to visit the same node many times, because min presumably needs to consider all options before it can know which is the minimum… A properly implemented Dijkstra may therefore be orders of magnitude more efficient, because nodes are not revisited for different step numbers.

No, you can use any order at all. I left an example for you in the other thread.

Yeah, I just tried it using heaps to avoid append – it improves the runtime by about 50% (0.14 sec vs 0.2 sec), but it is still very wasteful in Prolog.

I could implement Fibonacci heaps in Prolog, but what bugs me is that in a sense it would not really be Fibonacci heaps, because the basic operations in Prolog code are not direct reads/writes to memory as they are in C – they are unification and resolution. If you profile the code above, 20%+ of the time is spent on hash-table-related operations, and I don’t immediately see how that can be if they have the O(1) time complexity they ought to.

I think wherever you need an algorithm to find the shortest path between A and B, Dijkstra will beat tabling, because the order in which nodes are visited means it does not need to exhaust all options in order to terminate. I do not think that tabling can be a general replacement for shortest-path algorithms, and there is value in having something like Dijkstra available to everyone as a meta-predicate in the style of findall.

I see now, though, that writing Dijkstra’s algorithm is tantamount to writing a Fibonacci heap implementation – that’s by far the hardest part, and it can be used for purposes beyond Dijkstra, so it’s probably the bigger contribution.

I’m not entirely sure what you’re arguing, but all I’m saying is that there are situations in which tabling cannot supplant what a shortest-path algorithm does, and therefore it is useful to have one. Further, I’m not saying that a shortest-path algorithm is somehow an alternative to tabling, since clearly it cannot be in any situation unrelated to shortest paths :slight_smile: . Horses for courses, as the British say.

Finally, I am noting that implementing Fibonacci heaps – rather than Dijkstra – is a better course of action because it enables more, including an efficient A* implementation amongst other things!

Hopefully that clarifies it.

In practice, Dijkstra’s algorithm is typically implemented without the decrease-key operation.

When you want to perform a decrease-key operation, you just insert the element again with the new priority. This leads to duplicate elements.

Therefore, when you pop an element from the heap, you just make sure to only process the element the first time it is encountered. You can keep track of already processed elements with a hash table, for example.
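For the record, a minimal sketch of that scheme in SWI-Prolog, assuming the graph is given by a weighted_edge(From, To, Cost) predicate (a name invented for illustration), with library(heaps) as the queue and a red-black tree as the visited set:

:- use_module(library(heaps)).
:- use_module(library(rbtrees)).
:- use_module(library(apply)).

% Shortest distance from Start to End over the hypothetical weighted_edge/3 relation.
lazy_dijkstra(Start, End, Distance) :-
    empty_heap(Heap0),
    add_to_heap(Heap0, 0, Start, Heap),
    rb_empty(Visited),
    lazy_dijkstra_(Heap, Visited, End, Distance).

lazy_dijkstra_(Heap0, Visited0, End, Distance) :-
    get_from_heap(Heap0, D, Node, Heap1),
    (   Node == End
    ->  Distance = D
    ;   rb_lookup(Node, _, Visited0)            % stale duplicate: drop it
    ->  lazy_dijkstra_(Heap1, Visited0, End, Distance)
    ;   rb_insert(Visited0, Node, visited, Visited),
        findall(Cost-Next, weighted_edge(Node, Next, Cost), Edges),
        foldl(relax(D), Edges, Heap1, Heap),
        lazy_dijkstra_(Heap, Visited, End, Distance)
    ).

% "Decrease-key" by re-insertion: just push the new candidate distance.
relax(D, Cost-Next, Heap0, Heap) :-
    D1 is D + Cost,
    add_to_heap(Heap0, D1, Next, Heap).

Correctness relies on the fact that the first time a node is popped its distance is already minimal, so any later (stale) entries for it can be discarded.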


Yeah, I did exactly that in my implementation linked above. It served fine for AoC, but it’s slow.

Coincidentally, when reading about the “longest path problem”: in the case of an acyclic graph, after topological ordering, “…longest path in the whole DAG may be obtained by starting at the vertex v with the largest recorded value, then repeatedly stepping backwards to its incoming neighbor with the largest recorded value, and reversing the sequence of vertices found in this way. This is equivalent to running the shortest-path algorithm on −G”.

Since the above is straightforward with tabling, I suspect it’s probably possible to do something equivalent to Dijkstra with tabling, and without the additional hassle, in the min-path case too.
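For completeness, the answer-subsumption version looks something like this in SWI-Prolog (edge/3 is an assumed, illustrative weighted edge relation); with non-negative weights the table reaches a fixpoint, but unlike Dijkstra it cannot stop early once the target node is settled:

% Mode-directed tabling: keep only the minimal cost per (From, To) pair.
:- table dist(_, _, min).

dist(From, To, Cost) :-
    edge(From, To, Cost).
dist(From, To, Cost) :-
    dist(From, Mid, Cost0),
    edge(Mid, To, StepCost),
    Cost is Cost0 + StepCost.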

The request for implementation advice makes me think first of how graphs are represented.

SWI-Prolog provides library(ugraphs) which is derived from SICStus library(ugraphs):

An unweighted directed graph (ugraph) is represented as a list of (vertex-neighbors) pairs,

It is generally of interest to consider shortest paths where lengths (positive weights) are assigned to edges, which the ugraph approach omits. A conveniently reusable Dijkstra implementation will need to provide for edge weights, and in doing so it might be attractive for other applications.
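For instance, one conceivable weighted variant of the S-representation (an assumption, not an existing library convention) pairs each neighbour with the weight of the connecting edge:

% a --1--> b, a --4--> c, b --1--> c, and c has no outgoing edges.
wgraph_example([a-[b-1, c-4], b-[c-1], c-[]]).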

Not AFAIK. The original author is claimed to be Richard O’Keefe. The SWI-Prolog library comes through YAP. I thought the library was part of the “DEC10 library”, but this seems not to be the case.

If a library is to be created, I propose implementing it as a meta-predicate, e.g.,

 shortest_path(:Edge, +Start, +End, -Path).

which would call, e.g.,

 call(Edge, +From, -To, -Distance).
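For example, with a hypothetical road/3 relation supplied as the Edge argument:

 % Hypothetical weighted edge relation: road(From, To, Distance).
 road(a, b, 1).
 road(a, c, 4).
 road(b, c, 1).
 road(c, d, 2).

 % Intended usage of the proposed (not yet existing) meta-predicate:
 % ?- shortest_path(road, a, d, Path).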

I may have inadequately cited the history. The YAP Manual Sec. 7.19 Unweighted Graphs says:

The following graph manipulation routines are based in code originally written by Richard
O’Keefe. The code was then extended to be compatible with the SICStus Prolog ugraphs
library.

The YAP manual goes on to further sections about representing directed vs. undirected graphs.

I had personally concluded that the best thing to do here is just implement Fibonacci heaps because it makes Dijkstra and other such algorithms easy to implement, and it is useful for other purposes.


A few years ago I took an implementation by Carlo and Eike SR. (https://copyprogramming.com/howto/dijkstras-shortest-path-algorithm, GitHub - eikesr/wgraphs) and created a wgraph module… I am still using it without any problem; if you need it, I can send it… yours: Imre

:- module(wgraphs,
          [ vertices_edges_to_wgraph/3,
            add_edges/3,
            error/2, reportError/1,
            symmetricError/2, reportSymmetricError/1,
            union/3, symmetric_closure/2, transpose/2,
            dijkstra/3, min_path/5
          ]).


Such a strange world…

It has been some days now that I have been wondering how I could retrieve that old code of mine from the haze of the past… I developed it after studying the code by Markus Triska that provides SCCs.

Anyway, I think @emiruz will provide something better.

Thanks, that is surprising… I automatically assume that if an algorithm has been mathematically shown to have certain big-O properties then there is an implementation which realises them, but apparently not in this case. Further, optimisations around arrays and the frequency of L1 cache misses seem to be the most important factors (in Fortran-based implementations). I’m not sure whether that implies anything at all for a Prolog-based implementation, though.

So I have spent a few days trying to make a clean implementation of Dijkstra’s algorithm using red-black trees and a heap.
The implementation is very similar to @emiruz’s version, except that I use a single red-black tree to store visited and unvisited nodes, the distance to the start node, and the predecessor of each node.

As @emiruz has pointed out, the runtime is mainly dominated by the red-black tree lookups.

After some time optimizing the algorithm, I managed to run it on graphs with 1M nodes and edges in a few seconds.
I thought the speed would be somewhat comparable to other implementations in dynamic languages like NetworkX in Python… Well, not even close :frowning:
NetworkX manages to find the path almost instantly (~100 ms) vs 4 or 5 seconds in my Prolog implementation.

If anyone has advice on how to improve the code, I would be very interested to hear your thoughts!

Here is an inline copy of the important parts:

%! update_distances(+Start, +End, +NewDistance, +StateIn, -StateOut) is det.
%
%  Update the tree and heap with the new distance from Start to End if the new
%  distance is smaller than the current distance.
% 
%  A node missing from the tree means that it currently has an infinite
%  distance to the start node.
%
%  We only consider unvisited nodes since visited nodes have already found their
%  smallest distance to the start node.
%
%  Since the minimum priority heap implementation does not implement a decrease
%  priority method, we just unconditionally add the new priority to the heap.
%  This quirk is handled in `get_unvisited_min_node/5` by only retrieving unvisited
%  nodes.
%
update_distances(Start, End, NewDistance, TreeIn-HeapIn, TreeOut-HeapOut) :-
   (  rb_lookup(End, Val, TreeIn)
   -> (  Val = unvisited(CurrentDistance, _),
         NewDistance < CurrentDistance
      -> rb_update(TreeIn, End, unvisited(NewDistance, Start), TreeOut),
         add_to_heap(HeapIn, NewDistance, End, HeapOut)
      ;  TreeIn = TreeOut,
         HeapIn = HeapOut
      )
   ;  rb_insert(TreeIn, End, unvisited(NewDistance, Start), TreeOut),
      add_to_heap(HeapIn, NewDistance, End, HeapOut)
   ).

%! get_unvisited_min_node(+Tree, +HeapIn, -Priority, -Key, -HeapOut) is det.
%
%  Retrieve the unvisited node with the minimum distance to the start node.
%
%  Since nodes can be present multiple times with different priorities, we need to
%  check that the minimum node we obtain is unvisited.
%  An unvisited node is either absent from the tree or present with a value
%  `unvisited(_, _)`.
%  If it was visited, we unconditionally remove it from the heap and get the next
%  minimum node.
%
get_unvisited_min_node(Tree, HeapIn, P, K, HeapOut) :-
   get_from_heap(HeapIn, PTmp, KTmp, HeapTmp),
   (  rb_lookup(KTmp, Val, Tree)
   -> (  Val = unvisited(_, _)
      -> HeapTmp = HeapOut,
         P = PTmp, K = KTmp
      ;  get_unvisited_min_node(Tree, HeapTmp, P, K, HeapOut)
      )
   ;  HeapTmp = HeapOut,
      P = PTmp, K = KTmp
   ).

%! dijkstra(+Graph, +Edge, +Start, +End, -Path) is det.
%
%  Compute the shortest Path from Start to End using Dijkstra's algorithm.
%  Graph should be the S-representation of a graph as given by `library(ugraphs)`.
%
%  Implementation details:
%
%  This implementation uses a red black tree for accessing and updating distances
%  to the start node, the previous node and the fact that the node was already
%  visited or not by using the following ideas:
%
%  * non-existing nodes are unvisited nodes with infinite distances to the start node
%  * existing nodes with the `unvisited(Distance, Previous)` values are unvisited nodes
%  * existing nodes with the `visited(Previous)` values are visited nodes
%
%  A minimum heap from `library(heaps)` is used to efficiently retrieve the node with
%  the current minimum distance to the start node.
%
%  The S-representation of the graph is converted to a red black tree to efficiently get
%  the neighbours of a node.
%
dijkstra(Graph, Edge, Start, End, Path) :-
   ord_list_to_rbtree(Graph, GraphTree),
   empty_heap(Heap),
   list_to_rbtree([Start-unvisited(0, no)], TreeIn),
   dijkstra_(GraphTree, Edge, Start, End, 0, TreeIn, TreeOut, Heap),
   once(foldl(build_path(TreeOut), ReversePath, End, Start)),
   reverse([End | ReversePath], Path).

dijkstra_(Graph, Edge, Start, End, CurrentDistance, TreeIn, TreeOut, Heap) :-
   rb_lookup(Start, Neighbours, Graph),
   maplist(call(Edge, Start), Neighbours, NeighboursWeights),
   maplist(add(CurrentDistance), NeighboursWeights, NeighboursDistances),
   foldl(update_distances(Start),
         Neighbours, NeighboursDistances,
         TreeIn-Heap, Tree2-Heap2),
   rb_update(Tree2, Start, unvisited(_, StartPrev), visited(StartPrev), Tree3),
   get_unvisited_min_node(Tree3, Heap2, NextDistance, NextNode, Heap3),
   (  NextNode == End
   -> rb_update(Tree3, End, unvisited(_, EndPrev), visited(EndPrev), TreeOut)
   ;  dijkstra_(Graph, Edge, NextNode, End, NextDistance, Tree3, TreeOut, Heap3)
   ).
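In case it is useful, a hypothetical way to exercise this (edge_weight/3 is an invented name, and build_path/4 is the helper omitted from the excerpt):

% Hypothetical weight relation matching call(Edge, From, To, Distance).
edge_weight(a, b, 1).
edge_weight(a, c, 4).
edge_weight(b, c, 1).
edge_weight(c, d, 2).

% ?- use_module(library(ugraphs)),
%    vertices_edges_to_ugraph([], [a-b, a-c, b-c, c-d], G),
%    dijkstra(G, edge_weight, a, d, Path).
% Should give Path = [a, b, c, d].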

I feel that Dijkstra and the related data structures are inherently imperative and will just be much slower in Prolog than in imperative languages due to the overhead. I suspect the heaps, hash tables, and other structures used by NetworkX are probably written in C even if the shortest-path algorithm itself isn’t. It may also be worth checking what algorithm NetworkX actually uses for shortest paths to make sure you are comparing like for like.

I think the approach may need to be rethought from scratch using dynamic programming so that it elegantly fits a unification + tabling approach. I’ve had the opportunity to solve 25 AoC puzzles just recently, and by far the most difficult and least satisfying in Prolog were those which I solved using traditional imperative algorithms.


I believe the Python implementation of the Dijkstra algorithm in NetworkX is here.
NetworkX only uses native Python data structures, although you could argue that Python dicts are implemented in C ^^

In my opinion, there is nothing inherently imperative about the Dijkstra algorithm. As we both found out, the bottleneck of the algorithm has to do with the indexing speed of a large map.
And indexing is no more imperative than declarative, although it has always been a weak point of Prolog.


I guess I’d call it inherently imperative because the computational model is a separate data structure plus incremental updates to it, rather than – for example – something more closely resembling a “this is what I want the result to be” representation, which is where Prolog is most beautiful.

Maybe another way of putting it is that – it seems to me – Prolog excels where problems can be stated in terms of a description of consequences rather than a description of computational procedure. At least to me that summarises the “Prolog state of mind” in which I feel at one with the language as it was “meant to be used”.