"All that is gold does not glitter, Not all those who wander are lost…" — J. R. R. Tolkien
Below is an implementation of a graph structure and breadth-first search (BFS) algorithm that allows each vertex or node to contain any payload you like – a map, a string, whatever.
- Why a graph algorithm?
- The Graph and the Vertex
- Fundamental Operations
- The Search they call Breadth First
- Appendix: Creating a Vertex Explorer with async.core
Why a dang graph algorithm?
Graph traversal algorithms like Breadth-First Search and its close cousin, Depth-First Search, open up fast solutions to some interesting problems including:
- Graph coloring, used in drawing maps, creating compilers, and playing Sudoku
- Finding articulation vertices, which are likely points of failure in graphs such as telephone networks, internet networks, and more
- Making your way out of a maze by the shortest path possible
After starting a new job working with Python and JavaScript, I felt the itch to learn something in the functional tradition. Eventually I settled on Clojure, a dialect of the venerable Lisp. Working with algorithms and data structures is one of the best ways to learn a new language, so I translated a breadth-first search algorithm into Clojure, and in the process found some interesting differences between the Lispey code and the classic imperative style.
Much credit is due to the free book Problem Solving with Algorithms and Data Structures using Python, which provided the basis for the graph data structure implementation, and The Algorithm Design Manual by Steven Skiena, which inspired the BFS algorithm used here. I recommend both to anyone who'd like to beef up their algorithmic chops.
The Graph and the Vertex
First off, let's define our data structures. Each Vertex has a key, a value, and a list of connected vertices. Each Graph is just a collection of vertices.
(defrecord Graph [vertices])
(defrecord Vertex
[key value connections parent])
The Graph
type contains just one morsel of data:
vertices
: a Clojure map object, where each key is aVertex
object's key, and each value is theVertex
object itself.
The Vertex
has a few more attributes:
key
: a unique identifier representing this vertexvalue
: an arbitrary payload you'd like to associate with the vertexconnections
: a set of keys to vertices that are connected to this node (often called edges or links)parent
: the key to this vertex's parent, to be used in the BFS implementation later
Fundamental operations
We'll use a few basic operations to experiment with our new structure:
- Add a new vertex to the graph
- Get an existing vertex from the graph
- Create an edge, or connection, between two existing vertices
- Check if two vertices are connected
We'll implement each of these in a purely functional way, meaning the function performing each operation has no side effects (it just returns a new output based on the input) and the function's return value is always the same for the same inputs. This is a powerful way to program, drawing deep from the mighty roots of mathematics, and luckily Clojure makes it natural to work this way!
(defn add-vertex
"Add a new vertex with the given `key` and `value` to a `graph`."
[{:keys [vertices] :as graph} key value]
(->Graph (assoc
vertices
key
(->Vertex key value #{} nil))))
(defn get-vertex
[{:keys [vertices] :as graph} key]
(get vertices key))
(defn add-edge-directed
"Connects vertex `u` to `v` with a one-way / directed edge."
[u v]
(assoc u :connections (conj (:connections u) (:key v))))
(defn add-edge
"Creates an undirected edge between vertices `u` and `v`."
[{:keys [vertices] :as graph} u v]
(->Graph
;; Notice the thread "->" operation here, which inserts `vertices`
;; as the first argument of the two `assoc` calls
(-> vertices
(assoc (:key u) (add-edge-directed u v))
(assoc (:key v) (add-edge-directed v u)))))
(defn connected?
"Returns `true` if `u` has an edge to `v`; otherwise, returns `false`."
[u v]
(contains? (:connections u) (:key v)))
A couple of things to note here:
- Each function that "modifies” the graph (
add-vertex
,add-edge-directed
, andadd-edge
) returns a new Graph object, rather than editing the existing object. In terms of writing pure functions, this is an example of "no side effects”. - Our
add-edge
function returns an undirected edge, meaning that ifu
is connected tov
, thenv
is also connected tou
. Depending on your problem, you might want to instead useadd-edge-directed
to make a one-way connection.
In addition to the operations above, we'll add a couple of utility functions to print our graph:
;; These functions are, to use a word Nathaniel Hawthorne's grandfather would
;; use to describe our modern ways, "impure" because they have the side effect
;; of printing to the screen.
;; It turns out that all useful things are, indeed, impure.
(defn print-vertex [v]
(println (str
"\t" (:key v)
"\t- conn: " (:connections v)
"\t- parent: " (:parent v))))
(defn print-graph [{:keys [vertices] :as graph}]
(println "Graph:")
(doseq [v (map val vertices)]
(print-vertex v))
graph)
Now for the fun part: testing our new structure and seeing what it looks like!
Here we'll use the handy threading macro as->
to progressively transform our Graph object. Each call to add-vertex/edge
returns a new Graph, which the as->
macro assigns to the name g
for use in the next line.
(as-> (->Graph {}) g
(add-vertex g "A" 1)
(add-vertex g "B" 2)
(add-edge g (get-vertex g "A") (get-vertex g "B"))
(print-graph g))
;; -- Output --
;; Graph:
;; A - conn: #{"B"} - parent:
;; B - conn: #{"A"} - parent:
(as-> (->Graph {}) g
(add-vertex g "Answer" 42)
(add-vertex g "Question" "Meaning?")
(add-vertex g "Thanks" "Fish")
(add-vertex g "Bob" {:occupation :painting-genius})
(add-edge g (get-vertex g "Answer") (get-vertex g "Question"))
(add-edge g (get-vertex g "Answer") (get-vertex g "Thanks"))
(print-graph g)
(println
"Are Answer and Question connected? "
(connected? (get-vertex g "Answer") (get-vertex g "Question"))))
;; -- Output --
;; Graph:
;; Answer - conn: #{"Question" "Thanks"} - parent:
;; Question - conn: #{"Answer"} - parent:
;; Thanks - conn: #{"Answer"} - parent:
;; Bob - conn: #{} - parent:
;; Are Answer and Question connected? true
The Search they call Breadth First
Now for the actual Breadth-First Search algorithm! In pseudo-code, the algorithm can be described like this:
BFS(graph, start-node)
Mark all vertices in `graph` as "undiscovered"
Mark `start-node` as "discovered"
Create a FIFO (first-in first-out) queue `Q` initially containing `start-node`
While `Q` is not empty:
Pop the first element from `Q`, assign to `u`
Process vertex `u` as needed
For each vertex `v` adjacent to `u`:
Process edge `(u, v)` as needed
If `v` is "undiscovered":
Mark `v` as "discovered"
Set `v`'s parent to `u`
Add `v` to `Q`
Mark `u` as "completely explored"
This is in a somewhat imperative style, so we`ll make some adjustments to utilize the Power of the Lambda.
Our breadth first search function will accept a graph object and a starting vertex. It will return the graph with each vertex's :parent
property updated to the appropriate parent's key for a BFS path.
;; For ClojureScript programs, replace both instances of
;; `clojure.lang.PersistentQueue/EMPTY` below with `#queue []`
(use '[clojure.string :only (join)])
(defn bfs
"Return `graph` with each vertex's `:parent` property updated."
[graph start]
(loop [discovered clojure.lang.PersistentQueue/EMPTY
discovered-map {(:key start) true}
u start
new-graph graph]
(if (nil? u)
;; Base case: all vertices have been explored
new-graph
;; Queue any of the current node's neighbors that haven't been
;; discovered
(let [neighbor-keys
(filter #(not (contains? discovered-map %))
(:connections u))
neighbors
(map #(assoc
(get-vertex graph %) :parent (:key u))
neighbor-keys)
new-discovered
(as-> discovered ds
(into clojure.lang.PersistentQueue/EMPTY
(concat ds neighbors)))]
;; Further processing of `u` or the `(u, v)` edges can go here
(println (str "Exploring vertex " (:key u) ", neighbors: "
(join ", " neighbor-keys)))
;; Proceed to exploring the next vertice, adding `u`'s
;; neighbors to the discovered pile
(recur
(pop new-discovered)
(into discovered-map
(map #(hash-map % true) neighbor-keys))
(peek new-discovered)
;; Update each of `u`'s neighbors to show `u` is the parent
(reduce #(assoc-in %1 [:vertices %2 :parent] (:key u))
new-graph
neighbor-keys))))))
The
discovered-map
property here is just used to see which vertices have already been discovered, more efficiently than iterating over thediscovered
queue.
Appendix: Creating a Vertex Explorer with async.core
Notice the println
in the above code? You can replace that with any processing you need to do on the graph, such as recording results in another data structure or emitting your next Sudoku move.
But what if your program has multiple uses for the BFS, or you don't want to embed application logic inside of the graph algorithm? In this case, we can harness Clojure's async.core system to create a channel that emits each vertex as it gets explored.
This only requires a miniscule change to our function:
;; Note: for ClojureScript programs, replace both instances of
;; `clojure.lang.PersistentQueue/EMPTY` below with `#queue []`
(use '[clojure.string :only (join)]
'[cljs.core.async
:as async
:refer [>! <! go chan take]])
(defn bfs-channel
"Return a 2-tuple where the first element is `graph`, with each
vertex's `:parent` property updated, and the second element is an
async.core channel that emits each vertex as it is explored."
[graph start]
(let [out (chan)]
(loop [discovered clojure.lang.PersistentQueue/EMPTY
discovered-map {(:key start) true}
u start
new-graph graph]
(if (nil? u)
;; Base case: all vertices have been discovered
[new-graph
(take (count (:vertices new-graph)) out)]
;; Get all of the current node's neighbors that haven't been
;; discovered
(let [neighbor-keys
(filter #(not (contains? discovered-map %))
(:connections u))
neighbors
(map #(assoc
(get-vertex graph %) :parent (:key u))
neighbor-keys)
new-discovered
(as-> discovered ds
(into clojure.lang.PersistentQueue/EMPTY
(concat ds neighbors)))]
;; Emit the vertex being explored to `out`
(go (>! out u))
;; Proceed to exploring the next vertice, adding `u`'s
;; neighbors to the discovered pile
(recur
(pop new-discovered)
(into discovered-map
(map #(hash-map % true) neighbor-keys))
(peek new-discovered)
;; Update each of `u`'s neighbors to show `u` is the parent
(reduce #(assoc-in %1 [:vertices %2 :parent] (:key u))
new-graph
neighbor-keys)))))))
Now we can read from the out
channel, and do all processing outside of the bfs
function.
Freakin' sweet!
I hope you've enjoyed reading this article half as much as I've enjoyed playing with these graphs.
If you think of ways to improve this (I'm sure there are some!) or want to work with me on a project, feel free to shoot me a message!
Good luck, and happy hacking.