Got.λ - Gothenburg Functional Programming Meetup
A case study involving graphs
Johan Lodin, 2018-10-24
github.com/jolod
I have added comments to the original presentation. These are always marked with "Comment:" if inline, or as separate slides (such as this one).
The code snippets are written in PureScript which is deceptively similar to Haskell for records. The key differences for this presentation is that you need to spell out type variables using forall in PureScript, and that records look very similar but work slightly differently.
A directed graph consists of
- a set of vertices
- a set of pairs of vertices (edges)
Example:
newtype SetGraph a = SetGraph
{ vertices :: Set a
, edges :: Set (Tuple a a) }newtype MapGraph a = MapGraph (Map a (Set a))data Algebraic a
= Empty
| Vertex a
| Overlay (Algebraic a) (Algebraic a)
| Connect (Algebraic a) (Algebraic a)(Also matrix.)
Will use MapGraph for the rest of the slides. Something to be aware of is that the interpretation of MapGraph $ Map.insert "a" (Set.singleton "b") Map.empty contains two vertices and one edge. That is, it is not sufficient to look at Map.keys to get all the vertices.
For an explanation of Algebraic, look at https://blogs.ncl.ac.uk/andreymokhov/an-algebra-of-graphs/.
Basic queries:
- vertex existence
- edge existence
- edges from/to
--
Basic modifications:
- add vertex
- add edge
- remove edge
- remove vertex (?)
--
Typical algorithms:
- reach from vertex
- shortest path
- transpose
- topological sorting (if acyclic)
--
Time complexity depends on data structure. I needed:
- Fast lookup of edges from a vertex (for e.g. reach).
- Fast reverse edge lookup (for e.g. topological sorting).
data DoubleMapGraph a = DoubleMapGraph {graph :: MapGraph a, transposed :: MapGraph a}--
empty :: forall a. DoubleMapGraph a
empty = DoubleMapGraph {graph: MapGraph.empty, transposed: MapGraph.empty}--
addVertex :: forall a. Ord a => a -> DoubleMapGraph a -> DoubleMapGraph a
addVertex vertex (DoubleMapGraph {graph, transposed}) = DoubleMapGraph
{ graph: MapGraph.addVertex vertex graph
, transposed: MapGraph.addVertex vertex transposed }--
addEdge :: forall a. Ord a => a -> a -> DoubleMapGraph a -> DoubleMapGraph a
addEdge from to (DoubleMapGraph {graph, transposed}) = DoubleMapGraph
{ graph: MapGraph.addEdge from to graph
, transposed: MapGraph.addEdge to from transposed }--
removeEdge :: forall a. Ord a => a -> a -> DoubleMapGraph a -> DoubleMapGraph a
removeEdge from to (DoubleMapGraph {graph, transposed}) = DoubleMapGraph
{ graph: MapGraph.removeEdge from to graph
, transposed: MapGraph.removeEdge to from transposed }You get the idea.
edgesFrom :: forall a. Ord a => a -> DoubleMapGraph a -> Maybe (Set a)
edgesFrom vertex (DoubleMapGraph {graph}) = MapGraph.edgesFrom vertex graph--
edgesTo :: forall a. Ord a => a -> DoubleMapGraph a -> Maybe (Set a)
edgesTo vertex (DoubleMapGraph {transposed}) = MapGraph.edgesFrom vertex transposed--
reachFrom :: forall a. Ord a => a -> DoubleMapGraph a -> Maybe (Set a)
reachFrom vertex (DoubleMapGraph {graph}) = MapGraph.reachFrom vertex graph--
reachTo :: forall a. Ord a => a -> DoubleMapGraph a -> Maybe (Set a)
reachTo vertex (DoubleMapGraph {transposed}) = MapGraph.reachFrom vertex transposed--
Annoying pattern.
Can do better! Suggestions?
graph (DoubleMapGraph {graph}) = graph
transpose (DoubleMapGraph {graph, transposed}) = DoubleGraph
{ graph: transposed
, transposed: graph }Bad idea if not immutable.
--
transposed = transpose >>> graph
edgesFrom vertex = MapGraph.edgesFrom vertex <<< graph
edgesTo vertex = MapGraph.edgesFrom vertex <<< transposed
reachFrom vertex = MapGraph.reachFrom <<< graph
reachTo vertex = MapGraph.reachFrom <<< transposed--
Need not be in the DoubleMapGraph module!
Compare with mutable OO and encapulation:
- Need to delegate every behavior of the underlying graph class.
- And guard against leaking a reference or allowing mutation in any way.
- Or, return a clone.
What we have so far:
data DoubleMapGraph a = DoubleMapGraph {graph :: MapGraph a, transposed :: MapGraph a}
empty :: forall a. DoubleMapGraph a
addVertex :: forall a. Ord a => a -> DoubleMapGraph a -> DoubleMapGraph a
addEdge :: forall a. Ord a => a -> a -> DoubleMapGraph a -> DoubleMapGraph a
removeEdge :: forall a. Ord a => a -> a -> DoubleMapGraph a -> DoubleMapGraph aSuggestions?
I anticipated the suggestion of a type class, and have a slide for that later, but ultimately I will not use type classes.
Need to loop over all vertices and remove incoming edges as well.
removeAll :: forall a. Ord a => a -> MapGraph a -> MapGraph a
removeAll vertex (MapGraph m) = MapGraph $
map (Set.delete vertex) $ Map.delete vertex mBut we know which vertices point to it.
This is one of the reasons we want O(1) transpose!
Consider how this works with the type class approach. (It doesn't work well.)
If we have the transposed graph, we can make removeAll much more efficient. We just look up all the vertices that point to the vertex (i.e. a single map lookup in the transposed graph) and then we just iterate over those values and removed the vertex from the sets.
module Graph where
class Graph g a where
empty :: g a
addVertex :: a -> g a -> g a
addEdge :: a -> a -> g a -> g a
removeEdge :: a -> a -> g a -> g a--
instance mapGraphGraph :: Ord a => Graph MapGraph a where
empty = MapGraph.empty
addVertex = MapGraph.addVertex
addEdge = MapGraph.addEdge
removeEdge = MapGraph.removeEdge--
module DoubleGraph where
data DoubleGraph g a = DoubleGraph {graph :: g a, transposed :: g a}
empty :: forall g a. Graph g a => DoubleGraph g a
addVertex :: forall g a. Graph g a => a -> DoubleGraph g a -> DoubleGraph g a
addEdge :: forall g a. Graph g a => a -> a -> DoubleGraph g a -> DoubleGraph g a
removeEdge :: forall g a. Graph g a => a -> a -> DoubleGraph g a -> DoubleGraph g a
removeEdge from to (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: Graph.removeEdge from to graph
, transposed: Graph.removeEdge to from transposed }Observation: all functions only use a single function each from Graph.
Ponder removeAll.
(You could have added more methods, such as vertices and edgesFrom etc, but since we will take a different direction I just examplify with the methods above.)
How would we write removeAll using a type class for the graph? By doing that, we are not allowing ourselves to exploit the fact that we know the transposed graph. We can only use the methods provided by Graph, and the implementation in Graph cannot know about our graph in transposed.
Next, we will see what happens if you don't use a type class, and just use plain functions as the tool for abstraction.
(cont.)
If we take the observation that we only need one method each from the type class to heart, we'd might rewrite it as
class GraphEmpty g a where empty :: g a
class GraphAddVertex where addVertex :: a -> g a -> g a
class GraphAddEdge where addEdge :: a -> a -> g a -> g a
class GraphRemoveEdge where removeEdge :: a -> a -> g a -> g a
empty :: forall g a. GraphEmpty g a => DoubleGraph g a
addVertex :: forall g a. GraphAddVertex g a => a -> DoubleGraph g a -> DoubleGraph g a
addEdge :: forall g a. GraphAddEdge g a => a -> a -> DoubleGraph g a -> DoubleGraph g a
removeEdge :: forall g a. GraphRemoveEdge g a => a -> a -> DoubleGraph g a -> DoubleGraph g aBut now, what is the difference between GraphAddVertex g a => and (a -> g a -> g a) ->?
This hints to when you want to group several methods into a type class: if the methods are not completely independent and have some constraints. These are often called "laws". A simple such constraint is for the type class Ord. Ord requires two methods: == (inherited from Eq) and <=. Now, a == b implies a <= b. So it makes sense to say that for a certain type we have this set of behaviors that "work together".
Another example are the famous monad laws, which relate return and >>=.
data DoubleGraph g = DoubleGraph {graph :: g, transposed :: g}Note: no a.
Comment: There's no need to assume that our graph type has a type parameter. See the last slide for an example of this.
data DoubleGraph g = DoubleGraph {graph :: g, transposed :: g}empty :: forall g. g -> DoubleGraph g
empty empty' = DoubleGraph
{ graph: empty'
, transposed: empty' }Comment: Instead of using Graph.empty we pass that empty value for g in as an argument.
data DoubleGraph g = DoubleGraph {graph :: g, transposed :: g}empty :: forall g. g -> DoubleGraph g
empty empty' = DoubleGraph
{ graph: empty'
, transposed: empty' }addVertex :: forall g h a. (a -> g -> h) -> a -> DoubleGraph g -> DoubleGraph h
addVertex addVertex' vertex (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: addVertex' vertex graph
, transposed: addVertex' vertex transposed }Comment: The same principle is used for Graph.addVertex. The only difference is as that the argument is a function, and the return value is a function (remember that the type could have been written as forall g h a. (a -> g -> h) -> (a -> DoubleGraph g -> DoubleGraph h)).
Comment: We see that addVertex "lifts" a graph operation into a double graph operation.
data DoubleGraph g = DoubleGraph {graph :: g, transposed :: g}empty :: forall g. g -> DoubleGraph g
empty empty' = DoubleGraph
{ graph: empty'
, transposed: empty' }addVertex :: forall g h a. (a -> g -> h) -> a -> DoubleGraph g -> DoubleGraph h
addVertex addVertex' vertex (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: addVertex' vertex graph
, transposed: addVertex' vertex transposed }addEdge :: forall g h a. (a -> a -> g -> h) -> a -> a -> DoubleGraph g -> DoubleGraph h
addEdge addEdge' from to (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: addEdge' from to graph
, transposed: addEdge' to from transposed }--
removeEdge :: forall g h a. (a -> a -> g -> h) -> a -> a -> DoubleGraph g -> DoubleGraph h
removeEdge removeEdge' from to (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: removeEdge' from to graph
, transposed: removeEdge' to from transposed }--
How does addEdge and removeEdge differ?
Answer: By name only!
data DoubleGraph g = DoubleGraph {graph :: g, transposed :: g}empty :: forall g. g -> DoubleGraph g
empty empty' = DoubleGraph
{ graph: empty'
, transposed: empty' }updateVertex :: forall g h a. (a -> g -> h) -> a -> DoubleGraph g -> DoubleGraph h
updateVertex updateVertex' vertex (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: updateVertex' vertex graph
, transposed: updateVertex' vertex transposed }updateEdge :: forall g h a. (a -> a -> g -> h) -> a -> a -> DoubleGraph g -> DoubleGraph h
updateEdge updateEdge' from to (DoubleGraph {graph, transposed}) = DoubleGraph
{ graph: updateEdge' from to graph
, transposed: updateEdge' to from transposed }Rename addVertex to updateVertex and rename addEdge/removeEdge to updateEdge.
--
- Can change type!
updateVertextoo powerful, can break the invariant.- We can remove a key from the map in
MapGraph, which will remove the key in the transpose as well. The two graphs are now no longer transposes of each other. - Actually a feature?
- We can remove a key from the map in
Reminder:
data DoubleGraph g = DoubleGraph
{ graph :: g
, transposed :: g }
empty :: forall g. g -> DoubleGraph g
updateVertex :: forall g h a. (a -> g -> h) -> a -> DoubleGraph g -> DoubleGraph h
updateEdge :: forall g h a. (a -> a -> g -> h) -> a -> a -> DoubleGraph g -> DoubleGraph h--
module DoubleMapGraph where
empty = DoubleGraph.empty MapGraph.empty
addVertex = DoubleGraph.updateVertex MapGraph.addVertex
addEdge = DoubleGraph.updateEdge MapGraph.addEdge
removeEdge = DoubleGraph.updateEdge MapGraph.removeEdgeThat's it!
--
removeAll :: forall a. Ord a => a -> DoubleGraph (MapGraph a) -> DoubleGraph (MapGraph a)
removeAll vertex dg =
DoubleGraph.updateVertex MapGraph._removeVertex vertex
$ reduce (flip removeEdge vertex) dg
$ fromMaybe Set.empty (edgesTo vertex dg)Note that I use MapGraph._removeVertex because it's not supposed to be used generally. It's behavior is "remove outgoing edges and if there are no incoming edges also remove the vertex". It might be what you want, but it's at the very least a terrible name. :-) (In fact, it is precisely the function you want for some topological sorting algorithms, since you remove vertices only when there are no incoming edges.)
The DoubleMapGraph module defined on the previous slide is completely optional, but for me I mostly used the same underlying graph, and then I could just as well make use of that and utilize the benefits of an efficient removeAll.
If you have a Graph type class, you can make DoubleGraph an instance for all graphs that also have a Graph instance:
instance graphDoubleGraph :: Graph g => Graph (DoubleGraph g) where
empty = empty Graph.empty
addVertex = updateVertex Graph.addVertex
addEdge = updateEdge Graph.addEdge
removeEdge = updateEdge Graph.removeEdgeSo this is not in opposition of a type class, but we see now little code we need to provide the core functionality if we abstract over functions instead of going directly for a type class, and we see how reusable it is since we can use it to built both a specific type with optimized behavior as well as provide an implementation for a graph type class.
Result: a module that
- solves one problem,
- solves only that problem,
- provides behaviors à la carte
- with minimal boiler plate.
Conclusions:
- Profit from immutability.
- Only use type classes when needed.
- Be wary of "singleton abstractions".
- (The best interfaces have a single method (i.e. functions).)
It's really worth pondering what we actually did here. The DoubleGraph module only contains functions for manipulating vertices and edges. It's not really supposed to be used directly; specifically since updateVertex is a bit too powerful, but that power is utilized to provide more efficient implementations. So I think the module provides functions at exactly the right level. You can make sure that the power of addVertex is not abused by providing your own interface as I did in the last implementation of DoubleMapGraph. In that way it is akin to an abstract base class, which has access to protected members and behaviors.
We also made use of immutability to only focus the behaviors on only manipulation of vertices and edges, without sacrificing any reusability when it comes to all other behaviors, since we provided graph and transpose.
How to not use three different but equivalent graph modules at the same time
PureScript supports row-polymorphic records.
topoSort :: forall graph a f r. Ord a => Foldable f
=> { removeEdge :: a -> a -> graph -> graph
, removeVertex :: a -> graph -> graph
, isEmpty :: graph -> Boolean
, edgesFrom :: a -> graph -> Set a
| r }
-> { vertices :: f a
, graph :: graph
, transposed :: graph }
-> Either graph (List a)This is just taking the ideas from DoubleGraph "to the extreme". The first argument provides all the graph behaviors the algorithm requires. The second argument takes information about the graph. vertices :: f a is a collection of all the starting vertices, graph :: graph is the normal graph and transposed :: graph is the transposed graph.
Why do I not provide a transpose function? Because that might just be const transposed in case I have the transposed graph, if not I just transpose the graph before passing it to topoSort. I lose nothing by doing it this way. It's your responsibility to make sure that transposed is indeed the transpose of graph. Since you pass transpose as an argument you have no guarentee that it's well behaved. (You could create types to more or less guarentee the transposed is actually the transpose, but what's the benefit, really? You almost have to make an effort to mess up.)
subGraph :: forall a f graph. Ord a => Foldable f
=> (a -> f a -> graph -> graph)
-> graph
-> (a -> f a)
-> List a
-> graph
subGraph addEdges empty adjacent vertices =
go {visited: Set.empty, queue: vertices, graph: empty}
where
go :: {queue :: List a, graph :: graph, visited :: Set a} -> graph
go {graph, queue: List.Nil} = graph
go {graph, visited, queue: List.Cons vertex queue} =
if Set.member vertex visited then
go {visited, queue, graph}
else
let
neighbors = adjacent vertex
in
go
{ visited: Set.insert vertex visited
, queue: List.fromFoldable neighbors <> queue
, graph: addEdges vertex neighbors graph }adjacentcan trim a graph on the fly, no need to create a new graph.graphneed not be a graph!
reach :: forall a f. Ord a => Foldable f
=> (a -> f a)
-> a
-> Set a
reach adjacent start =
subGraph addEdges Set.empty adjacent (List.fromFoldable (adjacent start))
where
addEdges vertex _ vertices = Set.insert vertex verticesCompare with a type class version of subGraph.
subGraph' :: forall a f graph. Ord a => Foldable f
=> { addEdges :: a -> f a -> graph -> graph
, member :: a -> graph -> Boolean
, empty :: graph }
-> (a -> f a)
-> List a
-> graph
subGraph' {addEdges, member, empty} adjacent vertices =
go {graph: empty, queue: vertices}
where
go :: {queue :: List a, graph :: graph} -> graph
go {graph, queue: List.Nil} = graph
go {graph, queue: List.Cons vertex queue} =
if member vertex graph then -- CHANGED
go {queue, graph}
else
let
neighbors = adjacent vertex
in
go
{ queue: List.fromFoldable neighbors <> queue
, graph: addEdges vertex neighbors graph }- No
visited; instead sees if the vertex is a member of the graph. emptyneed not be empty.- Can incrementally add to a subgraph without doubled work.
- Comment: consider first taking the subgraph starting at vertex 5 in the example above, and then take the subgraph starting at vertex 7 and then merge them. Vertices 11, 2, 9, and 10 will be doubly visited.
What if the graph implementation is slow for member querying?
subGraph addEdges' empty' adjacent vertices =
_.graph $ subGraph' {addEdges, member, empty} adjacent vertices
where
empty = {visited: Set.empty, graph: empty'}
addEdges vertex neighbors {visited, graph} =
{ visited: Set.insert vertex visited
, graph: addEdges' vertex neighbors graph }
member vertex {visited} = Set.member vertex visitedAnother trick, if you wanted a topological sort afterwards, is to also construct the transpose, rename visisted to vertices, and remove _.graph $ and then pass the return value directly to topoSort.
type IO a = Eff (console :: CONSOLE) a
addVertex :: forall a. Show a => a -> IO Unit -> IO Unit
addVertex a eff = do
eff
log $ "Added vertex: " <> show a
addEdge :: forall a. Show a => a -> a -> IO Unit -> IO Unit
addEdge a b eff = do
eff
log $ "Added edge: " <> show a <> " -> " <> show b
main = do
let empty' = DoubleGraph.empty $ pure unit
let addVertex' = DoubleGraph.updateVertex addVertex
let addEdge' = DoubleGraph.updateEdge addEdge
let dg =
empty'
# addEdge' 1 2
# addVertex' 3
# addEdge' 1 3
log "Graph:"
DoubleGraph.graph dg
log "Transposed:"
DoubleGraph.graph $ DoubleGraph.transpose dgGraph:
Added edge: 1 -> 2
Added vertex: 3
Added edge: 1 -> 3
Transposed:
Added edge: 2 -> 1
Added vertex: 3
Added edge: 3 -> 1
The point of the previous slide is to show that you do not need to think of a graph data type in the usual sense, and this is also an example of why you do not need g a and only g in the definition of DoubleGraph.
One can also ponder what would be required to do this if DoubleGraph was a type class. You would have to provide a Graph instance for IO, but probably you would create a newtype instead. Also notice that this is a write-only double graph.
By only using functions we get a lot of flexibility! Who could have guessed that functions were so useful! ;-)
Thank you!
github.com/jolod
