Joachim Breitner's Homepage
More recursive definitions
Haskell is a pure and lazy programming language, and the laziness allows us to write some algorithms very elegantly, by recursively referring to already calculated values. A typical example is the following definition of the Fibonacci numbers, as an infinite stream:
= 0 : 1 : zipWith (+) fibs (tail fibs) fibs
Elegant graph traversals
A maybe more practical example is the following calculation of the transitive closure of a graph:
import qualified Data.Set as S
import qualified Data.Map as M
type Graph = M.Map Int [Int]
transitive1 :: Graph -> Graph
= M.map S.toList sets
transitive1 g where
sets :: M.Map Int (S.Set Int)
= M.mapWithKey (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- vs ])) g sets
We represent graphs as maps from vertex to their successors vertex, and define the resulting map sets
recursively: The set of reachable vertices from a vertex v
is v
itself, plus those reachable by its successors vs
, for which we query sets
.
And, despite this apparent self-referential recursion, it works!
> transitive1 $ M.fromList [(1,[3]),(2,[1,3]),(3,[])]
ghci1,[1,3]),(2,[1,2,3]),(3,[3])] fromList [(
Cyclic graphs ruin it all
These tricks can be very impressive … until someone tries to use it on a cyclic graph and the program just hangs until we abort it:
> transitive1 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
ghci1,fromList ^CInterrupted. fromList [(
At this point we are thrown back to implement a more pedestrian graph traversal, typically keeping explicit track of vertices that we have seen already:
transitive2 :: Graph -> Graph
= M.fromList [ (v, S.toList (go S.empty [v])) | v <- M.keys g ]
transitive2 g where
go :: S.Set Int -> [Int] -> S.Set Int
= seen
go seen [] :vs) | v `S.member` seen = go seen vs
go seen (v:vs) = go (S.insert v seen) (g M.! v ++ vs) go seen (v
I have written that seen
/todo
recursion idiom so often in the past, I can almost write it blindly And indeed, this code handles cyclic graphs just fine:
ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]
But this is a bit anticlimactic – Haskell is supposed to be a declarative language, and transitive1
declares my intent just fine!
We can have it all
It seems there actually is a way to write essentially the code in transitive1
, and still get the right result in all cases, and I have just published a possible implementation as rec-def
. In the module Data.Recursive.Set
we find an API that resembles that of Set
, with a type RSet a
, and in addition to conversion functions from and to sets, we find the two operations that we needed in transitive1
:
insert :: Ord a => a -> RSet a -> RSet a
unions :: Ord a => [RSet a] -> RSet a
get :: RSet a -> Set a
Let’s try that:
import qualified Data.Recursive.Set as RS
transitive2 :: Graph -> Graph
= M.map (S.toList . RS.get) sets
transitive2 g where
sets :: M.Map Int (RS.RSet Int)
= M.mapWithKey (\v vs -> RS.insert v (RS.unions [ sets M.! v' | v' <- vs ])) g sets
And indeed it works! Magic!
ghci> transitive2 $ M.fromList [(1,[3]),(2,[1,3]),(3,[])]
fromList [(1,[1,3]),(2,[1,2,3]),(3,[3])]
ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]
To show off some more, here are small examples:
ghci> let s = RS.insert 42 s in RS.get s
fromList [42]
ghci> :{
let s1 = RS.insert 23 s2
s2 = RS.insert 42 s1
in RS.get s1
:}
fromList [23,42]
How is that possible? Is it still Haskell?
The internal workings of the RSet a
type will be the topic of a future blog post; let me just briefly mention that it uses unsafe features under the hood, and just keeps applying the equations you gave until a fixed-point is reached. Because it starts with the empty set and all operations provided by Data.Recursive.Set
are monotonous (e.g. no difference
) it will eventually find the least fixed point.
Despite the unsafe machinery under the hood, I claim that Data.Recursive.Set
is itself nicely safe, and does not destroy Haskell’s nice properties like purity, referential transparency and equational reasoning. If you disagree, I’d like to hear about it (here, on Twitter, Reddit or Discourse)! There is a brief discussion at the end of the tutorial in Data.Recursive.Example
.
More than sets
The library also provides Data.Recursive.Bool
for recursive equations with booleans (preferring False
) and Data.Recursive.DualBool
(preferring True
), and some operations like member :: Ord a => a -> RSet a -> RBool
can actually connect different types. I plan to add other data types (natural numbers, maps, Maybe
, with suitable orders) as demand arises and as I come across nice small example use-cases for the documentation (e.g. finding shortest paths in a graph).
I believe this idiom is practically useful in a wide range of applications (which of course all have some underlying graph structure – but then almost everything in Computer Science is a graph). My original motivation was a program analysis. Imagine you want to find out from where in your program you can run into a division by zero. As long as your program does not have recursion, you can simply keep track of a boolean flag while you traverse the program, keeping track a mapping from function names to whether they can divide by zero – all nice and elegant. But once you allow mutually recursive functions, things become tricky. Unless you use RBool
! Simply use laziness, pass the analysis result down when analyzing the function’s right-hand sides, and it just works!
Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.