Joachim Breitner's Homepage
Infinite loops in Haskell
Just some small thoughts about cyclic lists in Haskell. Every Haskell beginner knows that you can use infinite lists, as long as you don’t fully evalute them. So, this is perfectly valid
endless = [0..] -- All natural numbers! main = print (endless !! 10)
It will not crash, but print "10" as the list has not been fully used.
What happens if we take a piece of the list further down, let’s say at position 1000000000:
endles = [0..] main = print (endless !! (10^9) )
If you try this at home, better run "ulimit -S -v 1000000
" before, because then you’ll get "test: out of memory (requested 1048576 bytes)
" instead of a sluggishly swapping machine. What happened? The long list will be evaluated, and fills the memory.
Does this mean that we can not use arbitrary long lists? Let’s try a special case: A list that is infinitely long, but repeating the same values all over again:
list = "Oh, happy day! Oh, happy day. "
cycle' list = list ++ cycle' list
main = do let rlist = cycle' list
print ( rlist !! (10^9), rlist !! 0 )
We repeat the (finite) list endlessly, and then try to pick the 1000000000th element. We also pick the first element again, to make sure the compiler does not cheat by forgetting the first 999999999 elements (It’s actually pretty nice that the compiler will forget these elements, but not what I want to demonstrate here). Running that code, sure enough, fills up the memory.
But maybe I was coding badly. At least I have re-implemented a function that already exists, which is bad practice. Let’s try with Haskell’s own cycle:
list = "Oh, happy day! Oh, happy day. "
main = do let rlist = cycle list
print ( rlist !! (10^9), rlist !! 0 )
Now it takes a while, but surprisingly, the memory gauge does not skyrocket, and in the end I’m told that the 1000000000th character in my infinite character string is 'd'. This leads me to the conclusion that the Haskell library uses black magic. Or does it? Here is the definition of cycle:
cycle xs = xs' where xs' = xs ++ xs'
What is the difference to our cycle'? Here, the result is given a name (xs'
), which is used again inside the function. So while our cycle' appends the list over and over again, filling up the RAM, their cycle ties a loop and makes the end of the list refer to it’s beginning. And my list lookup will no longer evaluate the list up to infinity, but just run around in, well, cycles until it has counted down from 1000000000. I could even ask for the last element of this list, and it will not use any more RAM than a small, finite list, while endlessly searching for the end of the list.
Despite Haskell being a very high level language, I sometimes wonder how my data will look like on the physical memory. And as you can see, it can make a difference. Some more thoughts on this were written down by Duncan Coutts.
Comments
It was interesting to re-discover my thoughts from five years ago.
It still seems to be an open topic.
Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.