Wednesday, June 6, 2007

Solving Collatz Sequences

Steve, over at cod3po37ry, is learning Haskell and working through some contest problems, starting with the Collatz sequence. Steve doesn't mention it explicitly, but this is an interesting unsolved problem in mathematics. The problem starts out with a simple function over a single positive integer n:
• if n is even, halve it
• if n is odd, triple it and add one
A sequence can be generated by starting with some positive integer n, generating the result of this function, and continually applying this function on the series of results until it produces a value of 1:
`f 2 = [2,1]f 3 = [3,10,5,16,8,4,2,1]f 4 = [4,2,1]f 5 = [5,16,8,4,2,1]f 6 = [6,3,10,5,16,8,4,2,1]f 7 = [7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1] `
It's an open question as to whether or not this function always converges on a value of 1 for any positive integer n. Wikipedia claims that this property has been proven up through at least 10 × 258, but it's unclear if it's true for all positive integers.

Steve's solution is a little, um, cumbersome, and he makes it clear that his goal is to learn:
This post follows the Law of Usenet: if you post a question, nobody will answer it. If you post an answer, they'll all rush to correct it. I welcome corrections.
In that spirit, here is my response. :-)

Unlike most languages I have used, Haskell has a very interesting property -- if you find yourself writing a lot of code, chances are you're doing something wrong. Generally, this is because the Prelude and the standard library have a rich set of tools for building solutions from the bottom up. Writing a lot of code usually means that you're ignoring a key abstraction somewhere. Haskell programs tend to be short because they can be short; many common problems have generalized solutions in the library already. If you find a general problem that's not solved in the standard library, implementing a general solution typically involves writing one missing function.

The first step to solving this problem involves the collatz function which implements the interesting 3n + 1 property:
`collatz :: Int -> Intcollatz 1 = 1collatz n = if (odd n)             then (3 * n + 1)             else n `div` 2`
Next, there's the function to produce a "collatz sequence" starting with a number n and (hopefully) terminating with the number 1. Fortunately, this behavior is precisely what the iterate function in the Prelude provides:
`iterate :: (a -> a) -> a -> [a]`
That is, iterate takes a function and a seed value, and returns a list that contains the seed, and an infinite sequence of values produced by applying the function to the previous value.

Therefore, expanding this simple collatz function to a function that produces a collatz sequence is simply:
`collatzSequence :: Int -> [Int]collatzSequence = iterate collatz`
Actually, that's not quite correct, since a collatz sequence terminates with a value of 1:
`collatzSequence :: Int -> [Int]collatzSequence = terminate . iterate collatz where   terminate (1:_) =    terminate (x:xs) = x:terminate xs`
Sure enough, this function works as expected:
`*Main> collatzSequence 7[7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1]`
That solves the problem of generating collatz sequences. But the contest problem was to find the longest collatz sequence within a range of numbers. Given:
`1 10100 200201 210900 1000`
produce this output:
`1 10 20100 200 125201 210 89900 1000 174`
Generating these results is simple enough. First, convert a collatz sequence to its length:
`*Main> length \$ collatzSequence 717`
Next, convert a range of integers into their collatz lengths:
`*Main> map (length . collatzSequence) [1..10][1,2,8,3,6,9,17,4,20,7]`
Next, pick out the largest sequence in the range:
`*Main> maximum \$ map (length . collatzSequence) [1..10]17`
Next, consume a line of input, perform this calculation, and produce a line of output:
`run :: String -> IO ()run s = do let (i:j:_) = map read (words s)              m = maximum \$ map (length . collatzSequence) [i..j]          putStrLn (concat [show i, " ", show j, " ", show m])`
And finally, write the main function that consumes all input and produces the desired output:
`main = do inp <- getContents           mapM_ run (lines inp)`
Here is the completed program in its entirety:
`collatz :: Int -> Intcollatz 1 = 1collatz n = if (odd n)              then (3 * n + 1)              else n `div` 2collatzSequence :: Int -> [Int]collatzSequence = terminate . iterate collatz  where    terminate (1:_) =     terminate (x:xs) = x:terminate xsrun :: String -> IO ()run s = do let (i:j:_) = map read (words s)               m = maximum \$ map (length . collatzSequence) [i..j]           putStrLn (concat [show i, " ", show j, " ", show m])main = do inp <- getContents          mapM_ run (lines inp)`
Hope this helps!