Tuesday, December 25, 2007

Taxicab Numbers

Futurama is one of my favorite TV shows. It's the quintessential show about nerds, produced by nerds, for nerds, and it's chock full of inside jokes in math, science, sci-fi and pop culture.

I'm happy to note that although Futurama, the series, was cancelled after the fourth season, the show is back with a set of direct-to-DVD features. The first one, Bender's Big Score, is available now, and includes lecture by Sarah Greenwald as a special feature on the various math jokes that have appeared in the show. Some are silly, like π-in-one oil, or Klein's Beer (in a Klein bottle, of course). Others are subtle, like the repeated appearance of taxicab numbers.

Taxicab numbers are an interesting story in and of themselves. The story comes to us from G. H. Hardy, who was visiting Srinivasa Ramanujan:

I remember once going to see him when he was lying ill at Putney. I had ridden in taxi-cab No. 1729, and remarked that the number seemed to be rather a dull one, and that I hoped it was not an unfavourable omen. "No", he replied, "it is a very interesting number; it is the smallest number expressible as the sum of two [positive] cubes in two different ways."

Many of the staff involved in creating Futurama received advanced degrees in math, computer science and physics, so they enjoy making these obscure references, because they know their fans will find them. It shouldn't be surprising that the number 1729 makes many appearances in Futurama, or that the value 87539319 appears as an actuall taxicab number (87539319 is the sum of 2 cubes in 3 different ways).

Of course, the idea of taxicab numbers is interesting, the details less so. After watching the new feature, I wondered what Hardy's taxicab number was, and what two pairs of cubes add up to that value. If I were more mathematically inclined, I'd probably sit down with pencil and paper and just do the math until I found something. If I were truly lazy, I'd just Google for the answer. Instead, I sat down, fired up vim and typed this code up, pretty much as you see it here:

cube x = x * x * x

taxicab n = [(cube a + cube b, (a, b), (c, d))
| a <- [1..n],
b <- [(a+1)..n],
c <- [(a+1)..n],
d <- [(c+1)..n],
(cube a + cube b) == (cube c + cube d)]

I want to find the smallest taxicab number, but searching an unbounded range will not lead to a result. Therefore, it's necessary to search a bounded range. If there are any taxicab numbers among the integers [1..n], then this function will find them. Moreover, if there are many solutions to this problem in that range, this function will find all of them.

I could go into details, but my explanation would not be as clear and precise as the code above.

Here are some results:

*Main> taxicab 10
[]
*Main> taxicab 12
[(1729,(1,12),(9,10))]
*Main> taxicab 20
[(1729,(1,12),(9,10)),(4104,(2,16),(9,15))]

Ordinarily, I wouldn't blog about something so trivial, but as I stared at the code and the output from ghci, it reminded me of an incident in college. The Chudnovsky brothers came up with a new algorithm to compute π that was both very precise and very quick. A professor of mine wrote a quick, throwaway implementation of this algorithm in Maple to demonstrate it. Most of my work in those days was in C. Maple wasn't a tool I'd normally use, but it was easily the best tool for the job.

The lesson my professor was trying to teach me wasn't to use computer algebra systems to indulge a fixation with transcendental numbers. Instead, he was trying to show me that a true computer scientist is presented with a problem, writes a program to solve it, and moves onto the next problem.

I could have written my taxicab number search in C or Java, but I would have lost interest somewhere between #include <stdio.h> or public static void main(String args[]) and firing up the compiler. I could have written the code in Perl, but I would have likely lost interest around the time I saw a nested for loop, and contemplated the four levels of nesting.

Perl programmers know Larry Wall's vision for Perl - to make the easy things easy and the hard things possible. As I stared at this code, I realized that Larry's dictum misses an important cornerstone: to make the trivial things trivial.

Haskell may get a bad rap as making hard things easy and easy things hard, but it does make trivial things trivial. Whether it's list comprehensions solve a problem in a single statement, or parser combinators that make parsing code read like the grammar it implements, there's a lot to be said for a language that gets out of your way and lets you say precisely what you mean.

6 comments:

Anonymous said...

For Pythonistas:

def taxicab(n):
for a in xrange(1, n):
for b in xrange(a+1, n):
for c in xrange(a+1, n):
for d in xrange(c+1, n):
if a**3 + b**3 == c**3 + d**3:
yield a**3 + b**3, (a, b), (c, d)

That builds a generator.

Of course, that also can be done with list comprehensions, borrowed from Haskell:

def taxicab(n):
return [(a**3 + b**3, (a, b), (c, d)) for a in xrange(1, n) for b in xrange(a+ 1, n) for c in xrange(a+1, n) for d in xrange(c+1, n) if a**3 + b**3 == c**3 + d**3]

Anonymous said...

Could not agree more and I wish more people especially from the computer science and math fields read your article. Too many people ignore what they are trying to do and focus on the how - or - get into "religious" wars on which language/editor/compiler/tool to use.

Andrew Dalke said...

Python based its list comprehensions on Haskell so the implementation there is very similar:

def taxicab(n):
  return [(a**3+b**3, (a,b),(c,d))
      for a in range(n)
      for b in range(a+1,n)
      for c in range(a+1,n)
      for d in range(c+1, n)
        if a**3+b**3==c**3+d**3]

>>> taxicab(11)
[]
>>> taxicab(13)
[(1729, (1, 12), (9, 10))]
>>> taxicab(21)
[(1729, (1, 12), (9, 10)), (4104, (2, 16), (9, 15))]
>>>

Anonymous said...

in Python:

def cube(x):
return x * x * x

def taxicab(n):
return [(cube(a) + cube(b), (a, b), (c, d))
for a in range(1, n + 1)
for b in range(a + 1, n + 1)
for c in range(a + 1, n + 1)
for d in range(c + 1, n + 1)
if cube(a) + cube(b) == cube(c) + cube(d)]

Anonymous said...

In this case with a little care you can avoid bounding the search.

Suppose we want a≤b and c≤d.
If we also choose a<c it is easy to see that we must have b>d and this means that a<b (equality isn't possible).

This is enough to write out the solution:

taxi = [(cube a + cube b, (a, b), (c, d))
          | b <- [1..],
             a <- [1..b-1],
             c <- [a+1..b-1],
             d <- [c..b-1],
             cube a + cube b == cube c + cube d]

Yin said...

Taxicab numbers is a really interesting and non-trivial problem, and this is a good chance for demonstrating the power of Haskell.

But your code isn't the best way to do it, because any language even assembly can do the same thing, which is essentially nested for-loops. Writing this in C is also fun and quite throwaway.

#include <stdio.h>
#include <stdlib.h>

int cubesum (int x, int y)
{
return x*x*x + y*y*y;
}

int main(int ac, char **av)
{
int n = 20;
if (ac > 0) n = atoi(av[1]);

for (int a = 1; a <= n; a++)
for (int b = a+1; b <= n; b++)
for (int c = a+1; c <= n; c++)
for (int d = c+1; d <= n; d++) {
int ab = cubesum(a,b);
if (ab == cubesum(c,d))
printf("(%d,(%d,%d),(%d,%d))\n", ab, a, b, c, d);
}
}

$ ./a.out 20
(1729,(1,12),(9,10))
(4104,(2,16),(9,15))

Trivial things are also trivial in C or any other language.

The point here is that: don't let our emotions be against other languages and restrict the ways of thinking.

Haskell is much higher level than C, and it lets us use much more abstraction, but it hides so much about complexity, as if everything is for free. But there is no free lunch, especially when we stop to reason about it! Thinking about how you could write the program in C would help you know the complexity and clarify your ideas. You don't need to actually write it down, but you will be much happier when you have proved that the few lines of Haskell code actually achieves the same complexity with the best C code you could write for the same problem.

I doubt you can get the third taxicab number(87539319) in reasonable time, because with for-loops you will have O(n^6) complexity.

The point here is that: thinking in a restricted way of patterns, list comprehensions, generics will prevent us from discovering some natural solution that could be discovered if we were using a low level language such as C. To see how this works, after thinking in C (or matlab) for some time you will find that we don't actually need to compare every possible pairs of numbers. The cube sums x^3+y^3 actually form such a matrix:

2 9 28 65 126 217 ...
9 16 35 72 133 224 ...
28 35 54 91 152 243 ...
65 72 91 128 189 280 ...
126 133 152 189 250 341 ...
217 224 243 280 341 432 ...
... ... ... ... ... ... ...


If we do a contour plot with some tool like Matlab or Mathematica, we will see that the numbers are ascending from north-west to south-east like parabolic waves. We just need to use some data structure to keep the wavefront in a sorted manner and enumerate the pairs in the direction the wave propagates, and try to find two-in-a-rows (or three-in-a-rows) with the same cube sum. This will reduce time complexity to O(n^2*log(n)). This is still pretty bad and it will fail to find the fourth taxicab number, but it can find the third one in seconds. It will really take some effort to find the fifth one or prove that taxicab(n) exists for every n!

To program this accelerated algorithm with a conventional language isn't too hard, but it could take quite some time for me to fiddle with my home-made heaps or balanced trees. Only in this case will Haskell help.