Sunday, May 24, 2009

A really fundamental data structure

There are many things I wish my textbooks had taught me. It's often not the fault of the authors: in any field, even the best ideas suffer a lag between discovery and dissemination. However, even accounting for this, it seems I'm always behind the times. Some glaring omissions I belatedly redressed:
But most of all, I bemoan my ignorance of binary decision diagrams (BDDs). In his Computer Musings lecture, Don Knuth states "it's one of the only really fundamental data structures that came out in the last 25 years". He felt it was so important that he devoted the following lecture to ZDDs, a close relative of BDDs.

Take a map of the USA, and consider the following problems:
  1. Suppose you want to visit every state capitol exactly once, traveling on highways. Which route is shortest? Which route is longest? What is the mean and standard deviation of all the routes?
  2. Weight the states by any means, for example, by reading their two-letter code as a number in base 36. Then find a set of states such that no two members of the set are adjacent, and the total weight is maximized.
  3. There are several ways to colour the map with four colours such that no two adjacent states have the same colour. How do we pick one of these colourings at random (uniformly)?
Or consider these:
  1. How many ways can you tile a chessboard using 1x1, 2x1 and 3x1 rectangles? What if we also have pieces that look like 2x2 boxes with one tile missing?
  2. List all 5-letter words that remain words after replacing a 'b' with 'o'.
Do you know how to solve these efficiently? If not, you might want to read up on BDDs, particularly Knuth's treatment, which has just hit the presses, and watch his lectures. Using these sources, I took some notes on ZDDs.

Learning about BDDs reminded me of learning about PĆ³lya theory, another simple yet powerful technique which transforms the seemingly impossible into child's play.

1 comment:

David said...

This is a fantastic post! I wonder what you think of Hopscotch hashing, which came out in 2008 and claims some advantages over Cuckoo hashing. Right now I'm really curious how Hopscotch performs in different contexts vs. crit-bit tries and their relatives...