"
  1. Web 2.0 and Databases Part 1: Second Life: Like everybody else, we started with One Database All Hail The Central Database, and have subsequently been forced into clustering. However, we've eschewed any of the general purpose cluster technologies (mysql cluster, various replication schemes) in favor of explicit data partitioning. So, we still have a central db that keeps track of where to find what data (per-user, for instance), and N additional dbs that do the heavy lifting. Our feeling is that this is ultimately far more scalable than black-box clustering.

  2. Database War Stories #2: bloglines and memeorandum: Bloglines has several data stores, only a couple of which are managed by "traditional" database tools (which in our case is Sleepycat). User information, including email address, password, and subscription data, is stored in one database. Feed information, including the name of the feed, description of the feed, and the various URLs associated with feed, are stored in another database. The vast majority of data within Bloglines however, the 1.4 billion blog posts we've archived since we went on-line, are stored in a data storage system that we wrote ourselves. This system is based on flat files that are replicated across multiple machines, somewhat like the system outlined in the Google File System paper, but much more specific to just our application. To round things out, we make extensive use of memcached to try to keep as much data in memory as possible to keep performance as snappy as possible.

  3. Database War Stories #3: Flickr: tags are an interesting one. lots of the 'web 2.0' feature set doesn't fit well with traditional normalised db schema design. denormalization (or heavy caching) is the only way to generate a tag cloud in milliseconds for hundereds of millions of tags. you can cache stuff that's slow to generate, but if it's so expensive to generate that you can't ever regenerate that view without pegging a whole database server then it's not going to work (or you need dedicated servers to generate those views - some of our data views are calculated offline by dedicated processing clusters which save the results into mysql)."

Dare