Outliving Outrage on the Public Interest Internet: the CDDB Story

This is the second in our blog series on the public interest internet: past, present and future.

In our previous blog post, we discussed how in the early days of the internet, regulators feared that without strict copyright enforcement and pre-packaged entertainment, the new digital frontier would be empty of content. But the public interest internet barn-raised to fill the gap—before the fledgling digital giants commercialised and enclosed those innovations. These enclosures did not go unnoticed, however—and some worked to keep the public interest internet alive.

Compact discs (CDs) were the cutting edge of the digital revolution a decade before the web. Their adoption initially followed Lehman’s rightsholder-led transition, in which existing publishers led the charge into a new medium, rather than the user-led homesteading of the internet. The existing record labels maintained control of CD production and distribution, and did little to exploit the new tech, but they did profit from bringing their old back catalogues onto the new digital format. The format was immensely profitable, because everyone re-bought their existing vinyl collections on CD. Beyond the improved fidelity of CDs, the music industry had no incentive to add new functionality to CDs or their players. When CD players were first introduced, they were sold exclusively as self-contained music devices: a straight-up replacement for record players that you could plug into speakers or your hi-fi “music centre,” but not much else. They were digital, but in no way online or integrated with any other digital technology.

The exception was the CD playing hardware that was incorporated into the latest multimedia PCs—a repurposing of the dedicated music playing hardware which sent the CD to the PC as a pile of digital data. With this tech, you could use CDs as a read-only data store, a fixed set of data, a “CD-ROM”; or you could insert a CD music disc, and use your desktop PC to read in and play its digital audio files through tinny desktop speakers, or headphones.

The crazy thing was that those music CDs contained raw dumps of audio, but almost nothing else. There was no bonus artist info stored on the CDs; no digital record of the CD title, no digital version of the CD’s cover image JPEG, not even a user-readable filename or two: just 74 minutes of untitled digital sound data, split into separate tracks, like its vinyl forebear. Consequently, a PC with a CD player could read and play a CD, but had no idea what it was playing. About the only additional information a computer could extract from the CD beyond the raw audio was the total number of tracks, and how long each track lasted. Plug a CD into a player or a PC, and all it could tell you was that you were now listening to Track 3 of 12.

Around the same time that movie enthusiasts were building the IMDb, music enthusiasts were solving this problem by collectively building their own compact disc database: the CD Database (CDDB). Programmer Ti Kan wrote open source client software that would auto-run when a CD was put into a computer, and grab the number of tracks and their lengths. This client would query a public online database (designed by another coder, Steve Scherf) to see if anyone else had seen a CD with the same fingerprint. If no one had, the program would pop up a window asking the PC user to enter the album details themselves, and would upload that information to the collective store, ready for the next user to find. All it took was one volunteer to enter the album info and associate it with the unique fingerprint of track durations, and every future CDDB client owner could grab the data and display it the moment the CD was inserted, and let its user pick tracks by their name, peruse artist details, and so on.
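
To make that lookup-or-contribute loop concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not the CDDB's actual code or its real disc identifier scheme: the fingerprint function, the in-memory SharedCatalogue standing in for the online database, and every name below are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Album:
    artist: str
    title: str
    tracks: list[str]


def fingerprint(track_seconds: list[int]) -> str:
    """Hypothetical fingerprint: a short hash of the track count and lengths.

    This is NOT the real CDDB disc ID; it only illustrates the idea that a
    disc's track layout can serve as a lookup key.
    """
    layout = f"{len(track_seconds)}:" + ",".join(str(s) for s in track_seconds)
    return hashlib.sha1(layout.encode("ascii")).hexdigest()[:8]


class SharedCatalogue:
    """In-memory stand-in (an assumption) for the shared online database."""

    def __init__(self) -> None:
        self._entries: dict[str, Album] = {}

    def lookup(self, track_seconds: list[int]) -> Optional[Album]:
        return self._entries.get(fingerprint(track_seconds))

    def submit(self, track_seconds: list[int], album: Album) -> None:
        self._entries[fingerprint(track_seconds)] = album


def identify(
    catalogue: SharedCatalogue,
    track_seconds: list[int],
    ask_user: Callable[[list[int]], Album],
) -> Album:
    """Return the album for a disc, asking the listener only if it is unknown."""
    album = catalogue.lookup(track_seconds)
    if album is None:
        # Nobody has seen this disc yet: the listener types in the details,
        # and the entry is stored for everyone who inserts the disc later.
        album = ask_user(track_seconds)
        catalogue.submit(track_seconds, album)
    return album
```

In this sketch, the first listener to insert a given disc is handed to ask_user and does the data entry; anyone who later calls identify with the same track lengths gets the stored album back without being asked.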

When it started, most users of the CDDB had to precede much of their music-listening time with a short burst of volunteer data entry. But within months, the collective contributions of the Internet’s music fans had created a unique catalogue of current music that far exceeded the information contained even in expensive, proprietary industry databases. Deprived of any useful digital accommodations by the music industry, CD fans, armed with the user-empowering PC and the internet, built their own solution.

This story, too, does not have a happy ending. In fact, in some ways the CDDB is the most notorious tale of enclosure on the early Net. Kan and Scherf soon realised the valuable asset that they were sitting on, and along with the hosting administrator of the original database server, built it into a commercial company, just as the overseers of Cardiff’s movie database had. Between 2000 and 2001, as “Gracenote”, this commercial company shifted from a free service, incorporated by its many happy users into a slew of open source players, to serving hardware companies, which it charged for a CD recognition service. It moved its client software to a closed proprietary license, attached restrictive requirements to any code that used its API, and eventually blocked outright any clients that did not agree to its license.

The wider CDDB community was outraged, and the bitterness persisted online for years afterwards. Five years later, Scherf defended his actions in a Wired magazine interview. His explanation was the same as that of IMDb’s founders: that finding a commercial owner and business model was the only way to fund CDDB as a viable ongoing concern. He noted that other groups of volunteers, notably an alternative service called freedb, had forked the database and client code from a point just before Gracenote locked it up. He agreed that was their right, and encouraged them to keep at it, but expressed scepticism that they would survive. “The focus and dedication required for CDDB to grow could not be found in a community effort,” he told Wired. “If you look at how stagnant efforts like freedb have been, you’ll see what I mean.” By locking down and commercializing CDDB, Scherf said that he “fully expect[ed] our disc-recognition service to be running for decades to come.”

Scherf may have overestimated the lifetime of CDs, and underestimated the persistence of free versions of the CDDB. While freedb closed last year, Gnudb, an alternative derived from freedb, continues to operate. Its far smaller set of contributors doesn’t cover as many of the latest CD releases, but its data remains open for everyone to use, not just for the remaining CD diehards, but also as a permanent historical record of the CD era’s back catalogue: its authors, its releases, and every single track. Publicly available, publicly collected, and publicly usable, in perpetuity. Whatever criticisms might be laid at the feet of this form of the public interest internet, fragility is not one of them. It hasn’t changed much, which may count as stagnation to Scherf, especially compared to the multi-million dollar company that Gracenote has become. But as Gracenote itself was bought up (first by Sony, then by Nielsen), re-branded, and re-focused, its predecessor has distinctly failed to disappear.

Some Internet services do survive and prosper by becoming the largest, or by being bought by the largest. These success stories are very visible, if not organically, then because they can afford marketers and publicists. If we listen exclusively to these louder voices, our assumption would be that the story of the Internet is one of consolidation and monopolization. And if—or perhaps just when—these conglomerates go bad, their failings are just as visible.

But smaller stories, successful or not, are harder to see. When we dive into this area, things become more complicated. Public interest internet services can be engulfed and transformed into strictly commercial operations, but they don’t have to be. In fact, they can persist and outlast their commercial cousins.

And that’s because the modern internet, buffeted as it is by monopolies, exploitation, and market and regulatory failure, still allows people to organize at low cost, with high levels of informality, in a way that can often be more efficient, flexible and antifragile than strictly commercial, private interest services, or the centrally-planned government production of public goods.

Next week: we continue our look at music recognition, and see how public interest internet initiatives can not only hang on as long as their commercial rivals, but continue to innovate, grow, and financially support their communities.

