Important update: I’ve re-evaluated the Twitter data and came up with what I think is a much more accurate representation of what’s going on.
Further update: The Twitter data is bad, bad, bad, rendering Andy’s post and most of this here post useless. Both jumps in Twitter activity in Nov 2006 and March 2007 are artificial in nature. See here for an update.
Update: A commenter noted that sometime in mid-March, Twitter stopped using sequential IDs. So that big upswing that the below graphs currently show is partially artificial. I’m attempting to correct now. This is the danger of doing this type of analysis with “data” instead of data.
In mid-March, Andy Baio noted that Twitter uses publicly available sequential message IDs and employed Twitter co-founder Evan Williams’ messages to graph the growth of the service over the first year of its existence. Williams co-founded Blogger back in 1999, a service that, as it happens, also exposed its sequential post IDs to the public. Itching to compare the growth of the two services from their inception, I emailed Matt Webb about a script he’d written a few years ago that tracked the daily growth of Blogger. His stats didn’t go back far enough so I borrowed Andy’s idea and used Williams’ own blog to get his Blogger post IDs and corresponding dates. Here are the resulting graphs of that data.[1]
The first one covers the first 253 days of each service. The second graph shows the Twitter data through May 7, 2007 and the Blogger data through March 7, 2002. [Some notes about the data are contained in this footnote.]
As you can see, the two services grew at a similar pace until around 240 days in, with Blogger posts increasing faster than Twitter messages. Then around November 21, 2006, Twitter took off and never looked back. At last count, Twitter has amassed five times as many messages as Blogger did, in just under half the time. But Blogger was not the slouch that the graph makes it out to be. Plotting the service by itself reveals a healthy growth curve:
From late 2001 to early 2002, Blogger doubled the number of messages in its database from 5M to 10M in under 200 days. Of course, it took Twitter just over 40 days to do the same and under 20 days to double again to 20M. The curious thing about Blogger’s message growth is that large events like 9/11, SXSW 2000 & 2001, new versions of Blogger, and the launch of blog*spot didn’t affect the growth at all. I expected to see a huge message spike on 9/11/01 but there was barely a blip.
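For a rough sense of what those doubling times mean day to day, here is a minimal sketch that converts a doubling time into an average compound daily growth rate. It assumes growth was steadily exponential between the two endpoints, which the data above doesn't establish, and it treats "under 200", "just over 40", and "under 20" days as exactly 200, 40, and 20. The function name is made up for illustration.

```python
def daily_growth_rate(doubling_days: float) -> float:
    """Average compound daily growth rate implied by a doubling time.

    Assumes smooth exponential growth over the whole period.
    """
    return 2 ** (1 / doubling_days) - 1


# Doubling times rounded from the figures quoted in the post.
periods = [
    ("Blogger, 5M to 10M", 200),
    ("Twitter, 5M to 10M", 40),
    ("Twitter, 10M to 20M", 20),
]

for label, days in periods:
    print(f"{label}: ~{daily_growth_rate(days):.2%} per day")
```

Under those assumptions, the rates come out to roughly 0.35%, 1.75%, and 3.5% per day, so each halving of the doubling time about doubles the daily rate.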
The second graph also shows that Twitter’s post-SXSW 2007 growth is real and not just a temporary bump…a bunch of people came to check it out, stayed on, and everyone messaged like crazy. However, it does look like growth is slowing just a bit if you look at the data on a logarithmic scale:
Actually, as the graph shows, the biggest rate of growth for Twitter didn’t occur following SXSW 2007 but after November 21.
As for why Twitter took off so much faster than Blogger, I came up with five possible reasons (there are likely more):
1. Twitter is easier to use than Blogger was. All you need is a web browser or mobile phone. Before blog*spot came along in August 2000, you needed web space with FTP access to set up a Blogger blog, not something that everyone had.
2. Twitter has more ways to create a new message than Blogger did at that point. With Blogger, you needed to use the form on the web site to create a post. To post to Twitter, you can use the web, your phone, an IM client, Twitterrific, etc. It’s also far easier to send data to Twitter programmatically…the NY Times account alone sends a couple dozen new messages into the Twitter database every day without anyone having to sit there and type them in.
3. Blogger was more strapped for cash and resources than Twitter is. The company that built Blogger ran out of money in early 2001 and nearly out of employees shortly after that. Hard to say how Blogger might have grown if the dot com crash and other factors hadn’t led to the severe limitation of its resources for several key months.
4. Twitter has a much larger pool of available users than Blogger did. Blogger launched in August 1999 and Twitter almost 7 years later in March 2006. In the intervening time, hundreds of millions of people, the media, and technology & media companies have become familiar and comfortable with services like YouTube, Friendster, MySpace, Typepad, Blogger, Facebook, and GMail. Hundreds of millions more now have internet access and mobile phones. The potential user base for the two probably differed by an order of magnitude or two, if not more.
5. But the biggest factor is that the social aspect of Twitter is built in and that’s where the super-fast growth comes from. With Blogger, reading, writing, and creating social ties were decoupled from each other but they’re all integrated into Twitter. Essentially, the top graph shows the difference between a site with social networking and one largely without. Those steep parts of the Twitter trend on Nov 21 and mid-March? That’s crazy insane viral growth[2], very contagious, users attracting more users, messages resulting in more messages, multiplying rapidly. With the way Blogger worked, it just didn’t have the capability for that kind of growth.
A few miscellaneous thoughts:
It’s important to keep in mind that these graphs depict the growth in messages, not users or web traffic. It would be great to have user growth data, but that’s not publicly available in either case (I don’t think). It’s tempting to look at the growth and think of it in terms of new users because the two are obviously related. More users = more messages. But that’s not a static relationship…perhaps Twitter’s userbase is not increasing all that much and the message growth is due to the existing users increasing their messaging output. So, grain of salt and all that.
What impact does Twitter’s API have on its message growth? As I said above, the NY Times is pumping dozens of messages into Twitter daily and hundreds of other sites do the same. This is where it would be nice to have data for the number of active users and/or readers. The usual caveats apply, but if you look at the Alexa trends for Twitter, pageviews and traffic seem to be leveling out. Compete, which only offers data as recently as March 2007, still shows traffic growing quickly for Twitter.
Just for comparison, here’s a graph showing the adoption of various technologies ranging from the automobile to the internet. Here’s another graph showing the adoption of four internet-based applications: Skype, Hotmail, ICQ, and Kazaa (source: a Tim Draper presentation from April 2006).
[Thanks to Andy, Matt, Anil, Meg, and Jonah for their data and thoughts.]
[1] Some notes and caveats about the data. The Blogger post IDs were taken from archived versions of Evhead and Anil Dash’s site stored at the Internet Archive and from a short-lived early collaborative blog called Mezzazine. For posts prior to the introduction of the permalink in March 2000, most pages output by Blogger didn’t publish the post IDs. Luckily, both Ev and Anil republished their old archives with permalinks at a later time, which allowed me to record the IDs.
The earliest Blogger post ID I could find was 9871 on November 23, 1999. Posts from before that date had higher post IDs because they were re-imported into the database at a later time so an accurate trend from before 11/23/99 is impossible. According to an archived version of the Blogger site, Blogger was released to the public on August 23, 1999, so for the purposes of the graph, I assumed that post #1 happened on that day. (As you can see, Anil was one of the first 2-3 users of Blogger who didn’t work at Pyra. That’s some old school flavor right there.)
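To make the "days since launch" alignment concrete, here is a minimal sketch of how (date, post ID) pairs can be turned into points on the graph's x-axis and into average posts per day. Only the two observations shown come from the notes above, and post #1 on launch day is the same assumption stated there. The function names are made up for illustration and this is not necessarily how the graphs were actually built.

```python
from datetime import date

# Blogger's public release; post #1 is ASSUMED to fall on this day.
LAUNCH = date(1999, 8, 23)

# (date, post ID) observations taken from the notes above.
observations = [
    (LAUNCH, 1),
    (date(1999, 11, 23), 9871),
]


def days_since_launch(d, launch=LAUNCH):
    """Number of days between the service's launch and date d."""
    return (d - launch).days


def average_daily_rates(obs):
    """Average new posts per day between consecutive observations.

    Returns (days since launch at the later observation, posts per day).
    """
    obs = sorted(obs)
    rates = []
    for (d0, id0), (d1, id1) in zip(obs, obs[1:]):
        span = (d1 - d0).days
        if span > 0:  # skip same-day observations to avoid dividing by zero
            rates.append((days_since_launch(d1), (id1 - id0) / span))
    return rates


for day, rate in average_daily_rates(observations):
    print(f"day {day}: ~{rate:.0f} posts per day on average")
```

Measuring both services in days since their own launch is what lets Blogger in 1999 and Twitter in 2006 sit on the same x-axis.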
Regarding the re-importing of the early posts, that happened right around mid-December 1999…the post ID numbers jumped from ~13,000 to ~25,000 in one day. In addition to the early posts, I imagine some other posts were imported from various Pyra weblogs that weren’t published with Blogger at the time. I adjusted the numbers subsequent to this discontinuity and the resulting numbers are not precise but are within 100-200 of the actual values, an error of less than 1% at that point and becoming significantly smaller as the number of posts grows large. The last usable Blogger post ID is from March 7, 2002. After that, the database numbering scheme changed and I was unable to correct for it. A few months later, Blogger switched to a post numbering system that wasn’t strictly sequential.
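One simple way to adjust for a discontinuity like the mid-December 1999 import is to subtract the excess jump from every later ID. The sketch below does that with made-up daily readings that only mimic the ~13,000 to ~25,000 jump. It assumes the day of the jump saw about as many genuine posts as the day before, and it is an illustration, not a record of the adjustment actually applied to the data.

```python
def remove_jump(ids, jump_index, expected_step):
    """Shift every ID from jump_index onward so the jump shrinks to expected_step."""
    offset = ids[jump_index] - ids[jump_index - 1] - expected_step
    return [v - offset if i >= jump_index else v for i, v in enumerate(ids)]


# HYPOTHETICAL daily ID readings around the import; not real data.
raw_ids = [12800, 12900, 13000, 25000, 25110, 25220]

# Assumption: genuine growth on the jump day matched the previous day's step.
step = raw_ids[2] - raw_ids[1]

corrected = remove_jump(raw_ids, jump_index=3, expected_step=step)
print(corrected)
```

Because the guess about the jump day's genuine posts is baked into every later ID, any error in it stays constant in absolute terms and shrinks as a percentage as the counts grow, which matches the error behavior described above.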
[2] “Crazy insane viral growth” is a very technical epidemiological term. I don’t expect you to understand its precise meaning.