Judging by the grand total of two comments on last month’s Mystic Statistics Heuristics post, we’re guessing that there aren’t any remaining questions about what we do here at FeedBurner to calculate all those media metrics we provide to our publishers. Or, maybe the post was just too darned long and boring and you're all tired of Dick's jokey post titles. We'll assume the latter since we know there's a lot of continuing curiosity about how stats work, how we count, what doesn't count, what does count, etc.
This post is the first in a regular series, "What's Up With That?", from the Publisher Services Team to transparently describe how FeedBurner works and what we’ve learned about the wonderful world of distributed media measurement. If you have a topic you’d like to see us tackle, send us an e-mail at publishers@feedburner.com, or just post a question on your blog and let us know.
In the past couple of weeks, there's been a fair amount of discussion about subscriber counts on a few sites. In particular, publishers have been concerned about how FeedBurner reports subscriber counts from aggregators, and how FeedBurner deals with weird counts that pop up from third-party aggregators. In all of these cases, FeedBurner was (and is, and will be, and should be) in direct contact with the aggregators to resolve anomalies as soon as possible. We thought now would be a good time to dive into how these aggregators communicate their numbers and what we can do with that info.
Collecting the Data
Web-based aggregators (think My Yahoo!, Newsgator Online, Bloglines, Netvibes, PageFlakes, Rojo, etc.) have a simple mechanism for reporting how many of their users are subscribed to a feed: each time they check whether a FeedBurner-managed feed has new content, they pass along a subscriber count in the user agent string of the request, and we reflect that number in your Subscriber Report (under "Analyze" in your FeedBurner Dashboard). These services act as a proxy for their users (which is to say, they get the feed once on behalf of however many subscribers they have), so we never see feed requests from their individual end users. This is an efficient process that makes feed retrieval faster for everyone, but it also means that it is incumbent on these web-based services to accurately report subscriber numbers.
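To make the mechanism concrete, here is a minimal sketch of pulling a subscriber count out of a user agent string. The string format, the `ExampleAggregator` name, and the `parse_subscriber_count` helper are all assumptions made for illustration; each aggregator formats its user agent its own way, and this is not a description of FeedBurner's actual parser.

```python
import re
from typing import Optional

# Assumed pattern: a number followed by "subscriber" or "subscribers"
# somewhere in the user agent string. Real aggregators may differ.
_SUBSCRIBER_RE = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)


def parse_subscriber_count(user_agent: str) -> Optional[int]:
    """Return the subscriber count embedded in a user agent string, if any."""
    match = _SUBSCRIBER_RE.search(user_agent)
    return int(match.group(1)) if match else None


# Hypothetical user agent strings, for illustration only.
aggregator_ua = "ExampleAggregator/1.0 (+http://aggregator.example/; 450 subscribers)"
browser_ua = "Mozilla/5.0 (compatible)"

print(parse_subscriber_count(aggregator_ua))
print(parse_subscriber_count(browser_ua))
```

A request with no recognizable count returns `None` rather than zero, since "did not report" and "reported zero subscribers" are different situations.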
Now, a quick note about what these numbers do and don’t mean: first off, these numbers do not reflect unique subscribers. In other words, someone who really, really loves your content and who wants to read it in My Yahoo!, Bloglines and PageFlakes will be counted 3 times. Secondly, not all web-based aggregators count the same way. Specifically, when My Yahoo! says you have 450 subscribers, that’s actually a 30-day rolling average of how many active subscribers you have. That means those 450 individuals have viewed a My Yahoo! page containing your feed within the last 30 days. Many other aggregators, Bloglines for example, report a cumulative count of all registered users who have ever subscribed to your feed. If you created a Bloglines account four years ago, subscribed to your own feed and never went back, you’re still going to show up in the subscriber number that Bloglines reports to FeedBurner. Finally, some aggregators don’t yet report their subscriber data, which means there is no way to independently determine how many subscribers use that service.
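The double counting is easy to see with a toy example. The sketch below uses made-up reader names and subscriptions; in practice nobody outside each aggregator can see who its individual subscribers are, which is exactly why the per-aggregator numbers cannot be de-duplicated.

```python
from typing import Dict, Set

# Hypothetical data: which readers subscribe through which aggregator.
subscriptions: Dict[str, Set[str]] = {
    "My Yahoo!": {"alice", "bob"},
    "Bloglines": {"alice", "carol"},
    "PageFlakes": {"alice"},
}


def reported_total(subs: Dict[str, Set[str]]) -> int:
    """Sum of per-aggregator counts, which is all a feed service can see."""
    return sum(len(users) for users in subs.values())


def unique_total(subs: Dict[str, Set[str]]) -> int:
    """Number of distinct people, which is not observable from the counts."""
    return len(set().union(*subs.values()))


print(reported_total(subscriptions))  # alice is counted three times here
print(unique_total(subscriptions))
```

In this example the reported total is 5 while only 3 distinct people are reading the feed.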
When new web-based aggregators get close to launching, they often reach out to FeedBurner to determine the "best" way to reflect their subscriber numbers. Ideally, we'd like the number they report to reflect "active" users, but that's a more complex issue for the aggregator, and it's great if they get started by reporting registered users that have subscribed to the feed at some point. Some aggregators create default subscriptions for certain feeds in certain categories, and thus some publishers may find they have an inordinate number of subscribers at a certain aggregator. This is a common practice: it gives new users a starter kit of feeds when they sign up.
Interpreting the Data
When it comes to actually understanding how many people are reading your content, we have answers. In addition to the big-picture subscriber number, we also measure other feed-related activity like clickthroughs and item views. Item views in particular are helpful in determining whether those 37 million readers as reported to you by some service are legit: for publishers who want to track readership of actual feed items, we measure the number of times each feed item is read. If you only ever see 3 item views on any post in your feed from that aggregator, it’s a safe bet that the aggregator in question is over-reporting its numbers. Another way that we've addressed this issue of feed activity is with our "Reach" metric: the total number of people who have taken action (viewed or clicked) on the content in your feed. Reach is calculated as the number of unique IP addresses that viewed or clicked on content in your feed in a given day. (Note: Item Views and Reach are included with our TotalStats service.)
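As a rough illustration of the Reach definition above, the sketch below counts unique IP addresses per day across view and click events. The event tuple layout, the action names, and the `daily_reach` function are assumptions for the example, not FeedBurner's implementation.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Set, Tuple

# Assumed event shape: (day, ip_address, action).
Event = Tuple[date, str, str]


def daily_reach(events: Iterable[Event]) -> Dict[date, int]:
    """Count unique IP addresses that viewed or clicked, per day."""
    ips_by_day: Dict[date, Set[str]] = defaultdict(set)
    for day, ip, action in events:
        if action in ("view", "click"):
            ips_by_day[day].add(ip)
    return {day: len(ips) for day, ips in ips_by_day.items()}


# Hypothetical events using documentation-reserved IP addresses.
events = [
    (date(2007, 1, 15), "192.0.2.1", "view"),
    (date(2007, 1, 15), "192.0.2.1", "click"),   # same IP, counted once
    (date(2007, 1, 15), "192.0.2.2", "view"),
    (date(2007, 1, 15), "192.0.2.3", "fetch"),   # not a view or click
    (date(2007, 1, 16), "192.0.2.1", "view"),
]

print(daily_reach(events))
```

Because the count is keyed on IP address, several readers behind one shared address count once, and one reader on two networks counts twice, so Reach is best read as an estimate of engaged readers rather than an exact headcount.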
Break Glass In Case of Emergency
We manage well over 500,000 feeds, and see hundreds of millions of feed requests a day. Over the years, we’ve built up a number of filters to alert us when something seems out of whack. (Yes, in the feed management world, "out of whack" is a term of art.) One example is today's post over at the Publisher Tips blog, where a couple of publishers discovered an anomaly with Rojo's reporting on some feeds. We spoke to the great folks at Rojo about it, they know about the issue, and hopefully everything will be resolved in short order. Nevertheless, things can slip by, and the best and fastest way for us to come up to speed on an issue is for publishers to let us know. If you see a situation where the numbers don't make any sense, the best course of action is to post in our support forums (frequent forum users will note that we respond there consistently and rapidly); alternatively, you can post something on your own blog and send us a link. We'll dive right in and get you an answer right away.
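For a sense of what an "out of whack" filter might look like, here is one very simple possibility: flag a newly reported subscriber count when it differs from the recent median by more than a chosen factor. This is purely a hypothetical sketch; the `is_anomalous` function, the median baseline, and the factor of 10 are assumptions for illustration and do not describe the filters FeedBurner actually runs.

```python
from statistics import median
from typing import Sequence


def is_anomalous(history: Sequence[int], latest: int, factor: float = 10.0) -> bool:
    """Flag `latest` if it is more than `factor` times above or below
    the median of recent reported counts. The threshold is arbitrary."""
    if not history:
        return False  # nothing to compare against yet
    baseline = max(median(history), 1)  # avoid a zero baseline
    return latest > factor * baseline or latest * factor < baseline


# Hypothetical recent counts from one aggregator for one feed.
recent_counts = [450, 460, 455]

print(is_anomalous(recent_counts, 470))         # ordinary day-to-day change
print(is_anomalous(recent_counts, 37_000_000))  # sudden implausible jump
```

A check like this can only say that a number looks suspicious; deciding whether the aggregator mis-reported still takes a conversation with the aggregator.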