blog comments search engine

Many great insights show up in the comment threads of blogs, especially on A-list blogs [FYI, I’m B-list]. I’m surprised that, with web technologies multiplying on so many fronts, a search engine hasn’t been built sooner to crawl blog comments. Plus, a growing number of blogging tools now offer RSS feeds for comments, e.g. WordPress, TypePad, Blogger. Those are the “big 3”, imho. I’d like to think that it’d be fairly easy to aggregate those RSS feeds and slap a search engine on them. Just as a new blog post pings a server, so could a comment RSS feed.
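
To make the "aggregate the feeds and slap a search engine on them" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample feed, the `CommentIndex` class, and its method names are all made up for this example, the feed is assumed to be plain RSS 2.0 with `guid`, `link`, and `description` elements, and nothing here reflects how any real comment search engine works.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample of a comments feed (RSS 2.0); real feeds vary by blogging tool.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Comments for Example Blog</title>
    <item>
      <guid>c1</guid>
      <link>http://example.com/post-1#comment-1</link>
      <description>Great post about search engines</description>
    </item>
    <item>
      <guid>c2</guid>
      <link>http://example.com/post-1#comment-2</link>
      <description>Comment feeds are limited in size</description>
    </item>
  </channel>
</rss>"""


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class CommentIndex:
    """Tiny in-memory inverted index over comment text (illustrative only)."""

    def __init__(self):
        self.comments = {}                # guid -> (link, text)
        self.postings = defaultdict(set)  # word -> set of guids

    def add_feed(self, feed_xml):
        """Parse one RSS feed and index comments not seen before.

        Returns the number of newly indexed comments.
        """
        root = ET.fromstring(feed_xml)
        added = 0
        for item in root.iter("item"):
            guid = item.findtext("guid") or item.findtext("link")
            if guid is None or guid in self.comments:
                continue  # skip items without an id and ones already indexed
            text = item.findtext("description") or ""
            link = item.findtext("link") or guid
            self.comments[guid] = (link, text)
            for word in tokenize(text):
                self.postings[word].add(guid)
            added += 1
        return added

    def search(self, query):
        """Return sorted links of comments containing every word in the query."""
        words = tokenize(query)
        if not words:
            return []
        hits = set.intersection(*[self.postings.get(w, set()) for w in words])
        return sorted(self.comments[g][0] for g in hits)
```

Feeding the same feed in twice adds nothing the second time, because comments are de-duplicated by `guid`. That is what would let an aggregator re-read a feed whenever it is pinged. A real service would also need to fetch feeds over the network, persist the index, and rank results, none of which is attempted here.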

Now there is one called BackType — a search engine that searches blog comments (finally!). Surprisingly, this blog post explains BackType better than its about page:

BackType is a website that lets people find, follow and share comments from across the web. Comments play an important role in social media; there are millions of comments written every single day. BackType’s technology crawls and indexes millions of comments so you can search them by topic or follow those written by the people you care about.

I’m not sure if it uses the algorithm I described above, or how extensively BackType searches the commentsphere (aka commentosphere). It looks like BackType does much more than coComment, Disqus, or IntenseDebate, 3 great commenting add-ons that have been around a while and index comments for blogs and commenters that use their service. My guess is that a majority of commenters don’t. [disclosure: I’ve used coComment before, and currently use Disqus here.]

MicroPersuasion‘s Steve Rubel had predicted a comments search engine would launch in 2006. I don’t know of one. [via lackofmotivation]

MicroPersuasion also noted these blog comment stats in “Nielsen BuzzMetrics Quantifies the State of the Commentsphere”:

… They studied 500 randomly selected blog posts and here’s what they found:

  • 80% of the sample posts allowed comments, but only 28% had them
  • The number of comments in the entire blogosphere is comparable to the number of posts in active, non-spam blogs. Therefore comments constitute up to 30% (150,000) of the daily volume of blog posts (700,000) …
  • Less than 2% of all blog comments are syndicated in feeds
  • The textual size of the commentsphere is 10 to 20% of the blogosphere

Even though it’s 2 years late, I’m just ecstatic that you can search blog comments now!!

8 Responses

  1. Mike Montano says:

    Thanks so much for the mention. We're ecstatic as well and we're working hard to make the service better.

    Perhaps when we find some time we should publish some “updated” statistics about the commentosphere.

  2. clemoine says:


    The issue with comment RSS feeds is that they are limited in size (number of items), and if you have a blog with lots of comment activity, an index server will need to scan the feed very frequently to get all the comments. Considering the number of sites with commenting, this will end up being an issue.

    Please note that coComment does not index only comments from our users: we also get all the comments from the conversations that our users track + all comments from integrated blogs.

  3. djchuang says:

    @Mike – Thanks for the comment. I do hope to see this genre of comments search engine grow rapidly and soon.

    @clemoine Thanks for the clarification about coComment. My bad. What I couldn't figure out is how I can search the comments that are indexed for keywords or phrases — it looks like search by tags is as close as it gets.

    Didn't know Comments RSS feeds were limited (and I think Blog Posts RSS feeds are probably limited too). That being the case, would it be as simple as a “ping” from the blog comment feed as needed, e.g. if the limit is 50 comments, a ping every 50 comments?

  4. clemoine says:

    True: we are now offering search on title, urls, users, tags…… Search on comment text is coming very soon and you will be able to search in the millions of conversations we have.

    The issue with the ping is that it is treated asynchronously: this means that if you get some comments between the ping and when the server actually gets the feed, some might be lost. That said, it will probably work for most blogs, but it will not be 100% reliable.
    With blog posts it is not too much of an issue, as the update frequency is lower (except if you post very very frequently 😉 )


  5. human3rror says:

    mentioned the “pure” breeds in response to your q. good stuff, thanks for the pointer.

  6. shad sluiter says:

    Blog comments are what give spice to the information published. What a nice reality check to find that some people have alternative opinions. Sometimes I finish an article at cnn in total agreement with the author, but then run right into a buzz saw of opposition who bring up good arguments against the author.