Private Information

A few days ago, Godin made the interesting points that:

On the internet, some people are more influential than others
It would be valuable to know how influential someone is
Quantitative metrics for influence could be developed and computed

So far, so good. Where I think he goes astray is in his claim that “the web knows” who the influential people are. This, it seems to me, is a deeply misleading metaphor.

PageRank

There are some things, I think, which it is fair to say “the web knows”. I’d say that “the web knows” which pages are authoritative, in the sense of the PageRank algorithm. This statement seems justifiable because the information required to produce PageRank is:

Distributed, and
Public

Therefore, it cannot practicably be denied to anyone who wants to collect all (or at least the vast majority) of it.
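To make that concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production system; the function name, the damping value of 0.85, the iteration count, and the example domains are all assumptions chosen for illustration. The point is that the only input is a link graph, which anyone with a crawler can assemble from public pages.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its damped rank equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: the kind of data a public crawl would yield.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(links)
most_authoritative = max(ranks, key=ranks.get)
```

In this toy graph, c.example collects the most inbound links and ends up with the highest rank. Nothing in the computation depends on data held privately by a single firm.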

Twitter (For Example)

On the other hand, “the web” does not know how many Twitter followers you have. Even you don’t know. Twitter knows. And, for now, it will tell you, and anyone else who might want to know. However, it would be quite easy for Twitter to refuse this data to anyone it chose – and, indeed, it would likely refuse this data to anyone requesting it in bulk today.

Privacy

When it comes to other forms of influence, you run into another problem: that old devil privacy. For instance, it would be nice to know who reads your blog, and who reads their blogs, and so on. Unfortunately, this information isn’t easy to come by, due to the culture of the web and the nature of HTTP. About the only way to determine it would be a fairly large, fairly invasive cookie-tracking scheme, which:

Would annoy people
Would, again, concentrate the relevant data in the hands of a few firms

(Actually, maybe you could kinda-sorta cobble something like this together, based on comments. Assuming that comments are enabled, everyone comments, all commenters recognizably identify themselves and their blogs, et-bloody-cetera.)

The Web Doesn’t Know

In short, “the web” doesn’t know who the influential people are. That information is mostly held by relatively few firms. (E.g. FeedBurner/Google, FeedBlitz, Twitter, Facebook, DoubleClick/Google; this is nothing like exhaustive, of course.) These firms might share their pieces of this information on a small scale, but can and probably would deny it to anyone trying to access it on the scale required to build an influence graph.

Influence metrics are a good and viable idea, but it’s a misconception that the data required to build them are just lying out there on the web, waiting for someone to pick them up. Some of them are locked away in corporate vaults and shared only a piece at a time; others can be derived only through ethically questionable means.