Unlikely Voter

Conservative views on polls, science, technology, and policy

Using Twitter to replace genuine random sampling?

Researchers led by Noah Smith at Carnegie Mellon tried an experiment: could careful monitoring of Twitter predict the results of traditional polling, whose core feature is a genuine random sample of people?

They think they got close, but their results don’t sound close enough to me to be useful. Apparently their Twitter measure had a 79% correlation with the Gallup tracking poll of Barack Obama’s Presidential approval. 79% sounds OK, but really it’s not.

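Polling is about more than seeing what direction the numbers move. Magnitudes matter, and a rough correlation isn’t going to give magnitudes. Correlation only tells you that two series tend to rise and fall together; it says nothing about how far apart their actual levels are.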
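To make that concrete, here is a minimal sketch in Python. The approval numbers are invented for illustration and are not Gallup's, and the three "signals" are hypothetical stand-ins for a Twitter measure, not anything from the Carnegie Mellon work. The point is only that a series can correlate perfectly with a poll while being badly wrong about the level or the size of the swings.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def mean_abs_error(xs, ys):
    """Average absolute gap between the two series, in percentage points."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical daily approval percentages (made up for illustration).
approval = [52.0, 51.0, 53.0, 50.0, 49.0, 51.0, 54.0]

# Hypothetical signals that move in lockstep with approval.
exact = list(approval)                                    # right level, right swings
shifted = [a - 20.0 for a in approval]                    # 20 points too low every day
stretched = [3.0 * (a - 50.0) + 50.0 for a in approval]   # swings three times too big

for name, signal in [("exact", exact), ("shifted", shifted), ("stretched", stretched)]:
    r = pearson(approval, signal)
    err = mean_abs_error(approval, signal)
    print(f"{name:10s} correlation={r:.3f}  mean abs error={err:.2f} points")
```

All three signals have a correlation of essentially 1 with the made-up poll, yet only the first reports the right numbers. A correlation below 1, like the 79% above, pins the magnitudes down even less.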

Further, as they point out, the actual uses of this are extremely limited. It turns out that during the 2008 Presidential election, mentions of both “Obama” and “McCain” on Twitter were correlated with Obama’s polling popularity. So it seems they have no way to judge McCain’s popularity from Twitter.

It seems to me they’re not measuring opinions, but just excitement. Obama supporters were excited when he was doing well, and depressed when he was doing poorly. On any issue where youth and Internet use don’t distinguish the two sides, as they did in the 2008 Presidential election, this analysis will be absolutely useless.
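A rough simulation shows why who ends up in the sample matters more than how many of them there are. Everything here is an assumption for illustration: the population, the split into "heavy Twitter users" and everyone else, and the support rates are all invented, not taken from any poll or from the study.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 people, each stored as
# (is_heavy_twitter_user, supports_candidate). All counts are invented.
population = (
    [(True, True)] * 14_000 + [(True, False)] * 6_000        # heavy users: 70% support
    + [(False, True)] * 36_000 + [(False, False)] * 44_000   # everyone else: 45% support
)

def support_share(people):
    """Fraction of the given people who support the candidate."""
    return sum(1 for _, supports in people if supports) / len(people)

true_support = support_share(population)  # 0.5 by construction

# A small genuine random sample of the whole population.
random_sample = rng.sample(population, 1_000)

# A sample ten times larger, but drawn only from heavy Twitter users.
twitter_users = [person for person in population if person[0]]
twitter_sample = rng.sample(twitter_users, 10_000)

random_estimate = support_share(random_sample)
twitter_estimate = support_share(twitter_sample)

print(f"true support:            {true_support:.3f}")
print(f"random sample (n=1,000): {random_estimate:.3f}")
print(f"Twitter-only (n=10,000): {twitter_estimate:.3f}")
```

In this setup the small random sample lands near the true 50%, while the much larger Twitter-only sample sits near 70%, because it is measuring the Twitter users rather than the population.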

You just can’t beat a random sample.

