All of the firms I have built or worked in have been involved in data in some way. Based on this experience, what really galls me is when people use the wrong data or, even worse, the right data incorrectly.
Thursday’s election was the clearest example of using the right data incorrectly since 1992, when the Conservatives won the General Election despite pre-election polls pointing the other way, a result subsequently explained as the "shy" vote in action.
I started Thursday evening at an industry election event before darting around Westminster to a couple of public affairs parties. I was truly astonished by what I saw and heard at the election event: a parade of senior consultants from a mixture of research and social listening firms corroborating each other’s findings and patting each other on the back for calling a hung parliament. Here’s why they got it so wrong.
Traditional polling is out of date
It’s easy to spot the limitations of traditional qualitative and quantitative research. Survey-based approaches require individuals to translate their own views into a number so that responses can be processed en masse. One person’s "8" on a scale of 1 to 10 can quite easily be another’s "6", but at least we have a lot of data to look at!
Focus groups, whilst useful for testing new innovations, are fundamentally unscalable and therefore represent only a small population; their findings must also account for group social dynamics, such as conversation leaders and conversation followers.
We find people often say the opposite of how they actually behave. US statistician Nate Silver noted that in elections people think strategically but vote tactically on the day, based on factors such as a candidate’s perceived chance of winning. So a voter might be Green at heart but put a cross next to the Labour candidate because he or she has the best chance in their constituency.
Social sentiment is snake oil
I’ve been saying this for a while, and even the people selling sentiment have quietened their message: automated sentiment analysis, usually based on social media conversations, simply does not work. It is a statistical flip of a coin.
The technology looks at individual words and decides whether a message is positive or negative using predetermined algorithms that analyse the words around them. While this sounds great in theory, it doesn’t take into account the nuances of language, particularly English. Take a word like ‘break’. How many different meanings do you think it can have? I guessed 12 at the first attempt. The answer is 76! That’s 76 different ways the sentiment can be interpreted, and that’s before we introduce a fascinating aspect of the English language we all love to use: sarcasm.
Detailed research from Zurich University recently bore this out: you may as well flip a coin as use sentiment analysis technology. This algorithmic approach to analysing sentiment is broken, yet it was relied upon heavily in predicting the result of the election.
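To see the problem concretely, here is a minimal sketch of the word-counting approach described above. The word lists and messages are entirely hypothetical; real tools use much larger lexicons, but the word-level principle is the same:

```python
# Hypothetical positive/negative word lists for illustration only.
POSITIVE = {"great", "brilliant", "love", "win"}
NEGATIVE = {"terrible", "delayed", "lose", "broken"}

def naive_sentiment(message: str) -> int:
    """Score a message by counting positive words minus negative words."""
    words = message.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# A genuinely positive message scores positive...
print(naive_sentiment("I love this, a brilliant result"))  # 2
# ...but so does a sarcastic one, because the words alone look cheerful.
print(naive_sentiment("Oh great, another brilliant night for the pollsters"))  # 2
```

The sarcastic message is plainly negative to a human reader, yet the word-level score is identical to the sincere one, which is exactly why this approach amounts to a coin flip on real conversations.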
The wrong metrics
People are saying that social media is a left-wing echo chamber and that’s why it was the wrong channel to analyse. The success of UKIP, and even the far-right Britain First, on social media destroys that claim. The problem isn’t the channel; it’s the way it’s being analysed. Measuring success by likes, retweets, and shares is simply using vanity metrics for vanity’s sake.
An example of this is the high number of shares for far-right posts - we simply love to share bad news! A deeper look at what people actually said about those posts shows the reaction is generally not favourable to the party concerned.
So how do we get it right next time?
I think we’re on the cusp of a consumer research revolution, and the election could prove to be the tipping point. I have no doubt that if we adopt more advanced research methods we can more accurately predict the outcome of the next election.
What if we could observe what people are saying without asking questions, and on a massive scale? Social media is a useful source for these types of conversation, as are online forums, but we need to go beyond simple count-based metrics such as likes, retweets, and shares. We need to look at what people say rather than just that they’re saying something. And we need to forget about sentiment as it stands now.
Certainly in politics there is an opportunity to model conversations for behavioural outcomes. Given a large enough volume of language, it is possible to model most characteristics and outcomes. We’re currently researching gender in brand language, for example, but it’s entirely feasible and appropriate to model for "propensity to vote Conservative."
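As a sketch of what such modelling could look like, here is a toy Naive Bayes-style classifier over bags of words. The training phrases, labels, and test messages are entirely invented for illustration; a real model would need large volumes of genuine conversation and far more careful feature engineering:

```python
import math
from collections import Counter

# Entirely hypothetical training phrases and labels, for illustration only.
TRAINING = [
    ("strong economy and lower taxes", "conservative"),
    ("business growth and lower taxes", "conservative"),
    ("protect the nhs and public services", "other"),
    ("invest in the nhs and schools", "other"),
]

def train(data):
    """Count word frequencies and total word counts per label."""
    counts = {label: Counter() for _, label in data}
    totals = Counter()
    for text, label in data:
        words = text.split()
        counts[label].update(words)
        totals[label] += len(words)
    return counts, totals

def predict(text, counts, totals):
    """Return the label with the highest add-one-smoothed log score."""
    vocab = len({w for c in counts.values() for w in c})
    scores = {}
    for label in counts:
        scores[label] = sum(
            math.log((counts[label][w] + 1) / (totals[label] + vocab))
            for w in text.split()
        )
    return max(scores, key=scores.get)

counts, totals = train(TRAINING)
print(predict("lower taxes please", counts, totals))  # conservative
print(predict("protect the nhs", counts, totals))     # other
```

The point is not this particular algorithm but the shift it represents: scoring what people actually say against a behavioural outcome, rather than counting likes or guessing at sentiment.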