All the big data crunching that brands love turns out to be irrelevant in China
A view from Russell Davies

My favourite conference is one I never go to. It’s called Strata and it’s about, basically, the intersection of business and "big data".

I like it because it seems to be the moment when a lot of very clever data people realise they need to communicate something to a more general audience and all sorts of thought-provoking and novel ideas pop out around the internet. Videos, articles, blog posts – it’s always worth keeping an eye on. The latest incarnation will be over by the time you read this – have a search for the wake it leaves.

The most intriguing thing recently was an article by a chap called Robert Munro, a "computational linguist" from a company called Idibon. His little article on the Strata website has enough "Well, I never knew that" moments for a dozen meetings and hundreds of pub conversations. For instance, according to Idibon’s data, the content of the world’s text messages, in any three-month period, represents more words than in every book ever published. Digital technologies pour out a lot of words. A lot. But they’re still dwarfed by face-to-face speech – based on word count, digital stuff only accounts for 7 per cent of the world’s communications. Well, unless you include spam. There’s more spam in the world than there is spoken English.

You see? Fascinating, isn’t it? Have another: "If the Facebook ‘like’ was considered a one-word language, it would be in the top 5 per cent most widely spoken languages (although still outside the top 200)." Or this: "Across all the world’s communications, five in every 10,000 words are directed at machines, not people: mainly search engines."

The best bit will be of interest to global brand people – you know all that sentiment analysis and big data crunching and searching you can do with English language conversations? You can’t do that at all, apparently, with Mandarin Chinese – or about a quarter of the world’s data. The written system is just too complex to be parsed by machine. It seems there’s still some room for people, in China at least.

Russell Davies is a creative director at Government Digital Services