Using AI to turn messy, unstructured data into insights

These days, the most powerful insights surface through a dual approach: using advanced AI tools on unstructured data, in combination with consumer questioning.
11 June 2020
Kyle Findlay

Senior Data Science Director, Innovation


For much of our industry’s history, market researchers have used intentional questioning to generate structured data from which to derive insights. But as technology has become ubiquitous, we have gained access to new unstructured data sources and the resources to turn this messy data into insights.

Today we can leverage Twitter posts, Instagram images, customer feedback verbatims, YouTube videos, Alexa logs, browser search histories and much more (all with permission, of course). And, as COVID-19 hampers some of our traditional data collection channels, other sources have gained an urgent impetus.

These other sources present us with novel challenges, though. So how do we gain insight from this boon of unstructured data? This is where data science and AI enter the picture. Working with our various teams around the world to figure out how to attack these problems in their myriad forms has been one of the most exciting and novel journeys of my professional career.

Let’s explore what it means to leverage AI to make sense of unstructured data. Extracting insights from unstructured data can be broken down into two parts:

  1. What do we know about in advance that we want to quantify in the data?
  2. How do we surface the things that we don’t know about in advance?

Measuring what we already know about

Market research has concerned itself with quantifying known variables for most of its existence. Most surveys are an attempt to quantify issues we already know about. The quintessence of this paradigm is the attribute list – a pre-defined list of themes that we ask respondents about in order to quantify those themes’ impact in the market.

In the preserve of data science and AI, an attribute list is roughly akin to a taxonomy. These lists are manifestations of our insight into how a category works. It takes an expert – someone with deep knowledge of a category – to put together a list that is comprehensive and useful. Much of our extended teams’ effort has gone into making explicit the implicit category knowledge that Kantar’s experts have captured in such lists, so that our AI can leverage that knowledge.

While we can’t scrape the entire internet like Google or access the world’s social graph like Facebook, we can collate a massive data asset in the form of brand and category taxonomies for our AI to leverage. Thus, we have built the Kantar Brain Taxonomy Tool (referred to colloquially as the “Kantar Brain”), a centralised database that pulls together all our existing taxonomies from a variety of divisions, products, tools and data stores. When combined with the attendant AI tools that leverage this data, we are able to bring to bear the knowledge of an unprecedented global network of brand experts to quickly surface insights around known themes in social media posts, voice of the customer verbatims, chatbots, and wherever else we deal with unstructured data.

For example, I might be conducting a study on ice-cream in the USA using a customer’s database of several million customer feedback verbatims. To kickstart the data analysis, my tools can access the Kantar Brain to see what other ice-cream studies have been conducted. Based on pre-existing ice-cream category knowledge from India, South Africa, Korea and Germany, my tools present me with an initial set of structured insights. These will necessarily miss some of the nuances of my specific market, so I tweak, adapt and expand on the initial outputs, and commit those changes back to the Kantar Brain. In this way, others can benefit from an expanded understanding of what ice-cream means around the world. All of this is done with the aid of AI techniques that help us generalise concepts, draw connections that a human might miss and generally streamline our processes.
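As a rough illustration of the idea of quantifying known themes, the sketch below tags verbatims against a small taxonomy using plain keyword matching. The taxonomy contents, the function names and the sample verbatims are all hypothetical, and keyword matching is a deliberately simple stand-in; it is not how the Kantar Brain or its attendant AI tools actually work.

```python
import re
from collections import Counter

# Hypothetical taxonomy for illustration only: theme -> keywords.
ICE_CREAM_TAXONOMY = {
    "flavour": {"vanilla", "chocolate", "strawberry", "flavour", "flavor"},
    "texture": {"creamy", "icy", "smooth"},
    "price": {"price", "expensive", "cheap"},
}


def tokenize(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def tag_verbatim(text, taxonomy):
    """Return the sorted list of themes whose keywords appear in the text."""
    tokens = tokenize(text)
    return sorted(theme for theme, keywords in taxonomy.items() if tokens & keywords)


def count_themes(verbatims, taxonomy):
    """Count how many verbatims mention each known theme."""
    counts = Counter()
    for verbatim in verbatims:
        counts.update(tag_verbatim(verbatim, taxonomy))
    return counts


# Made-up sample verbatims.
verbatims = [
    "Love the creamy vanilla, but it is too expensive.",
    "Chocolate was icy and not smooth at all.",
    "Delivery took forever.",
]
theme_counts = count_themes(verbatims, ICE_CREAM_TAXONOMY)
```

The third verbatim matches no theme at all, which is exactly the kind of nuance an analyst would then add to the taxonomy and commit back for others to reuse.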

Surfacing what we don’t know

The flip side of the unstructured data coin is finding the things that we didn’t know about in advance but which appear in our data; the unanticipated themes that weren’t on our radar ahead of time (like COVID-19, for example). This requires us to leverage our data assets and AI capabilities in a slightly different way.

Thankfully, our tools stand on the shoulders of giants. Everyone playing in the data science field owes a huge debt to companies such as Google, Facebook, OpenAI and others who use their resources to encode huge swathes of human knowledge into models that they release publicly. By leveraging these pre-trained public models, our AI tools start with much-needed context that we couldn’t give them on our own. It’s then up to us to further ‘fine-tune’ the models to our clients’ specific contexts and the market research paradigm in general.

All of this means that we empower our colleagues with tools that are uniquely attuned to how brands work. This helps them to, for example, quickly surface the themes of interest in their customer experience verbatims, identify consumer segments based on Instagram photos, see which themes on social media drive brand equity, algorithmically tie disparate data sources together, create smarter chatbots that automatically infer what a respondent is talking about, and much more.
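To make the notion of surfacing unanticipated themes a little more concrete, here is a minimal sketch that lists terms recurring across verbatims but absent from the known taxonomy. It uses simple document frequency rather than the pre-trained, fine-tuned models described above, and every name, word list and sample verbatim in it is an assumption made for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only.
STOPWORDS = {"the", "a", "is", "and", "of", "to", "was", "since", "during", "up"}
KNOWN_KEYWORDS = {"vanilla", "chocolate", "creamy", "price"}


def uncovered_terms(verbatims, known_keywords, stopwords, min_docs=2):
    """Return (term, document_count) pairs for terms that are not in the
    taxonomy or stopword list and appear in at least `min_docs` verbatims.
    Results are sorted by descending count, then alphabetically."""
    doc_freq = Counter()
    for text in verbatims:
        tokens = set(re.findall(r"[a-z0-9'-]+", text.lower()))
        doc_freq.update(tokens - known_keywords - stopwords)
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [(term, n) for term, n in ranked if n >= min_docs]


# Made-up sample verbatims.
verbatims = [
    "Creamy vanilla, ordered during lockdown.",
    "Lockdown delivery was slow.",
    "Chocolate price went up since lockdown.",
    "Slow delivery again.",
]
candidates = uncovered_terms(verbatims, KNOWN_KEYWORDS, STOPWORDS)
```

Here "lockdown" rises to the top of the candidate list even though nobody put it in the taxonomy. In practice a model with broader language context would be needed to group related wording and separate real themes from noise.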

Market researchers go where the data is

Data trends swell and crest. One thing that has always remained constant, though, is our need to translate between the language of the consumer and the paradigm of business. Kantar’s investment in data science and pursuit of AI-infused capabilities allow us to answer fundamental brand questions in an ever-more holistic manner by being data agnostic and using the best data available to us.

These investments inform all aspects of market research. Sometimes questions can be answered without ever asking a question, through the clever use of existing data and AI techniques. Often, though, specific questions need to be asked (but even here, AI is helping us evolve by, for example, allowing chatbots to have ever-more naturalistic conversations with respondents). The most powerful insights come from a melding of both approaches – automatic insight mining and intentional questioning.

To realise this future, we have worked hard to encode the deep knowledge and expertise of our organisation into our AI capabilities so that our machines can empower our people and create valuable insights for our clients in a rapidly changing world. In the process, putting the necessary skills, infrastructure, data assets and capabilities in place has changed what it means to be a market research provider.
