According to data jointly released by Kantar and IBM*, there are already 33 million voice-enabled devices installed globally. In the past two years, the accuracy of voice recognition has improved from 90% to 99%. Smart speaker ownership in the United States rose from 5% in October 2016 to 12% in July 2017.
Looking forward, by 2020, 30% of web browsing will be done without a screen; by 2021, there will be 1.8 billion consumers using a voice-enabled device.
Voice won’t replace keyboards and screens, but it’s as big a shift as the launch of the smartphone was back in 2007. It is the beginning of a revolution in the way people live – including the way they shop.
Is your brand ready for the rise of voice technology? Are you well prepared for the disruption it is going to bring to the world?
Kantar Insights Division Thailand CEO Arpapat Boonrad published a paper about the rise of voice technology and its implications for brands’ relationships with consumers. It won the best paper award at the ESOMAR APAC meeting in May. Recently, she shared her thoughts on this theme as a guest speaker on Kantar's Future Proof podcast.**
1. Voice technology can fit naturally into consumers’ daily lives
The main driver of the quick adoption of voice technology is that it fits naturally into people’s lives: it is seamless and very convenient. Unlike other new human-machine interaction technologies, such as virtual reality, it does not require consumers to wear any hardware, type a single letter or touch anything. They can carry on with their normal lives while speaking naturally to instruct computing systems (PCs, smartphones or cloud-powered smart speakers) to accomplish certain tasks.
Many of these technologies and devices were originally designed by Western companies, so many default settings are designed to serve Westerners. But as soon as their content is localized, usage among local consumers soars. For example, MG cars sold in Thailand were originally fitted with a voice technology system that recognized only English. Not confident in their spoken English, Thai consumers rarely used these features.
As soon as MG rolled out Thai-language voice recognition, more consumers became comfortable using the voice assistant to open the sunroof, start the GPS and turn on the air conditioning.
2. Consumers tend to develop an emotional connection with voice devices
There are two factors contributing to the unique emotional connection between voice devices and human users. On the one hand, it’s about learning about the human: these devices accumulate data about each of us as a person. They know what we like and they can provide the right response. It’s a device that seems to know you better than you know yourself.
On the other hand, because the interaction with voice devices is so fluid, seamless and natural, people don’t really feel like they are interacting with a machine. We have observed that a lot of consumers in APAC treat voice devices like a friend or companion: they turn to them when they are bored, heartbroken or feeling down, and the responses, which are sometimes witty and funny, make them feel better.
At an industry conference, British online supermarket Ocado shared that it couldn’t understand why so many people were searching for “thank you card” through its Amazon Alexa skill. It was only after a while that the company found out Alexa was misinterpreting people who were simply saying “thank you” to Alexa.
In fact, Amazon itself has taken the human-device connection to the next level: in May, Amazon upgraded its software so that Alexa would encourage children to say “please” and “thank you” when talking to it.
As a natural next step beyond this layer, many consumers will trust and rely on the information provided, or brands recommended, by the voice devices.
3. Some brands will be at a huge disadvantage because of the sound of their names
This could be an unforeseen but huge challenge for many brands. Historically, when designing a new brand (take Ford as an example), the first element has been the text:
Next, its logo:
Then the translations of its brand name in foreign markets, such as in China: 福特
Then the brand slogan, which is rethought and refreshed from time to time.
However, after investing hundreds of millions of dollars in their brands, many companies may soon realize that the sound of their brand name has become their biggest disadvantage in the world of voice technology. Some brand names can also be confused with words used in daily conversation, such as Apple vs. the fruit apple, Visa vs. a travel visa, and Movistar vs. movie star.
Kantar testing has already found that some alphanumeric brands, such as VO5, especially struggle to get themselves correctly recognized by the devices. Brands with these issues might need to work directly with the voice device and software manufacturers to find solutions.
4. Brands need to find their unique voice
In the world of voice technology, consumers’ interaction with brands will take place purely through sound, so brands need to start thinking NOW:
- Based on my brand’s purpose, positioning and personality, what should my voice be like?
- How can I be sure my brand voice is unique and reflects what my brand stands for?
- How do I know whether my chosen brand voice resonates with my target consumers? Do I need to create voices of different genders to fit different occasions, moods and tones? Nokki’s paper has already confirmed this need, at least for Asian consumers.
- For global brands, do they need to create different voices for different countries/markets? How can they create a series of localized brand voices that are easily recognized as coming from the same brand?
5. Voice technology shuts out competition; only brands with strong emotional connections with consumers can win
If we revisit how the shopping list interface has evolved over the course of digital development, we have to admit that consumers are now given fewer and fewer options, and that it is getting increasingly difficult to compare candidate products and services. In the PC era, we could open multiple browser windows and switch easily between product lists from various websites. In the mobile era, we basically compare the products we find in one or two apps: each mobile screen displays far fewer products, the list of results is usually endless, and we cannot easily compare one product against another within an app. It already takes a lot of time to do research within one app, let alone compare across apps.
In the voice assistant era, how many candidate products can consumers listen to before their patience runs out? Even if they could listen to a lot, how much information can they remember?
A more likely scenario is that consumers will tell the voice assistant exactly what they want to buy and the assistant will place the order accordingly. Alternatively, consumers can describe in general terms what they are looking for and choose from the assistant’s recommendations. In both scenarios, a brand’s strength in the consumer’s mind will determine its chances of being chosen.
There are already many voice devices on the market. Even in China we have the Tmall Genie, Mi AI Speaker, Baidu Xiaodu AI Speaker and Suning Biu Speaker. As voice recognition technology improves, helped by ever-increasing computing power in hardware, and as tech giants continue to bankroll voice technology as a new Internet entry point, increasingly fluid voice technology will soon attract more Chinese consumers to try it out. Since voice is a totally different communication universe from the current visual-based Internet, brands have to start thinking about their strategy in this area. Otherwise, we might not hear from many of them in the future.