Even with its impressive growth and user adoption, is voice technology living up to its hype?
Every decade or so, a new wave of technological innovation bridges the gap to mass adoption and changes how we interact with the world. The World Wide Web was the first wave, mobile came second, and voice-activated technology is the latest innovation shaping our daily lives. If you’re like me, you start your day with “Alexa, turn on lights,” possibly followed by “Alexa, what’s the weather,” or “Alexa, give me the daily news.” My devices are just a few among the 100 million Alexa devices that have been sold, a number that doesn’t include other popular voice-activated devices made by Apple, Samsung and Google. The numbers are certainly impressive, but what does all the growth in voice technology really mean for consumers and the future of voice?
Although voice technology has been around for decades, it was not until the launch of voice assistants on smart speakers that it found a consumer-ready vehicle to drive adoption. I’ve been a part of this so-called “wave” for the last two years, first as a consumer and then as a developer and delivery manager for a voice-experience startup. To date, I’ve worked on more than 1,000 voice experiences for Amazon Alexa and Google Home. During that time, I’ve become acutely aware of the pros and cons of the different assistants, what makes a good voice experience, and how companies of various types should approach this nascent technology, which changes and improves almost daily.

Before diving any further, I want to distinguish between voice assistants, smart speakers and voice technology. A voice assistant is a subset of voice technology that uses voice recognition, speech synthesis, and natural language processing to make our lives easier by handling mundane, simple tasks: turning on lights, playing music, setting alarms or answering basic questions. A smart speaker is a hardware device that is powered by a voice assistant. You can think of a smart speaker as a vehicle for a voice assistant, similar to how an iPhone enables the use of Siri. Voice technology is an umbrella term that applies whenever users interact with systems via their voice. Voice assistants and smart speakers are a part of the voice technology revolution, and the two terms are often used interchangeably given that smart speakers are powered by voice assistants.
Now that we’re clear on terminology, let’s examine a few common use cases for voice assistants and their smart speaker counterparts.
Answering Questions
Before voice assistants came packaged in smart speakers, there were the voice assistants themselves: the Siris and Google Assistants that carried out simple commands such as answering questions, making phone calls, setting alarms and scheduling reminders. It’s that first task, answering questions, that may have the greatest long-term potential for voice assistants. As an example, at least 20% of all mobile search queries on Google are done by voice today and, according to ComScore, 50% of all searches will be done by voice by 2020.

There is certainly something to be gained for both consumers and companies in being able to ask “What time does Home Depot close?”, get an answer and then, going a step further, find out whether the store has the light bulb you need in stock. The numbers to date, and future forecasts, suggest that consumers are getting more comfortable with posing questions to voice-assistant devices. Another illustration of this trend is the United Kingdom’s recent creation of a voice application able to answer over 12,000 commonly asked questions about the government. If your audience is asking questions about your company, product, content or anything in between, you want to make sure the voice assistants are handling them gracefully, or you might be missing out on a valuable touchpoint with your customer. At the time of writing, Alexa was unable to tell me if any light bulbs were in stock at Home Depot and even gave a non-contextual response to my question. While my experience with Home Depot was less than impressive, the graph above clearly demonstrates the potential for improvement in voice search capabilities in just six months.
Listening to Audio Content
Similar to a radio station, podcast player, link on a website, or TuneIn, voice assistants are a new distribution channel for audio content. Although far from perfect, and definitely not a fit for all audio content, voice assistants are making an impact within the music, radio, podcasting and associated broadcasting industries. Of the 1,000+ skills I’ve worked on, more than 90% were radio or podcast experiences. Based on my experience in the industry, I believe that not every piece of audio content needs to be distributed on these devices.
A radio station in rural Alaska, where residents have metered internet, is not likely to perform well for a number of reasons. How many devices exist in the target market? Is the demographic of the target market likely to own a smart speaker? I often explain to my customers that this is not a “build it and they will come” scenario, and that both discovering and invoking content on smart speakers are far from seamless. However, for every radio station in Alaska that has ~100 users/month, there are stations and podcasts for which these devices account for roughly 15% of monthly listening, with a forecast for growth. Ultimately, there is no silver bullet or playbook to guarantee success, but, for the right audio content, voice assistants are a valuable distribution channel and worth paying attention to.
Making Purchases
Given Alexa’s Amazon affiliation, it seems natural that smart speakers should be a conduit for voice purchases and e-commerce. Similarly, Walmart and Google partnered to offer e-commerce capabilities in their ecosystem. It seems logical that a voice assistant can be of use when a user runs out of milk in the morning and can ask their device to re-order in time for delivery the next day. Outside of traditional e-commerce, several voice assistants offer easy voice payment options and have a lot of potential for both physical and digital purchases, e.g., subscriptions to gated content on the devices. However, no success story really stands out at this moment.
Until recently, some analysts predicted that voice-based commerce would eventually be responsible for tens of billions of dollars of retail sales. Those predictions are likely to change following the 2018 leak of an internal Amazon document indicating that just 2% of the people who own Alexa-powered devices have made purchases using voice. More importantly, 90% of that tiny minority failed to make a second voice-based purchase.
Everything stated above raises one question: Has the potential for voice assistants and smart speakers to impact our lives been dramatically overstated?
Your answer to this question might mean betting against the likes of Samsung, Apple, Google, and Amazon. As both a developer and user of these technologies, I have no doubt that there is a place for voice assistants in our daily lives. However, user activity and preferences are still determining exactly what that place is. According to Amazon, many of the top Alexa skills are games and those that aren’t are “focused on daily habits, wellness, and…family fun.” Even more interesting is the observation from TechCrunch’s Sarah Perez, who noticed that many of Alexa’s top experiences “are known app names from the mobile app ecosystem, rather than breakout hits that are unique to Alexa or smart speakers.” While the data uncovers some issues with voice capabilities and adoption, it does not settle whether adoption and growth of voice technology will wane or whether companies can afford to overlook such technology. At this moment in time, it’s hard to find a compelling ROI case for voice-related development that isn’t experimental in nature, but consumer acceptance of voice-activated technology indicates that experimentation should continue in order to find the right experiences beyond simple assistant-like activities.
Regardless of these shortcomings in capabilities and more advanced use cases, there are tech analysts who will tell you that voice interfaces will revolutionize technology and become the dominant user interface of the future due to their ease of use. This prediction is based on the larger concept of voice-driven interfaces enabling interactions with systems solely through voice. If you go back to some of the examples in this article, checking for a light bulb at Home Depot or ordering milk for the next day could be far easier via voice than the current alternatives. Voice interfaces are still in the embryonic stages of their development. They will continue to be developed and integrated with modern technology at a rapid pace and will likely include visual and touch components. However, at this moment in time, voice assistants and their associated smart speakers are still looking for a niche when it comes to non-assistant-like experiences. For this user, turning on my lights, setting alarms and listening to the news, music and podcasts are all I really need at this moment. As technological development continues to advance and evolve, so will the use cases for voice-activated technology beyond those mentioned in this article.