Hannes Heikinheimo
Sep 19, 2023
1 min read
Voice search, or voice-enabled search, means searching with the most natural input channel: human speech. Voice search can refer to Google searches performed by voice, but in this article we'll investigate how voice search could be used in touch screen applications and websites.
Here's an example of voice search, just to get it out there as quickly as possible. You can try it out here.
Speechly can be used to create real-time voice user interfaces that give visual feedback to the end user as they speak. This visual feedback enables users to recover from errors (by saying something like "Show me blue, sorry I mean red t-shirts"), and it encourages them to continue with the voice experience because they can see that their commands are being understood correctly.
If you are only interested in seeing the demo, click here and try out our voice search demo. Then you can get back to the article. There's also a video demo at the bottom of this article.
20% of all Google App searches are done by voice, and nearly 40% of US internet users use voice search features. Most of these searches are currently performed on smart speakers and voice assistants.
This is why, for many, voice search only or primarily refers to search engine optimization for voice queries. This is a valid use case for voice, but not the only one. Voice could (and should) also be available on all websites and applications.
According to PwC research from 2018, consumers actually prefer voice over typing for searching. And no wonder: typing on a smartphone is not the nicest of tasks, and more and more people use the internet primarily on a mobile device.
When it comes to applications and websites, voice search is still rarely available. Why is that?
We did some informal research in developer communities and found that the number one reason for not adopting voice search is the lack of developer tools for building it.
The lack of developer tools has led to most voice commerce still happening on smart speakers and voice assistants. In fact, 20% of US consumers who own a smart speaker have already purchased something with it. Amazon's website, by far the largest US eCommerce platform, currently doesn't support voice at all.
But using voice only on a smart speaker is like purchasing something over the phone. While totally feasible, it's not the best way of purchasing clothes, for example, as you can't see what you are purchasing. Voice is a good modality for inputting information, but it falls short as an output channel.
By integrating voice into their current websites and applications, eCommerce companies could tap into the growing voice market while keeping full ownership of their data and the customer experience.
Search is a very valuable part of many applications, and design and development teams spend a lot of time making it just right for users. Users who search can be over three times more likely to convert than those who don't. And search is not limited to eCommerce, of course: most applications have some kind of search functionality.
This means that getting users to search and, most importantly, to find what they are looking for is of immense value to most product teams. Even if your application offers something the customer needs, you are losing business if they can't find it.
Smart speakers can help here, too: according to ComScore, half of online shoppers are using voice assistants to help them research products.
But as we've written many times, smart speakers and voice assistants are not the holy grail of voice. Far from it, actually. Voice needs to be a part of applications to be really relevant.
By adding voice search to your application, you get the best of both worlds: you can tap into the growing voice search market and improve the customer experience, while your users can still see the product photos you've spent so much time optimizing, along with other important information.
It's important to note that voice search is not just a regular search box with a microphone button and a speech-to-text system. That approach would lead to a bad user experience, and it is not how Google does it, either.
In fact, the reason so many Google searches are done by voice is that Google has developed its natural language understanding capabilities. If you are looking for the population of the USA, you can just ask "What's the population of the United States" and you'll get your answer.
We search differently by voice than by typing. For example, voice searches are typically longer than their written counterparts, and they are expressed in natural language.
This has clear implications for voice search. While typing "Intel LGA1151 motherboards" into a search bar is a natural way to search by typing, saying it out loud might prove difficult. Something like "motherboards for new Intel processors" is probably a better way of expressing the same search by voice.
While regular search can give valuable information on how the voice search should be implemented, it needs to be designed separately.
Compared to the traditional search options (search filters and text queries), voice has a few key benefits.
One major UI problem with graphical user interfaces is that there needs to be a button for each piece of functionality. If you want your users to filter by categories, you'll need to have each of the categories somehow visible in the user interface. Because of this, you have to decide on the most important categories and, above all, on their names.
For instance, if you are selling clothes, you need to decide whether your users will find sneakers in the category "shoes", "running shoes", "footwear" or "sneakers". Of course you can add subcategories, too, but that still leads to a lot of clicking and searching through many options.
Search doesn't have this drawback. You can define any number of synonyms for your categories and product names. Your users can access all these categories from any screen of your application without diving deep into menus.
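The synonym idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `CATEGORY_SYNONYMS` table and the `resolve_category` function are hypothetical names made up for this example, not part of any Speechly API, and a production system would rely on a proper natural language understanding model rather than phrase matching.

```python
import re
from typing import Optional

# Hypothetical synonym table: each phrase a user might say maps to one
# canonical category in the product catalog.
CATEGORY_SYNONYMS = {
    "sneakers": "sneakers",
    "trainers": "sneakers",
    "running shoes": "sneakers",
    "shoes": "footwear",
    "footwear": "footwear",
}


def resolve_category(query: str) -> Optional[str]:
    """Return the canonical category for the longest synonym found in the query."""
    text = query.lower()
    # Try longer phrases first so that "running shoes" wins over "shoes".
    for phrase in sorted(CATEGORY_SYNONYMS, key=len, reverse=True):
        # Word boundaries keep "shoes" from matching inside "horseshoes".
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return CATEGORY_SYNONYMS[phrase]
    return None
```

With a table like this, "Show me running shoes" and "Show me trainers" both land in the same category, without either name having to appear as a button in the interface.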
Voice is also a lot faster than typing. According to Stanford research, users can speak up to 3 times faster than they can type on a mobile phone.
Because it's easier and faster, voice can make users buy more. Based on our experience with SOK, filling a typical grocery shopping cart can be up to three times faster by voice, and hence carts are typically larger than those filled the traditional way.
One of the most important reasons for not shopping by voice is privacy. Consumers are increasingly conscious about privacy, and they don't necessarily even want a device listening to them at home.
By adding voice to your application and leveraging push-to-talk rather than wake words that can be triggered accidentally, you can limit the privacy implications.
If the user is always fully aware when and if they are being listened to, privacy is not such an issue.
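As a rough sketch of the push-to-talk idea, the class below keeps audio only while the talk button is held down and drops everything else. The class and method names (`PushToTalkSession`, `press`, `feed`, `release`) are hypothetical and chosen for this illustration; a real application would connect them to its own button events and microphone stream.

```python
from typing import List


class PushToTalkSession:
    """Keeps microphone audio only while the talk button is held down."""

    def __init__(self) -> None:
        self.listening = False
        self.chunks: List[bytes] = []

    def press(self) -> None:
        """The user pressed the talk button: start a fresh recording."""
        self.chunks = []
        self.listening = True

    def feed(self, chunk: bytes) -> bool:
        """Store a chunk only while listening; report whether it was kept."""
        if not self.listening:
            return False  # audio outside a button press is dropped, never stored
        self.chunks.append(chunk)
        return True

    def release(self) -> bytes:
        """The user released the button: stop listening, return captured audio."""
        self.listening = False
        return b"".join(self.chunks)
```

The point of the design is that listening is tied to an explicit user action, so there is no moment when audio is captured without the user knowing about it.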
To get voice search right, it needs to support natural language. The experience must be natural, and it should be integrated as tightly as possible into the current user experience. You don't want to add a voice assistant or a chatbot with a totally different user experience; you want to complement your current search.
Speechly is the first developer tool that enables developers to add natural, real-time voice experience to their applications in a short amount of time.
Voice search built with Speechly works on all platforms with a single configuration for a consistent user experience, enables corrections by speech, and integrates easily into the current user interface with minimal changes.
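To illustrate what a spoken correction could look like in code, here is a minimal sketch, assuming a simple "last mention wins" rule. It is not how Speechly implements corrections; the `COLORS` and `PRODUCTS` vocabularies and the `extract_filters` function are hypothetical and exist only for this example.

```python
import re
from typing import Dict

# Hypothetical vocabularies for the demo; a real system would use a
# trained natural language understanding model instead of word lists.
COLORS = {"blue", "red", "green", "black", "white"}
PRODUCTS = {"t-shirts", "jeans", "sneakers"}


def extract_filters(transcript: str) -> Dict[str, str]:
    """Build search filters from a transcript; a later mention overrides an earlier one."""
    filters: Dict[str, str] = {}
    for word in re.findall(r"[a-z\-]+", transcript.lower()):
        if word in COLORS:
            filters["color"] = word  # "red" replaces an earlier "blue"
        elif word in PRODUCTS:
            filters["product"] = word
    return filters
```

Running the function again on each partial transcript lets the interface update as the user speaks: after "Show me blue" the color filter is blue, and once the user continues with "sorry I mean red t-shirts" the same call yields red t-shirts.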
Here's a demo showing voice search done right. It supports natural language and long or short queries, and it updates in real time to encourage the user to continue with the voice experience or to correct themselves quickly.
Speechly supports iOS, Android, React and React Native platforms. You can find tutorials for each platform in our documentation.
If you are interested in adding voice search to your website or mobile application, contact us to learn more!
Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, privacy, and scalability to hundreds of thousands of hours of audio.