Tutorial on Voice Technology Implementation
Discovering the possibilities of voice technology, by exploring its installation process and revealing the needed code and screens for a successful setup.
Perhaps you’ve heard about the popularity of AI and Machine Learning, and how these two techniques are reshaping different markets. Voice Technology, one of their applications, is rising in demand. If you’ve been wondering whether voice technology is worth paying attention to or is just a trend, you’ll find this article useful. Even the most ardent doubters must admit that voice-based solutions are gaining popularity faster than many other advancements, and that is hard to dispute.
To dispel any doubts, let’s look at the statistics. According to projections, the number of voice assistants could hit 8.4 billion by 2024, surpassing the global population. In addition, Statista estimates that the worldwide voice recognition industry will expand from 10.7 billion US$ in 2020 to 27.16 billion US$ in 2026. With the market expanding this quickly, many apps are likely to need an upgrade to keep up.
Surely, the cold figures are not enough to make a decision, so we are going to highlight a few valuable reasons to consider Voice technology, which are going to influence both users and your profit. So, let’s get to the point.
Major Reasons to Consider Voice Technology
Competitive Advantage
We mentioned that the number of voice assistants is set to grow enormously, but that doesn’t make adding one to your app a bad idea. If you narrow your research down to the industry your new app targets, you will likely find that few competitors, if any, have integrated a voice assistant. Implementing new technologies, especially those that rely on AI and Machine Learning, can therefore give you a competitive advantage over other apps. And even if you think there is no way to integrate voice recognition, think harder. Voice ordering and voice technology can simplify almost any task or feature for the user, whether in a Transportation Management System (where parcel delivery could be administered by voice) or an Online Tutoring Platform (where users could schedule classes and manage audio material with their voices).
24/7 Availability for Users
Sophisticated voice assistants can respond to users’ orders and queries at any moment. Furthermore, because AI can handle many routine requests on its own, many repetitive jobs can be automated. As a result, voice technology may be one of the most cost-effective ways for a company to increase customer satisfaction while also growing its customer base.
As another benefit, if your app is available to users 24/7, you get to interact with them longer, which means higher user engagement and, in turn, more profit.
Production Efficiency
For users, the efficiency gain starts as soon as they turn on voice ordering or voice recognition. With that function, they can do several tasks at a time. This improves their satisfaction with the app and attracts more customers.
Businesses are also constantly seeking ways to improve their own efficiency. As a result, numerous executives have begun to use voice technology for company management as well, by applying it to internal operations. With AI-trained voice assistants, the workflow moves faster, because the machine can carry out multiple tasks on a simple spoken order from an employee.
Customization
Voice technology offers data that can help you understand your target user. For example, you can look at how many times a specific type of voice search query led a person to your webpage. This provides you with information on the user’s browsing and purchasing habits. With such information, you can create better marketing strategies and upgrade the app to meet users’ demands, which is important for keeping the app competitive on the market.
Now that you know what voice technology could mean for your app, it’s time to learn how to apply this innovation to your idea and how to deploy it. For your convenience, we have put together a simple guide below. Just keep reading!
Part 1: Installation
First of all, you need to create an Amazon developer account. After that, you can go to the Alexa Developer Console and start creating your first Alexa skill. To do so, you need to decide on the skill name. The skill name can be anything; however, it should be customized and unique, so that it does not duplicate a skill already on the market. For our example, we chose ‘Incora assistant.’
The next step is to choose a model: either one of the pre-built ones or a custom one. It’s your choice, but we recommend picking the Custom option and building the model yourself. You also need to decide on a method to host your skill. You can use the Alexa-hosted options for practice. For production, however, we recommend hosting the Lambda function on AWS; in that case, select the ‘Provision your own’ option.
To get started, the Alexa interface offers a choice of templates for the backend code and interaction model. Once again, if you have decided to create a unique solution, click on the ‘Start from Scratch’ option.
Finally, when the initial preparation is over, you can continue building your Voice Technology. Now, you need to choose the Skill Invocation name. Pick a phrase or a word that will be easy for the user to remember and pronounce. It can be similar to the Skill name.
When you are done, don’t forget to save and build the model after each change you make.
Now you can create your own intents. An intent is an action that takes place in response to a user’s spoken request. In the sidebar, go to ‘Interaction Model’ and then click on ‘Intents’. There you can create custom intents. Afterward, you should add sample utterances. The sample utterances are a collection of plausible spoken sentences that are mapped to the intents.
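The relationship between the invocation name, intents, and sample utterances can be sketched as a structure shaped like the skill's interaction model JSON. This is only an illustration: the invocation name, the utterances, and the helper findIntent are assumptions made for this example, and the intent name HelloWorldIntent is borrowed from the handler code later in this article.

```javascript
// Hypothetical interaction model for the example skill.
// The utterances are made up for illustration.
const interactionModel = {
  interactionModel: {
    languageModel: {
      invocationName: 'incora assistant',
      intents: [
        { name: 'AMAZON.HelpIntent', samples: [] },
        { name: 'AMAZON.CancelIntent', samples: [] },
        { name: 'AMAZON.StopIntent', samples: [] },
        {
          name: 'HelloWorldIntent',
          samples: [
            'what is your technology stack',
            'tell me about your technology stack',
            'which technologies do you use'
          ]
        }
      ]
    }
  }
}

// Naive exact-match lookup, only to show how utterances map to intents.
// Alexa's real language understanding is far more flexible than this.
function findIntent (model, utterance) {
  const normalized = utterance.trim().toLowerCase()
  const intent = model.interactionModel.languageModel.intents
    .find(candidate => candidate.samples.includes(normalized))
  return intent ? intent.name : null
}

console.log(findIntent(interactionModel, 'Which technologies do you use'))
```

The more varied the sample utterances you provide for an intent, the more likely it is that a user's phrasing will be routed to it.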
Then create a Lambda function. To do this, we use Node.js, the Serverless Framework, and the Alexa ask-sdk.
Let’s start writing some code. We need to create handlers for the standard Alexa requests and intents, each in its own file. You can find them below, together with their file paths.
src/handlers/LaunchRequestHandler.js
const LaunchRequestHandler = {
  // Runs when the user opens the skill without a specific request.
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'LaunchRequest'
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('Welcome to Incora assistant. You can ask about technology stack, projects, and a lot more')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = LaunchRequestHandler
src/handlers/HelpIntentHandler.js
const HelpIntentHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest' &&
      handlerInput.requestEnvelope.request.intent.name === 'AMAZON.HelpIntent'
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('You can say: \'alexa, hello\'')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = HelpIntentHandler
src/handlers/FallbackHandler.js
const FallbackHandler = {
  // Matches every IntentRequest, so register it after all other intent handlers.
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('Can you repeat it, please? ')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = FallbackHandler
src/handlers/ErrorHandler.js
const ErrorHandler = {
  // Catches any error thrown by the other handlers.
  canHandle () {
    return true
  },
  handle (handlerInput, error) {
    console.log('ERROR HANDLED', error)
    return handlerInput.responseBuilder
      .speak('Sorry, I had trouble doing what you asked. Please try again. ')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = ErrorHandler
src/handlers/CancelAndStopIntentHandler.js
const CancelAndStopIntentHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest' &&
      (handlerInput.requestEnvelope.request.intent.name === 'AMAZON.CancelIntent' ||
        handlerInput.requestEnvelope.request.intent.name === 'AMAZON.StopIntent')
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('Goodbye ')
      .getResponse()
  }
}
module.exports = CancelAndStopIntentHandler
src/handlers/SessionEndedRequestHandler.js
const SessionEndedRequestHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'SessionEndedRequest'
  },
  handle (handlerInput) {
    // Any cleanup logic goes here.
    return handlerInput.responseBuilder.getResponse()
  }
}
module.exports = SessionEndedRequestHandler
Next, add a handler for your custom intent, such as ‘TechnologyStackIntentHandler’.
src/handlers/TechnologyStackIntentHandler.js
const TechnologyStackIntentHandler = {
  canHandle (handlerInput) {
    // The intent name must match the one defined in your interaction model.
    return handlerInput.requestEnvelope.request.type === 'IntentRequest' &&
      handlerInput.requestEnvelope.request.intent.name === 'HelloWorldIntent'
  },
  handle (handlerInput) {
    // add your own logic
    // get data from some API, database, etc.
    return handlerInput.responseBuilder
      .speak('Our technology stack comprises JavaScript (Node, Angular, React, Ember, Vue), Python (Django), Mobile apps (React Native, Ionic).')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = TechnologyStackIntentHandler
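The order in which these handlers are registered matters, because the first handler whose canHandle returns true is the one that runs. In the deployed skill the SDK takes care of this (for example through Alexa.SkillBuilders.custom().addRequestHandlers(...) from ask-sdk-core). The sketch below only imitates that selection rule with a hypothetical pickHandler helper and simplified stand-ins for two of the handlers, to show why the catch-all FallbackHandler should come last.

```javascript
// Simplified stand-ins: only canHandle is reproduced, plus a name for printing.
const TechnologyStackIntentHandler = {
  name: 'TechnologyStackIntentHandler',
  canHandle (handlerInput) {
    const request = handlerInput.requestEnvelope.request
    return request.type === 'IntentRequest' &&
      request.intent.name === 'HelloWorldIntent'
  }
}

const FallbackHandler = {
  name: 'FallbackHandler',
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
  }
}

// Hypothetical helper imitating the "first matching handler wins" rule.
function pickHandler (handlers, handlerInput) {
  const match = handlers.find(handler => handler.canHandle(handlerInput))
  return match ? match.name : null
}

// Trimmed request envelope containing only the fields the handlers read.
const input = {
  requestEnvelope: {
    request: { type: 'IntentRequest', intent: { name: 'HelloWorldIntent' } }
  }
}

const goodOrder = pickHandler([TechnologyStackIntentHandler, FallbackHandler], input)
const badOrder = pickHandler([FallbackHandler, TechnologyStackIntentHandler], input)
console.log(goodOrder) // TechnologyStackIntentHandler
console.log(badOrder) // FallbackHandler
```

With the fallback registered first, the custom intent would never be reached, so specific handlers should be listed before general ones.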
For the next step, you need an AWS account; create one if you don’t have it yet. Set up AWS credentials on your local machine, configure the serverless.yml file, and run yarn deploy or npm run deploy.
serverless.yml
service: alexa-lambda-example
plugins:
  - serverless-pseudo-parameters
  - serverless-iam-roles-per-function
provider:
  name: aws
  runtime: nodejs14.x
  region: us-east-1
  stage: prod
functions:
  info:
    handler: src/index.handler
    events:
      - alexaSkill: amzn1.ask.skill.XXXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX
After that, open the Lambda function in the AWS console and copy its ARN.
Go to the Alexa Developer Console and set your AWS Lambda ARN as the endpoint.
Your skill is now built, and you can test it using the Alexa Developer Console or any Alexa device. Don’t forget to switch to development mode in the ‘Test’ tab of the navigation bar.
To verify our code, we deployed it and tested it in the developer console.
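Before deploying, a handler can also be exercised locally with a hand-written request. The sketch below is an illustration under several assumptions: createStubResponseBuilder is a hypothetical stand-in that merely records what the handler says (it is not the SDK's real response builder), and the request envelope is trimmed to the single field the handler reads.

```javascript
// Copy of LaunchRequestHandler from above, inlined so the sketch runs on its own.
const LaunchRequestHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'LaunchRequest'
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('Welcome to Incora assistant. You can ask about technology stack, projects, and a lot more')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}

// Hypothetical stub: mimics only the chaining of speak/reprompt/getResponse.
function createStubResponseBuilder () {
  const response = {}
  return {
    speak (text) { response.speech = text; return this },
    reprompt (text) { response.reprompt = text; return this },
    getResponse () { return response }
  }
}

// Minimal fake input representing "the user opened the skill".
const handlerInput = {
  requestEnvelope: { request: { type: 'LaunchRequest' } },
  responseBuilder: createStubResponseBuilder()
}

const result = LaunchRequestHandler.canHandle(handlerInput)
  ? LaunchRequestHandler.handle(handlerInput)
  : null

console.log(result && result.speech)
```

A check like this only confirms the handler's own logic; the simulator in the developer console remains the place to verify the full voice interaction.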
After the testing stage, you finally have a voice assistant of your own. The next step is to integrate it into your hardware system.
Part 2: AWS Cognito and Alexa Skill Account Linking
In the first part, we covered the initial installation of an Alexa skill and the development of custom Voice technology. Here we focus on the next stage: Account linking. It is required for user recognition, so that your voice assistant can identify the person who is using it. Account linking helps build a picture of the user for more personalized interactions and customized solutions. Simply put, with account linking your voice technology can draw on the user’s existing account data instead of repeatedly asking questions to identify the profile. This happens automatically. Let’s look at the concept in more depth.
What is Account Linking Regarding Alexa Skill?
Certain personalized skills need to connect the identity of the Alexa user with a user account in another system. The purpose is to establish a relationship between the Alexa user and your system’s user account. Account linking allows your skill to securely authenticate users with that service. Once the skill and the external user account have been connected, the skill can perform tasks on behalf of the user with that account.
Account linking relies on OAuth 2.0. OAuth 2.0 is an open protocol that enables web, mobile, and desktop programs to request user authorization from remote services in a standard way. You may create your own OAuth server and identity management solution from scratch. Alternatively, to achieve the same result, you can use AWS Cognito, which helps you build custom identity management. AWS Cognito uses User Pools, which are scalable user directories that can handle millions of users. User Pools is a fully managed service that is simple to set up without server infrastructure to maintain, and it uses standards such as OAuth 2.0 to interoperate with your backend. Below we describe how to set up Account Linking with the help of AWS Cognito. But first, let’s look at the advantages.
Advantages of Account Linking
Account linking is a good choice for those who want to build a user-friendly solution that is convenient for the target audience. But what exactly are the benefits that could convince you to set up Account linking for your Alexa skill?
- Users can skip creating a new profile for each platform separately, which decreases configuration time and increases user engagement.
- Account linking allows registered users to keep using their existing account while signing in with a new social or passwordless method.
- It is convenient for users who signed up without a password to connect their profile to one with more details.
- Your application can integrate user recognition, which gathers data from other sources and fills in the gaps in the profile on your system.
- It allows your software applications to access profile details that persist across several sessions.
With these advantages in mind, you may wonder how to set up Account linking for your voice technology application and adjust it to your needs. For the practical part of the process, follow the instructions below.
Getting Started
First of all, go to the Alexa Developer Console, select your skill, click Tools, and then Account Linking in the left sidebar.
The first phase is to configure what users are allowed to do in the account linking settings. We recommend allowing users to create an account or link to an existing account with you, and allowing them to enable the skill without account linking. Under the Security Provider Information block, select Auth Code Grant as the authorization grant type. With this type, an authorization code is issued first and is then exchanged for access tokens.
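With the Auth Code Grant, the exchange of the authorization code for tokens is performed by Alexa, not by your code, but it can help to see roughly what such a request looks like. The sketch below builds (without sending) a token request for a Cognito token endpoint, assuming HTTP Basic client authentication. The function name buildTokenRequest and every sample value are placeholders invented for this illustration.

```javascript
// Builds the pieces of an OAuth 2.0 authorization-code token request.
// Nothing is sent over the network; all values passed in are placeholders.
function buildTokenRequest ({ tokenUri, clientId, clientSecret, code, redirectUri }) {
  // HTTP Basic credentials: base64 of "client_id:client_secret".
  const credentials = Buffer.from(`${clientId}:${clientSecret}`).toString('base64')
  const body = new URLSearchParams({
    grant_type: 'authorization_code',
    code,
    redirect_uri: redirectUri,
    client_id: clientId
  })
  return {
    method: 'POST',
    url: tokenUri,
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${credentials}`
    },
    body: body.toString()
  }
}

const tokenRequest = buildTokenRequest({
  tokenUri: 'https://example-domain.auth.us-east-1.amazoncognito.com/oauth2/token',
  clientId: 'example-client-id',
  clientSecret: 'example-client-secret',
  code: 'example-authorization-code',
  redirectUri: 'https://example.com/callback'
})

console.log(tokenRequest.method, tokenRequest.url)
console.log(tokenRequest.body)
```

This is why the console asks for the client ID, the client secret, and the Access Token URI: they are the inputs Alexa needs to make this exchange on the user's behalf.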
User Pool’s Setup
For the next step, you need an existing AWS Cognito User Pool, or you can create a new one and set it up. To do so, go to the AWS Cognito page. There we will set up a User Pool in order to get the client ID and client secret, which are needed for further configuration in the Alexa Developer Console.
Step 1: Sign-in Options
In the first step of the User Pool configuration, choose how users will be able to sign in. Among the options Username, Email, and Phone Number, we choose Email and go on to Step 2.
Step 2: Security
Here you decide how to make account linking more secure for users. For that purpose, you need to choose the password requirements and the authentication method. For illustrative purposes, there is no need to write a custom policy, so we keep the default parameters. The same goes for multi-factor authentication: we don’t need MFA here, so we chose the No MFA option. However, when developing your own solution, you can decide which measures to apply and define customized requirements.
At this stage, you also need to configure what users can do when they forget their passwords. We definitely recommend enabling self-service account recovery, since forgotten passwords are a common issue on any platform. As the delivery method for account recovery, we choose Email.
Step 3-4: Sign-up Options and Message Delivery
These steps are optional and won’t affect Account linking, so we omit the instructions for them. You can skip them as we did, or set them up according to your needs.
Step 5: App Integration
Let’s set the User Pool name. For this example, we chose ‘Test User Pool’. Then we enable the AWS Cognito Hosted UI, since it is essential for the process this article focuses on, namely AWS Cognito and Alexa Skill Account linking.
Then we configure a domain for the endpoints.
Now copy the callback URLs from the Alexa Developer Console and insert them into the provided fields.
After all of these steps, set the sign-out URL. Here is the template (replace the region with the one that hosts your User Pool):
- https://{YOUR_DOMAIN}.auth.us-east-1.amazoncognito.com/logout?response_type=code
Step 6: Configuration Approval
Now you just need to submit all the changes you made and move on to the next stage of the Account linking process.
Auth Code Grant Layout
All the steps above result in a client ID and a client secret. Copy them from the tab ‘Test User Pool → App client: Alexa client’.
Then return to the Alexa Developer Console and finish the initial configuration.
Here are the templates of the full URIs (use the region that hosts your own User Pool):
- Web Authorization URI: https://{YOUR_DOMAIN}.auth.us-east-2.amazoncognito.com/oauth2/authorize?response_type=code&redirect_uri=https://pitangui.amazon.com/api/skill/link/{CODE}
- Access Token URI: https://{YOUR_DOMAIN}.auth.us-east-2.amazoncognito.com/oauth2/token
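Since these URIs differ only in the domain prefix, region, and redirect URI, they can be generated from those values. The helper below, cognitoEndpoints, is a hypothetical convenience function written for this article; the paths follow the templates listed above, and the arguments in the usage lines are placeholders. Note that the URL API percent-encodes the redirect URI inside the query string, which is equivalent to the unencoded form shown in the template.

```javascript
// Hypothetical helper: derives the Cognito Hosted UI URIs used for account linking.
function cognitoEndpoints (domainPrefix, region, redirectUri) {
  const base = `https://${domainPrefix}.auth.${region}.amazoncognito.com`

  // Web Authorization URI with the query parameters from the template.
  const authorize = new URL('/oauth2/authorize', base)
  authorize.searchParams.set('response_type', 'code')
  authorize.searchParams.set('redirect_uri', redirectUri)

  return {
    webAuthorizationUri: authorize.toString(),
    accessTokenUri: `${base}/oauth2/token`
  }
}

// Placeholder values; substitute your own domain prefix, region, and redirect URL.
const endpoints = cognitoEndpoints(
  'example-domain',
  'us-east-1',
  'https://example.com/callback'
)

console.log(endpoints.webAuthorizationUri)
console.log(endpoints.accessTokenUri)
```

Generating the values from one place helps keep the region consistent between the authorization and token URIs.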
For the last step, you need to create AccountLinkingHandler and connect it in your Lambda function.
src/handlers/AccountLinkingHandler.js
const AccountLinkingHandler = {
  // Handles any request that arrives without a linked-account access token.
  canHandle(handlerInput) {
    return !(
      handlerInput.requestEnvelope.session &&
      handlerInput.requestEnvelope.session.user &&
      handlerInput.requestEnvelope.session.user.accessToken
    );
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Please link your account to the Incora assistant skill using the card that I have sent to the Alexa app. ')
      .withLinkAccountCard()
      .getResponse();
  },
};
module.exports = AccountLinkingHandler;
Finally, you can use userAccessToken to get any information from your system for personalization. An example of how to obtain the access token can be found below.
const TechnologyStackIntentHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest' &&
      handlerInput.requestEnvelope.request.intent.name === 'HelloWorldIntent'
  },
  handle (handlerInput) {
    // add your own logic
    // const userAccessToken = handlerInput.requestEnvelope.session.user.accessToken
    // now you can use userAccessToken for getting any information from your system for personalization
    // get data from some API, database, etc.
    return handlerInput.responseBuilder
      .speak('Our technology stack comprises JavaScript (Node, Angular, React, Ember, Vue), Python (Django), Mobile apps (React Native, Ionic).')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}
module.exports = TechnologyStackIntentHandler
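Once the account is linked, the access token arrives with each request, and the handler can pass it on to a backend. As a minimal sketch, the helpers below read the token from the request envelope and prepare (without sending) a request to Cognito's /oauth2/userInfo endpoint, which expects the token as a Bearer credential. The helper names and the domain are illustrative assumptions; your own system may expose a different endpoint for profile data.

```javascript
// Reads the linked-account token; returns null when the account is not linked.
function getUserAccessToken (requestEnvelope) {
  const contextUser = requestEnvelope.context &&
    requestEnvelope.context.System &&
    requestEnvelope.context.System.user
  const sessionUser = requestEnvelope.session && requestEnvelope.session.user
  return (contextUser && contextUser.accessToken) ||
    (sessionUser && sessionUser.accessToken) ||
    null
}

// Hypothetical helper: describes the profile request without performing it.
function buildUserInfoRequest (cognitoBaseUrl, accessToken) {
  return {
    method: 'GET',
    url: `${cognitoBaseUrl}/oauth2/userInfo`,
    headers: { Authorization: `Bearer ${accessToken}` }
  }
}

// Trimmed, made-up request envelope for demonstration.
const linkedEnvelope = {
  session: { user: { userId: 'example-user', accessToken: 'example-access-token' } }
}

const token = getUserAccessToken(linkedEnvelope)
const userInfoRequest = buildUserInfoRequest(
  'https://example-domain.auth.us-east-1.amazoncognito.com',
  token
)

console.log(userInfoRequest.url)
console.log(userInfoRequest.headers.Authorization)
```

Returning null for unlinked users lets the handler decide whether to answer generically or to prompt for account linking.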
Account linking is now set up and ready to be used! Your voice assistant is now much closer to satisfying its users.
Conclusion
Voice technology has a future in every industry. The potential of voice recognition will only grow as AI and Machine Learning advance, making it ever more useful to the market. Voice technology introduces a completely new way of communicating with clients and enhances their engagement.
Published at DZone with permission of Tetiana Stoyko. See the original article here.
Opinions expressed by DZone contributors are their own.