Case Study
How To Build A Google Home Voice Assistant in 7 Steps
Google Assistant App to Book Meeting Rooms
At Opentrends we don't just write code and design for users. We are also prepared to talk to them. Voice user interfaces (VUIs) have changed how people interact with devices. But how do you build a voice assistant?
We have created a concept for Google Home that books meeting rooms in a simple way. Below, we share the real workflow we followed to conceptualize and build a voice user interface. Technology is the enabler.
Before creating a Google Assistant application, you need to meet some technical requirements:
- A Google account to access all the services and tools.
- A Google Home, a phone with Google Assistant, or an emulator to test the app (you'll have a much better time testing with a Google Home).
- A server with Node.js, where part of the business logic will live.
With these elements in place, we started to build the voice assistant for Google Home. These are the steps we took at Opentrends, and you can follow them too:
- Bot voice and tone definition
- Conversational tree design
- Set up the environment
- Build with Dialogflow
- Build the server to handle the business logic (Node.js)
- Test
- Deploy
- Co-creation to choose the goal of the voice assistant:
Starting from the premise of making the office smarter, five Opentrends stakeholders took part in an exercise to find the best solution around this concept. In the end, we identified the need to improve the meeting room booking process.
- Definition of the voice tone of the bot:
First, we defined the voice tone of the bot. After a quick analysis of the market, we created three possible personalities and assigned each one specific speech features: keywords and filler words, intonation, and rhythm. In this way, we could humanize the bot and, at the same time, keep it consistent as it evolves in the future.
- Conversational tree design:
What questions are essential? What conversation flow is most suitable for the usability of the service? Where could it get stuck when it comes to giving adequate answers? The conversational tree foresees all points of contact between the user and the bot, as well as the answers to ill-formulated questions or even insults. In this way, we minimize the possible errors during the use of the voice assistant.
Pro Tip!
If you want to dive deeper and learn how we define the personality & voice tone of a bot or how we build conversational trees, check out this killer guide to build a chatbot that works with your brand.
While analyzing the conversation flow between the user and Google Home to book a meeting room, we decided to create two actions: users can book a room directly or ask which meeting rooms are available. The flow begins when the user wakes up the application: "Ok Google, talk to booking rooms". Dialogflow identifies this as the "welcome intent" and asks the server for the corresponding response. Google Assistant, for its part, detects the voice, transcribes the spoken message to text, and converts the text reply back to speech.
When we design chatbots or VUIs, we talk about "intents" and "entities". The "intent" is the user's intention. Identifying the "intent" means finding out what the user wants when interacting with a bot. An "entity" acts as a variable that modifies an "intent".
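As a rough illustration of the two terms, a sentence such as "Book a room for four people at 3 pm" could be reduced to one intent plus a few entities. The intent name `book_room` and the entity names `attendees` and `time` are hypothetical and are not taken from the Opentrends agent.

```javascript
// Hypothetical result of understanding one user sentence.
// The names book_room, attendees and time are assumptions for illustration.
const understood = {
  utterance: 'Book a room for four people at 3 pm',
  intent: 'book_room', // what the user wants
  entities: {
    // variables that modify the intent
    attendees: 4,
    time: '15:00',
  },
};

// Formats an intent and its entities as a compact, readable string.
function describe(result) {
  const { intent, entities } = result;
  const details = Object.entries(entities)
    .map(([name, value]) => `${name}=${value}`)
    .join(', ');
  return `${intent}(${details})`;
}

console.log(describe(understood)); // book_room(attendees=4, time=15:00)
```

The same intent with different entity values (another time, another group size) would lead to a different booking request.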
We use Dialogflow to create the application that receives the message and works out the response for the user. Dialogflow communicates with a Node.js server that makes the application more intelligent: the server returns the correct message depending on the time, the previous messages and the availability of the meeting rooms.
The Dialogflow process is:
- Dialogflow receives the text and works out which agent to send it to.
- Dialogflow’s agent identifies the intent of the user and passes the text to the right intent.
- Dialogflow’s intent uses entities to store parameter values.
- Dialogflow’s intent passes the request along with entities to fulfillment.
- Fulfillment uses a webhook to call the server.
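The last two steps can be sketched as a plain function on the server side. This is a minimal sketch that assumes the Dialogflow ES (v2) webhook JSON format, where the request carries `queryResult.intent.displayName` and `queryResult.parameters` and the response carries `fulfillmentText`. The intent names `book_room` and `check_availability`, the parameter names, and the reply texts are hypothetical.

```javascript
// Minimal sketch of a fulfillment handler, assuming the Dialogflow ES v2
// webhook JSON format. Intent names and reply texts are hypothetical.
const handlers = new Map([
  [
    'Default Welcome Intent',
    () => 'Hi! Do you want to book a room or check which rooms are free?',
  ],
  [
    'book_room',
    (params) => `Booking a room for ${params.attendees} people at ${params.time}.`,
  ],
  [
    'check_availability',
    (params) => `Checking which rooms are free at ${params.time}.`,
  ],
]);

// Takes the parsed request body and returns the response body.
function handleWebhook(body) {
  const queryResult = body.queryResult || {};
  const intentName = (queryResult.intent || {}).displayName;
  const params = queryResult.parameters || {};
  const handler = handlers.get(intentName);
  const text = handler
    ? handler(params)
    : "Sorry, I didn't get that. Could you say it again?";
  return { fulfillmentText: text };
}

// Example request, trimmed to the fields this sketch reads.
const request = {
  queryResult: {
    queryText: 'book a room for 4 at 3 pm',
    parameters: { attendees: 4, time: '15:00' },
    intent: { displayName: 'book_room' },
  },
};

console.log(handleWebhook(request).fulfillmentText);
```

In a real deployment this function would sit behind an HTTPS POST endpoint that Dialogflow calls; keeping it free of HTTP details makes it easy to test on its own.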
We created a server with Node.js that holds the business logic. The server receives the user's message, some keywords and the action (reserve or request information). With that information and the context of the conversation, it connects to the data store and extracts the relevant data.
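The business logic for the two actions might look roughly like the sketch below. It uses an in-memory array as a stand-in for the data store, and the room names, capacities, and function names are all hypothetical; the actual Opentrends server and its storage are not described in detail here.

```javascript
// In-memory stand-in for the data store (hypothetical rooms and bookings).
const rooms = [
  { name: 'Barcelona', capacity: 4 },
  { name: 'Madrid', capacity: 8 },
];
const bookings = [{ room: 'Barcelona', hour: 15 }];

// Rooms that are free at a given hour and big enough for the group.
function availableRooms(hour, attendees = 1) {
  return rooms.filter(
    (room) =>
      room.capacity >= attendees &&
      !bookings.some((b) => b.room === room.name && b.hour === hour)
  );
}

// The two actions from the conversation flow: reserve or request information.
function handleAction(action, { hour, attendees = 1 }) {
  const free = availableRooms(hour, attendees);
  if (free.length === 0) {
    return `Sorry, no room is free at ${hour}:00.`;
  }
  if (action === 'reserve') {
    const room = free[0];
    bookings.push({ room: room.name, hour });
    return `Done. ${room.name} is booked at ${hour}:00.`;
  }
  const names = free.map((room) => room.name).join(', ');
  return `Free at ${hour}:00: ${names}.`;
}

console.log(handleAction('info', { hour: 15, attendees: 2 }));    // Free at 15:00: Madrid.
console.log(handleAction('reserve', { hour: 15, attendees: 2 })); // Done. Madrid is booked at 15:00.
console.log(handleAction('info', { hour: 15, attendees: 2 }));    // Sorry, no room is free at 15:00.
```

Because the reply depends on the current bookings, the same question can get a different answer later in the conversation, which is the kind of context-dependent response the server is responsible for.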
User testing has made the difference when uncovering interactions we didn't imagine at first!
The final part of the project was to test and train the AI within Dialogflow. For this, we asked several Opentrends colleagues to test the application. They spent some time talking with Google Home (device, phone or test environment).
We worked in Dialogflow, which has a training section where we could access the conversation history. It was very useful to see how people actually talk to the interface, because they phrased things in ways we had not imagined when we defined the flow! This allowed us to enrich the agent with new ways of asking to book rooms through the application.