Developing Conversational Assistant Apps Using Actions on Google (GDD Europe ’17)

[MUSIC PLAYING] IDO GREEN: Hello, and
welcome, everyone. Thank you for joining us. We are excited to be
here, and we really appreciate the
effort, especially at that hour after lunch. We know it’s not as easy
as the early morning hours. We’re here today
to share with you what’s new and cool and
interesting in the new Google Assistant platform. At Google, as we heard in the keynote earlier, we believe that the future is AI first. And what does that mean? It means that we have invested a lot in machine learning, natural language processing, and understanding context and what the user wants. And it all comes together in the Assistant. It's basically a way for you to have a conversation with Google and get things done. And the beauty of this new platform is that it's basically everywhere, from really small devices up to cars. And very soon we're going to have many more devices. It's going to be everywhere. Why? That's one of the most important questions we ask ourselves every time we have a new platform or a new system. And there are obvious
questions and obvious answers to this new platform, like it’s
write once, run everywhere. It’s very efficient. It is a great way
to get things done. But to me personally, there are two things that, if I were in your shoes, I would look at very carefully. One is the productivity aspect: you can really help your users interact with your service and get to the result they want efficiently and without too much trouble. And second, like the photo up there, it's a blue ocean right now, so think about the early days of each big new revolution. You could be one of the first there, and we'll touch on that a bit later. It's a great way for you to be there with a quality app and reach more users. So what does it mean for us as developers? There are a lot of new terms, and it's good to understand from the beginning what we mean by each and every component. So first and foremost,
we have the surfaces. Right now we have the Google Home device, which is a smart speaker with the Assistant embedded in it, and it lets you talk to it and get things done. And of course mobile devices, Android and iPhone, where the Assistant is embedded as well. The Google Assistant is basically the conversation that you are having with Google. And like we saw in the impressive demos in the keynote, there are lots of new and interesting ways to communicate, to express our intent and what we want to do, and to get results back quickly and efficiently. Actions on Google is the way to expose this new Assistant platform to third-party developers, to you, so you can tap into the investment we've made and leverage it when you serve your products and services to your users. The most important thing to remember here is that when you interact with the system, it's very straightforward: it's a conversation. And when you want to handle your service and your users, you basically start with "OK, Google" or "Hey, Google, talk to..." followed by your specific brand or app name. From that point on, your users are in your hands, and you can interact with them and help them reach their goal. So how does the
Assistant app work? Again, it's very important at the beginning to have a basic understanding of the moving parts here, and then everything becomes much clearer. So as you can see here, in our example, we built a small demo app called Personal Chef that suggests a recipe based on your tastes and what you feel like at that moment. When you say, OK, Google, talk to Personal Chef, the request basically goes to Actions on Google, and then the magic of language understanding, extracting the user's exact intents and entities from what they said, is done for you. From that moment, you basically go to your service (here it's Google Cloud, but it could be any cloud service out there) and the user is in your hands, so you'll need to interpret what they wanted and return the text that, again, Actions on Google will take and turn into speech. So all the interaction with the users is done through that layer, which basically gives you the ability to work with a nice JSON payload on your server that is easy to interpret and work with. But the heavy lifting of going from speech to text and back to speech is done by that layer. I think the best thing will be just to look at this short demo. It will make very
clear what I just said. [VIDEO PLAYBACK] – OK, Google, let me
talk to Personal Chef. – Sure. Here’s Personal Chef. [BEEP] – Hi, I’m your Personal Chef. What are you in the mood for? – Well, it’s kind
of cold outside, so I’d like something to
warm me up, like a hot soup. And I want it fast. – All right. What protein would
you like to use? – I have some chicken and
also some canned tomatoes. – OK, well, I think you should
try the chicken tomato soup recipe I found on example.com. [BEEP] – Sounds good to me. [END PLAYBACK] DANIEL IMRIE-SITUNAYAKE:
All right, so that was a pretty
simple interaction, a simple thing to be having a conversation about. But the conversation itself was actually quite complex. Wayne, our developer advocate out there, said cold, warm, and hot all in the same sentence, but the app managed to capture the correct one, the one referring to the soup. Can you imagine trying to write a regular expression or a parser that would extract meaning out of that? There are so many difficult cases that it ends up being pretty much impossible for anything beyond a trivial example. So let's look at
some of the ways we could build this interaction. So one of the options is to use the Conversation API and the Actions SDK. In this case, your Assistant app receives a request that contains the spoken text from the user as a string. So Google handles the speech recognition for you, but you work with the raw strings yourself: you generate a response, and then Google handles speaking it back to the user.
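As a rough illustration of that flow, here is a minimal sketch in TypeScript using the actions-on-google Node.js client library (the v2-style API, which postdates this talk); the naive keyword matching stands in for whatever parsing you would really have to do:

    import { actionssdk } from 'actions-on-google';
    import * as express from 'express';
    import * as bodyParser from 'body-parser';

    const app = actionssdk();

    // First turn of the conversation.
    app.intent('actions.intent.MAIN', (conv) => {
      conv.ask("Hi, I'm your Personal Chef. What are you in the mood for?");
    });

    // Every later turn arrives as a raw text string; interpreting it is up to you.
    app.intent('actions.intent.TEXT', (conv, input) => {
      const text = String(input);
      if (/soup/i.test(text)) {
        // Naive keyword matching in place of real natural language parsing.
        conv.close('How about a chicken tomato soup?');
      } else {
        conv.ask('Sorry, what would you like to cook?');
      }
    });

    // Expose the app as a webhook endpoint.
    express().use(bodyParser.json(), app).listen(8080);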
The thing is, as we just mentioned, parsing natural language can be really difficult. So fortunately,
we have some tools that make handling that
type of thing a lot easier. API.ai is one of these. It’s a platform that makes
it incredibly straightforward to build conversational
experiences. You might not even have to write
any code if you’re building something fairly simple. So we’re going to give you
an overview of this today. And it’s probably
what most of you should use if you’re
going to implement your own assistant app. So API.ai basically provides
an intuitive graphical user interface to create
conversational experiences. So you program in a few example
sentences of the way users might express a certain
need, and you can also specify the values you
need to get from the user. Then a machine learning model is trained to understand these sentences and manage the conversation. So the key part here
is that you no longer need to process raw strings. API.ai will do that for you. So you can see in our
diagram where API.ai fits in. It handles the conversation in between the Assistant itself and your back end. Since API.ai handles the conversation for you, another way of looking at this is: the user says something, maybe they're asking for a certain recipe. The Assistant converts their voice into text, and API.ai receives that text string. It will decompose that, figure out what it actually means, and hand you that meaning in the form of structured data. You receive that in your webhook, you do whatever you need to do, like look it up in a database and find a matching recipe, you build a response, and then you pass it back to the Assistant, which will read it out to the user. So we're going to show you a
short demo of how we would work with API.ai to build this app. All right, so in
API.ai, we create an intent to
represent each thing that the user might want to do. So in this case,
we’ve built an intent that covers the user
asking for a recipe. So I’m going to open
the intent here. So right here you
can see how we’ve provided examples
of different ways the user might express
their desire for a recipe. And these examples
are used to train a machine learning
model that can recognize what the user wants. And as we add examples,
API.ai will automatically pick out important concepts
that are mentioned by the user. So the system actually
understands many concepts by itself, but you
can add custom, domain-specific information,
like in this case, recipe ingredients. So I’m going to add a
couple of things here. IDO GREEN: And as
Dan types, you can see that it immediately marks the entities it understands. And if it doesn't recognize the entity you want correctly, you can mark it yourself. So you can see here, protein and dish type were recognized immediately because we already have some examples beneath. DANIEL IMRIE-SITUNAYAKE:
So another example. So these are all entities
our system knows about, and API.ai is able to pluck
those from the user’s statement and figure out what they mean. So we can actually also mark
this information as required. So here we’ve put in a
cooking speed and a dish type. Down here, we’ve actually
marked dish type and protein and vegetable as required. So in this case, our
app is automatically going to ask the user if
they didn’t mention it. So I’m going to save
the intent, and we’ll wait one moment for it to train
the machine learning model. OK, so training’s completed. So now I’m going to enter
in our test console, which is kind of how we work while
we’re developing the agent. I can just enter in, I
want something with beef. So in this case I’m
specifying a protein, but I didn’t specify a
vegetable or a dish type. So the system actually knows
that it expects those things, and it should prompt for them
if they weren’t supplied. So I can now say potato. And I still need to know a
dish type, so what kind of dish do you prefer? Let’s have a main course. And so now this is where we
would hand off to our back end, and we would use
that structured data to look up a recipe
in the database and return it to the user. So it’s pretty amazing to be
able to conduct a conversation dynamically in this
way, just based on the information the
user provided, on the fly, without knowing in advance
what they’re going to say. Once the intent has captured
all of this information, it becomes available to
use on your back end, which is where you’re going to make
stuff happen and generate a response. So you can see in here, we have the action that we resolved to, which corresponds to the intent, and we have the parameter data that we captured from the user's query. And because we have those values in a structured way, we can look them up in our back end and return something useful.
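To make that concrete, here is a minimal sketch of what such a fulfillment webhook could look like in TypeScript with the actions-on-google Node.js client library (v2-style Dialogflow API, which postdates this talk); the intent name recipe.search, the parameter names, and the findRecipe helper are all hypothetical:

    import { dialogflow } from 'actions-on-google';
    import * as functions from 'firebase-functions';

    const app = dialogflow();

    // The handler name must match the intent name configured in the API.ai console.
    app.intent('recipe.search', (conv, params) => {
      // Parameter names mirror the entities we marked as required.
      const protein = String(params['protein']);
      const vegetable = String(params['vegetable']);
      const dishType = String(params['dish-type']);

      // Hypothetical lookup against your own data store.
      const recipe = findRecipe(protein, vegetable, dishType);
      conv.close(`I think you should try the ${recipe.title} recipe I found on example.com.`);
    });

    // Placeholder for a real database query.
    function findRecipe(protein: string, vegetable: string, dishType: string) {
      return { title: `${protein} and ${vegetable} ${dishType}` };
    }

    // One way to deploy the webhook: as a Cloud Function for Firebase.
    export const personalChef = functions.https.onRequest(app);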
So once we're done, it's only a few clicks to integrate the app with a load of different platforms, on the Integrations tab here. We want to integrate
with Actions on Google so that this app is available
via the Google Home. So we can just click
here and load up the simulator to test this out. So the Actions on
Google web simulator allows us to test out
our action as if it’s running on the actual platform. So I can say talk
to my test app. COMPUTER: All right. Let’s get the test
version of my test– DANIEL IMRIE-SITUNAYAKE:
And so here we’re seeing how the app
would be responding. And I can say let’s try
a recipe with chicken. And this is now communicating
with API.ai in the same way that the Google Home might be. So it’s asking what
kind of vegetables we have, so let’s say potato. And again, we want
a main course. So we can see how easy it is to develop your app with API.ai and test it out in the simulator. And then the magical thing is, if you're signed in to a Google Home or the Assistant app on your phone, this app is now immediately available for you to test out. So you can try it straight away and make modifications live, which is a really fun workflow. There's not really any deployment step. So this is the workflow for building an Action. It's super fast and simple, and you should definitely attend the following workshop session on building an assistant app. So give it a try for yourself. So let's switch back
over to our slides and talk a little more
about the platform. IDO GREEN: So we saw how we could leverage API.ai, and the goal with any app that we're building is to have that wow effect and to give our users the best experience we can. That should guide everything we do. Before we start coding our app, and it's actually quite easy to do, I think one of the most important things is to dive into all the great content that our team has produced around designing voice UIs. It's a totally different topic from what we, or most of us, are used to with graphical user interfaces. When it comes to voice, there are lots of constraints and limitations to take into consideration. And there are some great resources there, like a checklist and tips, that can really put you on the right track when you come to design the conversation and see how those moving parts will work best in your specific use case. When we're coming
to build it, we need to take into consideration that, like we saw in the earlier slide, we'll have different surfaces out there. Right now, we have Google Home and mobile devices, and soon we'll have more. And on mobile devices, we can, and actually should, leverage the real estate that the screen gives us. As you can see here in the example, let's say that the user wanted to get to some place. On Google Home, we'll tell them where to go; if we have a screen, and we know in our app that we are on that surface, we can also give them a small map and a URL so they can get the directions immediately. At its most basic, you can specify separately the text that appears on the screen and the text that you are going to speak. As you can see here in the example, we have what we'll be saying to the user and what the display text will be. One of the most important things to remember here is that we always want the display text to be a summary, the executive summary, of what we will say to the user. So if the user is listening, they get the full-blown answer, and if they're just skimming the screen with their eyes, they still see the most important part, but, like in the example here, we don't need to give them the full-blown answer.
one of the main design tips that we’re giving is to
lead the conversation and to give the user the ability
to very quickly go en route to the path that they want to. So suggestion chips are one
of the most efficient ways to do it. If you have some
popular choices, you could suggest
it to the user. So on the screen, they
have this nice button that they could click on. And if of course we
are on Google Home, you could suggest it,
and the user will choose. They don’t necessarily
force to it, right? They could choose, in this
example, any other number that is out there. But at least we suggest
to them what to do. And in our example,
it could be suggesting the most common vegetables that
are appearing in the recipe. Basic cards are actually
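A rough sketch of that in the same hypothetical webhook style; the intent name is made up, and the chips are only attached when there is a screen:

    import { dialogflow, Suggestions } from 'actions-on-google';

    const app = dialogflow();

    app.intent('ask.for.vegetable', (conv) => {
      conv.ask('What kind of vegetable would you like to use?');
      if (conv.screen) {
        // Chips appear under the response; tapping one sends that text back as the user's reply.
        conv.ask(new Suggestions('Potato', 'Carrot', 'Spinach'));
      }
    });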
Basic cards are actually quite rich and complete, in the sense that they give you a full-blown image, URL, and text, and they give you the ability to extend the experience. So in our case, we could show a nice big photo of the dish itself and give the user a link so they could open it in their app and see what we are going to cook.
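A sketch along the same lines, with a hypothetical intent name and placeholder image and link URLs:

    import { dialogflow, BasicCard, Button, Image } from 'actions-on-google';

    const app = dialogflow();

    app.intent('show.recipe', (conv) => {
      conv.ask('Here is the chicken tomato soup recipe I found.');
      conv.ask(new BasicCard({
        title: 'Chicken tomato soup',
        text: 'A quick, warming soup for a cold day.',
        image: new Image({
          url: 'https://example.com/images/chicken-tomato-soup.jpg',
          alt: 'Chicken tomato soup',
        }),
        buttons: new Button({
          title: 'Open the recipe',
          url: 'https://example.com/recipes/chicken-tomato-soup',
        }),
      }));
    });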
There are many cases where you want to show the user a list of things and give them a visual selection of the different items they can choose from. Carousels show big images; the main difference is that they hold a more limited set of items. Lists have smaller images and can be much longer, so we can show the user many different options. In this example, you could think about showing a couple of different options for a type of dish. Let's say the user already chose a chicken salad: we could use the carousel to offer three or four different chicken salad recipes. On the other hand, if we want to show a dozen different dishes that feature chicken as an ingredient, we could use the list.
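Here is a sketch of a list built the same way (a carousel uses the Carousel class with the same item structure but bigger images); the intent names are hypothetical, and the second handler is assumed to be mapped to the actions_intent_OPTION event so the tapped item's key comes back as the option argument:

    import { dialogflow, List, Image } from 'actions-on-google';

    const app = dialogflow();

    app.intent('show.chicken.dishes', (conv) => {
      conv.ask('Here are some dishes that feature chicken.');
      conv.ask(new List({
        title: 'Chicken dishes',
        items: {
          // Each key identifies the option the user picks.
          SOUP: {
            title: 'Chicken tomato soup',
            description: 'Warming and fast.',
            image: new Image({ url: 'https://example.com/soup.jpg', alt: 'Soup' }),
          },
          SALAD: {
            title: 'Chicken salad',
            description: 'A light main course.',
            image: new Image({ url: 'https://example.com/salad.jpg', alt: 'Salad' }),
          },
        },
      }));
    });

    // The selection arrives in a follow-up intent tied to the actions_intent_OPTION event.
    app.intent('recipe.option', (conv, params, option) => {
      conv.close(`You picked ${option}.`);
    });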
DANIEL IMRIE-SITUNAYAKE: So you might have a conversation where you
need to know the user’s name or location. So one example might be if
you were helping the user find a local bookstore,
and you want to know the zip code or the postcode. So you can use our SDK
to request permissions for the name, the
coarse location, and the precise
location of the user. And when you invoke
this function, the Assistant’s going to
ask the user for permission in the voice of your app. So it's really seamless. The name is the name of the user who's registered to the device. The precise location is the exact GPS coordinates and the street address. And the coarse location is just the zip code or postcode and the city.
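A rough sketch of that flow in the same hypothetical webhook style; the intent names are made up, and the second handler is assumed to be mapped to the actions_intent_PERMISSION event:

    import { dialogflow, Permission } from 'actions-on-google';

    const app = dialogflow();

    app.intent('find.bookstore', (conv) => {
      // The Assistant asks the user in your app's voice, using this context string.
      conv.ask(new Permission({
        context: 'To find bookstores near you',
        permissions: 'DEVICE_COARSE_LOCATION',
      }));
    });

    // Fires once the user has answered the permission prompt.
    app.intent('handle.permission', (conv, params, granted) => {
      if (granted && conv.device.location) {
        const { city, zipCode } = conv.device.location;
        conv.close(`Looking for bookstores around ${zipCode} ${city}.`);
      } else {
        conv.close("No problem, I won't use your location.");
      }
    });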
If you'd like to link a user with their account on your own service and
you have an OAuth server, the Google Assistant can prompt
users to link their account. So at this point,
the requesting user is going to receive a card at
the top of their Google Home app on their phone that provides
a link to your login page. And they can follow that,
log into your service. Once the user’s completed
the account linking flow on your web app, they
can invoke your Action, and your Action can authenticate
calls to your services through our API. So it’s important that you
provide the OAuth endpoint. Right now, as part of the approvals process, we basically want you to run your own OAuth server.
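In the webhook itself, asking for account linking can be a single helper call. A rough sketch in the same style (hypothetical intent names; the second handler is assumed to be mapped to the actions_intent_SIGN_IN event):

    import { dialogflow, SignIn } from 'actions-on-google';

    const app = dialogflow();

    app.intent('save.favorite', (conv) => {
      // Prompts the user to link their account via your OAuth flow.
      conv.ask(new SignIn());
    });

    // Fires once the user has finished (or abandoned) the linking flow.
    app.intent('handle.sign.in', (conv, params, signin) => {
      const info = signin as { status?: string };
      if (info.status === 'OK') {
        // conv.user.access.token now holds the OAuth access token for calls to your API.
        conv.close('Your account is linked. I saved that recipe for you.');
      } else {
        conv.close("Okay, I won't save anything for now.");
      }
    });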
So if your experience involves shopping or payments, we support rich
transactions, and we allow you to accept user payments. And the really cool
part about this is that customers can use
whatever payment information they have on file with Google. So payments can be super easy. There’s no need to fumble
around with a credit card or read numbers out
loud, although they can use your payment
solution if you prefer. Transactions support shopping carts, delivery options, order summaries, and payments. The user can see a history of all their transactions. The Assistant also
supports home automation via our Smart Home integration. So if you’re a device
maker, you can easily integrate your existing
devices with our Home Graph. And the Home Graph
basically knows the state of all connected
devices, so that when you’re asked to dim the
lights a little bit, it knows how to do
that intelligently. So there are kind of endless possibilities around how you can allow the user to interact with home automation.
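Smart home integrations answer SYNC, QUERY, and EXECUTE intents about your devices rather than running a conversation turn by turn. Purely as a rough, hypothetical sketch, the body of a SYNC response describing a single dimmable light might look something like this:

    // Rough sketch of a smart home SYNC response body (all values hypothetical).
    const syncResponse = {
      requestId: 'ff36a3cc-ec34-11e6-b1a0-64510650abcf',  // echoed from the request
      payload: {
        agentUserId: 'user-123',  // your own stable ID for this user
        devices: [
          {
            id: 'light-1',
            type: 'action.devices.types.LIGHT',
            traits: [
              'action.devices.traits.OnOff',
              'action.devices.traits.Brightness',
            ],
            name: { name: 'Kitchen light' },
            willReportState: false,
          },
        ],
      },
    };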
We've also announced the Google Assistant SDK, which enables you to embed the
Assistant in your own custom hardware projects. So “MagPi” magazine announced
the AIY Projects kit from Google, which
provides a cardboard housing with a button, a
speaker, and a microphone, and it wraps around
a Raspberry Pi and uses the Google
Assistant SDK. And at Google I/O, we
demoed this mocktails mixer from our partner Deeplocal,
which embeds the Assistant SDK, and you can walk
up to the device, tell it what kind
of drink you want, and it will mix it up for you. So you can embed the Assistant
into pretty much any hardware device. IDO GREEN: After we've built our Assistant app, it's time to reach users and see how we can drive traffic to it, and what the different options are for us at the moment. The basic way to invoke the app, after, of course, you've submitted it and it has passed review, is to provide a set of invocation triggers. So when you say, OK, Google, talk to, in our case, Personal Chef, and there can be one or two other invocation phrases, the user asks by your brand name or your app name, and that invokes it. We also support deep linking. So as you can see here in the example, you could say in one sentence to talk to Personal Chef and get a recipe for hot soup, or something else. There's also the ability to just ask Google, I want to work out, I want yoga, or something more [INAUDIBLE]. And then obviously, if you have a good, quality assistant app in the directory, Google will surface it. And that comes back to the point that it's a blue ocean, with great opportunities out there. Last but not least, we have a full directory of apps that live inside the Assistant. In the top right corner, you have the ability to click and open the directory. It opens and shows the user, based on categories and on what the system thinks the user will enjoy most, the different kinds of apps that are out there to try and use. There are quite a lot, so I really suggest you check some of them out. You can and should link to your app as well: we have deep links, so if you have a web service or another app, you can always link from it to your Assistant app. So in the many cases where the Assistant app might serve your users more efficiently, I highly encourage you to link to it. And then, of course, they can use it. DANIEL IMRIE-SITUNAYAKE:
So we’ve seen all this awesome stuff
you can do with the Actions platform, but let’s
talk a little bit about how you can get started. So we’ve got a series of videos
that cover a lot of content on how to get up and
running with the Assistant. So you’ve got an
intro video that explains high-level concepts and
goes through the Personal Chef example. There’s a video dedicated to
conversation design, which is definitely worth
checking out as you start to design and build your app. And we’ve also got a screencast
of every single step needed to build our Personal
Chef example. We also have several
code labs that will guide you through the
experience of designing and implementing your
first assistant app. And finally, when you want to
discuss this stuff with other developers, we have a really
active Actions on Google Developer Community on Google+. So we post regularly
there to keep you up to date with the latest
news, and we answer questions from developers. We’ve also got a great
community on Stack Overflow if you have technical questions. After us, there's a really great talk by Sachit and Shuyang that will guide you in much more detail on how to build an assistant app. So you should definitely head upstairs and check that out after this talk. And our whole team of Assistant developer relations people is here, so please feel free to come by and ask questions about the platform. We can show you the Google Home, and we can show you the Assistant SDK stuff. So thank you all for coming, and we really encourage you to dive into this new platform. [APPLAUSE] IDO GREEN: Last but
not least, I forgot to mention it at the beginning. I saw people taking photos. All the slides are live. You can ping Dan or myself on any channel you wish and get all the slides there. And if you have any questions,
we’ll be at our booth at the Assistant
App room, so please feel free to come over and give
us your feedback and thoughts. Thank you very much. [APPLAUSE] DANIEL IMRIE-SITUNAYAKE: Cheers. [MUSIC PLAYING]
