
Building an Action for Google Assistant: Getting Started

In this tutorial, you’ll learn how to create a conversational experience with Google Assistant.


Are you a fan of Google Assistant? Do you say things like, “OK Google, pass the salt!” at the dinner table? Have you asked Google Assistant random questions to see what it would say in reply?

Google Assistant is a voice assistant from Google found on mobile devices, Google Home and more. It lets users perform tasks and get information hands-free through a Voice User Interface, or VUI, and it gives users another way to interact with your brand beyond your Android app. Google Assistant is on more than 500 million devices worldwide, and Google expects that number to reach one billion soon.

Did you know you can write your own actions for Google Assistant? In this tutorial, you’ll learn how to create your own conversational experience with Google Assistant.

VUI and Conversational Design

VUI design lets you create a natural and intuitive conversation for the user that can scale across different surfaces, such as devices with screens or speaker-only devices. When designing a conversation, it’s important to consider the following:

  • Is VUI an appropriate means of accomplishing the task the action creates? Here’s a quiz to help you determine whether VUI is a good fit.
  • Who is the audience for this action?
  • What is the personality of your action? For example, an action created for a trendy surfboard shop will have a much different persona than one created for an upscale professional clothing company. Look at the difference between these two conversations:

"Aloha! Surf’s Up! Would you like to hear the status of  your custom surfboard order?" "That would be awesome!"

"Good afternoon,  what can I help you find today?" "I’m interested in ladies blouses,  size medium."

Designing an Action’s Conversation

A good conversation design includes the following components:

  • A Happy Path, or the shortest path through the conversation that accomplishes the task.
  • Conversation repair scenarios that allow the conversation to recover and continue in cases where the user says something unexpected or the user is not properly understood.
  • Opportunities for the user to exit the conversation gracefully.
  • Varied greetings when starting the action to keep the experience new and spontaneous for the user.

Getting Started

Get the projects by clicking the Download Materials button at the top or bottom of this tutorial. In this tutorial, you’ll write an action to play the raywenderlich.com Podcast right from the Assistant!

You need:

  • An active Google account. Sign up for one here.
  • An Android phone/tablet logged in with the same Google account that you’ll use in this tutorial.

Note: It is helpful, but not required, to have some understanding of Node.js, NPM and Promises.

Creating the Action Project

You need to start off by creating a project to work with. Go to the Actions Console and log in with the Google account you’d like to use for development. Click Add/import Project:
Add an Action in the Action Console

Then, accept the terms, name the project RWPodcast and click Create Project:
Enter project name

Scroll down to the More Options section. Select Conversational:
More Options Conversational

In the left-hand navigation pane, click Build ▸ Actions ▸ Add Your First Action:
Add your first Action from Actions Console

Select the Custom category on the left-hand pane and click Build:
Select custom category

This will take you to the Dialogflow console webpage. If prompted, select or log in with your Google account and click Allow. You might also need to accept terms of service.

Checking Permissions Settings

You need to check some permissions for your setup. Go to the Activity Controls page and enable the following permissions:

Web & App Activity, checking Include Chrome History:
Web & Activity Permission

Device Information:
Device Information Permission

Voice & Audio Activity:
Voice & Audio Permission

Creating the Dialogflow Agent

Next, create your agent. The Dialogflow page will prompt you to create an auto-configured Dialogflow agent. Click Create.

Create Dialogflow Agent

Agents and Intents

Agents are Natural Language Understanding (NLU) modules that translate what the user says into actionable data. When a user utterance matches one of the agent’s intents, the agent extracts that actionable data from the request and returns a result to the user.

Intents map user input to the appropriate responses. Within an intent, you:

  • Define examples of user utterances that can trigger the intent.
  • Specify what to extract from the utterance.
  • Specify how to respond.

Generally, an intent represents a single ‘turn’ in the conversation.

Intents consist of four different components:

  • Name: An identifier for the intent that is referenced by the fulfillment.
  • Training Phrases: A defined collection of example phrases that invoke a particular intent. Dialogflow will automatically match similar phrases with the ones provided.
  • Actions and Parameters: Define which parts of the user utterances to extract. These often include information such as dates, times, quantities and places.
  • Response: The utterance displayed or spoken back to the user.

Note: Dialogflow supports a feature related to intents called contexts. Contexts are used to have more control over intent matching and manage the state of the conversation over multiple intents. To learn more, check the documentation.
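Once you’re writing fulfillment later in this tutorial, the actions-on-google client library also exposes contexts on the conversation object, so the webhook can read and write them. The intent and context names below are hypothetical; this is only a rough sketch of the API, not part of the RWPodcast action:

// Rough sketch: set an output context in one turn...
app.intent('ask_for_subject', (conv) => {
  conv.contexts.set('awaiting_subject', 2); // context name and lifespan in turns
  conv.ask('What subject would you like to hear about?');
});

// ...and read it back on a later turn.
app.intent('give_subject', (conv) => {
  const awaiting = conv.contexts.get('awaiting_subject');
  if (awaiting) {
    conv.ask('Thanks, searching now.');
  }
});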

A typical agent has several intents that address different user intentions. When a Dialogflow agent hears an utterance from the user, it attempts to match that utterance to one of the training phrases defined in the intents. Then, the agent returns the response from that intent.

There are special types of intents. By default, a new agent includes two of these: the fallback intent and the welcome intent.

The agent invokes a fallback intent when the user says something that the agent can’t recognize.

The agent invokes a welcome intent when the user starts a conversation with the agent. The welcome intent informs the user what the action does or how to start a conversation.

Click on the Intents tab on the left pane. Then select Default Welcome Intent to see the default welcome intent:

Click Default Welcome Intent

Notice the predefined training phrases:
Pre defined training phrases

Also, notice the predefined responses lower on the page:
Predefined responses

You can also create custom intents, choose your own training phrases and define your own responses.

Lastly, there are follow-up intents. Follow-up intents nest below a parent intent. Use them to gather follow-up information.

Running your Action

Run the action in the Simulator by clicking Integrations ▸ Integration Settings:
Test Action Integrations

Select the Auto-Preview Changes option and click Test:
Select Auto-Preview Changes and Click Test

Another tab will open. When Dialogflow finishes updating the action in the Actions on Google Console, you’ll see the action loaded in the Actions Simulator.

The default welcome and fallback intents are operational. To see what I mean, have a conversation with your action!

Select the Talk to my Test App Suggestion Chip to begin. You’ll see a friendly greeting randomly selected from the list of responses each time the action runs.
Test the App in the Simulator

Modifying the Welcome Intent

Time to try adding your first custom response! Return to the Dialogflow web browser tab. Select Intents ▸ Default Welcome Intent and scroll to the Responses section.

First, delete all the default welcome intent’s responses by selecting the trashcan:
Delete The Welcome Responses

Now, add your own responses. Make sure you click Save when you’re done!
Enter Your Own Welcome Responses

Finally, click Integrations ▸ Integration Settings ▸ Test to run the action. Click the Talk to my test app and Cancel suggestion chips a few times to see how the action randomly chooses between your custom welcome intent responses.
Test Action in Simulator with Custom Welcome Response

Testing Actions on a Device

In addition to the Simulator, Google Assistant allows you to run your action on a device. On most Android devices, open Google Assistant by long-pressing the home button, then swipe up to expand it.

Note: This guide explains how to use Google Assistant on different platforms including iOS.

Make sure the Assistant is logged in with the account you’re using for development. To change accounts, click your account avatar in the top right corner. Then, click Account and select the appropriate account. If the development account isn’t on your device, add it through the device settings and try again.
Change Accounts In Assistant

Note: Make sure your phone’s locale is set to English (US) to find your app.

By typing or speaking, tell Google Assistant to Talk to my test app.

Talk To My Test Action Device

Google Assistant will run your test action.
Test Action Running In Device

Now that you’ve modified your welcome intent and tested your action, you need to upload the Dialogflow project from the starter project to start developing your own intents. Keep reading to learn how!

Uploading the Dialogflow Agent

To upload the preconfigured agent in the sample project, select Gear Icon ▸ Export and Import ▸ Restore From Zip.
Upload Dialogflow Starter

Drag and drop, or browse to, the file RWPodcast.zip from the starter materials for this tutorial. If you haven’t downloaded them yet, use the Download Materials button at the top or bottom of this tutorial. Then, type RESTORE in the appropriate text box and click Restore. After the upload finishes, click Done.
Upload agent window

Now that you know how to get started building a Dialogflow agent, it’s time to learn how to handle more complicated natural language requests by developing fulfillment.

Fulfillment

Fulfillment for Google Assistant is code deployed as a webhook. Each intent in the agent has corresponding business logic in the webhook. Information extracted by the agent can generate dynamic responses or trigger actions on the back end.

Most actions require fulfillment to extend an agent’s capabilities and perform tasks such as returning information from a database, implementing game logic or placing orders for the customer.

You can implement a simple webhook inside Dialogflow by utilizing the Inline Editor. More complicated actions benefit from a different approach.

Setting up the Local Development Environment

Due to limitations of the Inline Editor, it’s best to develop the webhook locally. To do so, set up the required libraries with NPM.

Note: NPM typically comes with Node.js. You can find instructions for how to install both here.

Installing the Firebase CLI Libraries

To install the Firebase CLI libraries, open a terminal and run the command:

npm -g install firebase-tools

Note: If the install doesn’t work, you may have to change the NPM permissions. Find more information here.

Test to see if the Firebase CLI install was successful by running the command:

firebase --version

After installing the CLI, login to Firebase by running the command:

firebase login

A web browser will launch for authorization. Log in with the Google account that you’re using for development and allow the required permissions.
Allow Firebase Login

After logging in successfully, the browser displays the following message:
Firebase Cli Login Successful

You’re ready for the next steps!

Selecting the Project in Firebase

First, find your project’s ID in Dialogflow by selecting gear icon ▸ General ▸ Google Project section ▸ Project ID. Retain this project ID for the next step.

Next, you need to set up the correct project in the Firebase CLI to deploy the local functions. In the terminal, change into the functions folder of the starter project directory using cd path/to/directory, replacing the path with wherever you downloaded the project. Then, run the following command, replacing PROJECT_ID with the project ID you collected in the previous step:

firebase use PROJECT_ID

This command tells the Firebase CLI which project is currently selected in the terminal.

Installing Additional NPM Libraries

You’ll also need an NPM library to parse the RSS feed for the podcast. Install the NPM rss-parser library with the command:

npm install --save rss-parser

Now run the command:

npm install

This installs the remaining NPM modules the functions depend on.
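rss-parser exposes a small, promise-based API, and the intent handlers later in this tutorial use it in essentially this way. Here’s a quick standalone sketch using the podcast feed URL from this tutorial:

// Quick sketch of the rss-parser API used later in this tutorial.
const Parser = require('rss-parser');

const parser = new Parser();
parser.parseURL('https://www.raywenderlich.com/feed/podcast')
  .then((feed) => {
    // feed.items holds the episodes, newest first
    console.log(feed.items[0].title);
    console.log(feed.items[0].enclosure.url); // direct link to the audio file
  })
  .catch((err) => console.error(err));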

Finally, deploy the functions to Firebase:

firebase deploy --only functions

Upon a successful deploy, copy the Function URL:
Function URL from Firebase

Implementing the Webhook

In Dialogflow under the Fulfillment tab, enable the Webhook and enter the URL copied in the last step. Finally, click Save.
Enter Function URL

Open the Welcome Intent again under Intents and notice that under the Fulfillment section the Webhook is enabled.
Welcome Intent Fulfillment Enabled

This tells the Dialogflow agent to run the code found in the Welcome Intent handler in index.js instead of using the default text responses defined in Dialogflow. Open index.js in your favorite text editor and find this code:

app.intent('Welcome Intent', (conv) => {
  conv.ask("Hello World!");  
});

The above code uses the conv.ask command to say the words Hello World! and wait for a response. Test the action in the Simulator or on your device. This time it replies “Hello World!”
Test App Local Dev Environment
Now you’ve set up your local development environment. Each time you make changes in the index.js file, save the file and run the command firebase deploy --only functions to upload and apply the changes. Then relaunch the simulator. On a device, the most current version of the action deploys automatically.
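For orientation, the rest of index.js follows the usual actions-on-google plus Cloud Functions pattern. The sketch below is an approximation rather than the exact starter file; the exported function name and the exact list of imported helpers may differ slightly in your project:

// Minimal sketch of how index.js is wired together.
const functions = require('firebase-functions');
const Parser = require('rss-parser');
// Helpers from the actions-on-google client library used in this tutorial.
const {
  dialogflow,
  SimpleResponse,
  MediaObject,
  Image,
  Confirmation,
  List,
  Suggestions,
} = require('actions-on-google');

const app = dialogflow();

// Intent handlers are registered on app by name...
app.intent('Welcome Intent', (conv) => {
  conv.ask('Hello World!');
});

// ...and app is exported as an HTTPS Cloud Function. Its URL is the one
// you pasted into the Fulfillment tab.
exports.dialogflowFirebaseFulfillment = functions.https.onRequest(app);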

Creating and Fulfilling a Custom Intent

Creating the Intent

In Dialogflow, go to Intents ▸ Create Intent or press the plus sign next to Intents. Enter the name play_the_latest_episode and add the Training Phrases shown below:
CustomIntent Add Training Phrases
Enable webhook call for this intent under Fulfillment and Save:
Enable Webhook

Fulfilling the Intent

Add this intent handler to index.js below the welcome intent:

app.intent('play_the_latest_episode', (conv) => {
  //1
  let parser = new Parser()
  //2
  return parser.parseURL(
    'https://www.raywenderlich.com/feed/podcast').then((feed) => {
    //3
    let latestCast = feed.items[0];
    //4
    conv.close(new SimpleResponse({
      speech: "Here is the latest episode",
      text: "Here is the latest episode"
    }));
    //5
    conv.close(new MediaObject({
      name: latestCast.title,
      url: latestCast['enclosure']['url'],
      description: latestCast.description,
      icon: new Image({
        url: 
          'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
        alt: 'RW Logo',
      }),
    }));
  }).catch((err) => {
    // handle errors
    console.log(err)
    conv.close("An Error Occurred Parsing the RSS Feed, Try Again Later");
  });
});

What’s going on in the code above?

  1. The code creates an RSS feed Parser.
  2. parseURL makes the request to the podcast URL and returns a Promise. You can think of this like a callback: the then block runs once the feed has been fetched.
  3. If the Promise is fulfilled, the latest item is taken from the feed and assigned to latestCast.
  4. new SimpleResponse contains speech or text to show the user.
  5. new MediaObject is populated with fields from latestCast, including the description, a URL to an icon image and the link to the audio file, retrieved using url: latestCast['enclosure']['url']. When using a MediaObject, place a SimpleResponse directly before it.

If the conversation remains open after sending a MediaObject, follow it with Suggestion Chips. Since this example uses conv.close, Suggestion Chips aren’t necessary; you’ll see how to create them later.
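For example, if a hypothetical variation of this handler kept the conversation open instead of closing it, it could send the media response with conv.ask and follow it with chips:

// Hypothetical variation: keep the conversation open after the media response.
conv.ask(new SimpleResponse({
  speech: 'Here is the latest episode',
  text: 'Here is the latest episode',
}));
conv.ask(new MediaObject({
  name: latestCast.title,
  url: latestCast['enclosure']['url'],
}));
// Suggestion chips keep the conversation moving after the media plays.
conv.ask(new Suggestions(['Play another episode', 'Goodbye']));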

If the Promise is rejected, the catch block closes the conversation with an error message.

Save index.js. Then, in the terminal, execute firebase deploy --only functions to deploy the changes.

Troubleshooting Errors in the Firebase Console

You can try to talk to your test app again, but you’ll get an error when trying to play the latest episode. To make an external network request, the Firebase project paired with the action must be on a billing tier other than the Free plan.

Log in to the Firebase Console here and select the project associated with your action. Under Development ▸ Functions ▸ Logs, you should see console.log output and error messages.

The error message indicates that making an external request requires upgrading the plan first. Select Upgrade and choose the Blaze plan. The Blaze pay-as-you-go plan does not incur a charge while testing.

Error In Firebase

Testing the Intent

Now, run the test action again. When the action welcomes you, respond with Latest episode. The latest episode of the podcast should play. Notice the Media player that displays.

Run the Action Latest Episode

Detecting Surfaces

You can detect what sort of device an action is initiated from. Replace the conv.ask('Hello World!') statement in the welcome intent in index.js with the following:

//1
if (!conv.surface.capabilities
  .has('actions.capability.SCREEN_OUTPUT')) {
  if(!conv.surface.capabilities
    .has('actions.capability.MEDIA_RESPONSE_AUDIO')) {
    conv.close("So sorry, this device does not support media playback!");
    return;
  }
  //2
  else
  {
    conv.ask("Greetings, Learner!");
    //3
    conv.ask(new Confirmation(
      "Would you like to play the latest episode?"));
  }
}

Here are some things to notice about the code above:

  1. If the current conversation has neither actions.capability.SCREEN_OUTPUT nor actions.capability.MEDIA_RESPONSE_AUDIO, the device doesn’t support media playback.
  2. The device supports audio but has no screen. In that case, ask the user whether or not to play the latest episode. This makes the action more convenient on a hands-free, screen-free surface that relies on voice control. You’ll find a structural sketch of the finished handler after this list.
  3. new Confirmation is a Helper that asks for a yes or no response.
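Here’s a rough structural sketch of how the welcome intent handler will look once you’ve added both this speaker branch and the screen branch from the Using an Option List section below. The comments stand in for the code you add in each place:

app.intent('Welcome Intent', (conv) => {
  if (!conv.surface.capabilities.has('actions.capability.SCREEN_OUTPUT')) {
    // No screen: either close the conversation (no audio support) or ask
    // the Confirmation question shown above.
  } else {
    // Screen available: show the List of recent episodes you'll build in
    // the Using an Option List section.
  }
});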

Note: The play_latest_episode_confirmation Intent in Dialogflow has an actions_intent_confirmation Event. An Event is another way to trigger an intent with predefined values such as yes or no.

Find the play_latest_episode_confirmation intent in the Dialogflow Intents list. Enable the webhook call for this intent and Save the intent.

Handling a Confirmation Event

Since you created a Confirmation above, you need to handle the response. Add the code shown below to index.js:

app.intent('play_latest_episode_confirmation',
(conv, input, confirmation) => {
  // 1
  if (confirmation) {
    let parser = new Parser()
    return parser.parseURL(
      'https://www.raywenderlich.com/feed/podcast').then((feed) => {
      let latestCast = feed.items[0];
      conv.close(new SimpleResponse({
        speech: "Here is the latest episode",
        text: "Here is the latest episode"
      }));
      conv.close(new MediaObject({
        name: latestCast.title,
        url: latestCast['enclosure']['url'],
        description: latestCast.description,
        icon: new Image({
          url:
          'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
          alt: 'RW Logo',
        }),
      }));
    }).catch((err) => {
      // handle errors
      console.log(err)
      conv.close("An Error Occurred Parsing the RSS Feed, Try Again Later");
  });
  //2
  } else {
    conv.ask("You can say 'play latest episode' or 'play an episode about a subject' such as Kotlin or iOS.");
  }
})

Here’s what’s happening above:

  1. If the confirmation parameter is true, the user confirmed they want to play the podcast. You then fetch and play the episode the same way you do in the play_the_latest_episode intent.
  2. If not, the user doesn’t want to play the podcast. Prompt the user to trigger other intents.

Save index.js. Then, in the terminal, execute firebase deploy --only functions to deploy the changes. Open the test app in the Simulator.

Select Speaker as the Surface and walk through the conversation to play the latest episode. Also try saying “no” when prompted:
Run the Action Speaker Simulator Latest Episode

Using an Option List

You can also set up an option list to give the user some options for what they can do. Add the following else block after the outer if block in the welcome intent:

else {
    let parser = new Parser()
    return parser.parseURL(
      'https://www.raywenderlich.com/feed/podcast').then((feed) => {
      let cast1 = feed.items[0];
      let cast2 = feed.items[1];
      let cast3 = feed.items[2];
      conv.ask("Here are the three latest episodes!");
      conv.ask(new List({
        title: 'Latest Episodes',
        items: {
          [EPISODE_ONE]: {
            synonyms: ['Podcast 1'],
            title: cast1.title,
            description: cast1.description,
            image: new Image({
              url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
              alt: 'RW Podcast Logo',
            }),
          },
          [EPISODE_TWO]: {
            synonyms: ['Podcast 2'],
            title: cast2.title,
            description: cast2.description,
            image: new Image({
              url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
              alt: 'RW Podcast Logo',
            }),
          },
          [EPISODE_THREE]: {
            synonyms: ['Podcast 3'],
            title: cast3.title,
            description: cast3.description,
            image: new Image({
              url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
              alt: 'RW Podcast Logo',
            }),
          },
        },
      }));
      conv.ask(new Suggestions([
        'Latest episode', 
        'Episode about Kotlin',
        'Episode about iOS']));
    }).catch((err) => {
      console.log(err)
      conv.close("An Error Occurred Parsing the RSS Feed, Try Again Later");
    });
  }

The code above creates a list that displays the three most recent podcasts. It then uses Suggestions to provide some suggestion chips.
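The option keys EPISODE_ONE, EPISODE_TWO and EPISODE_THREE are plain string constants that identify which list item the user tapped. They’re assumed to be defined near the top of index.js in the starter project; if yours doesn’t have them, something like this works:

// Option keys used to identify list selections in get_episode_option.
const EPISODE_ONE = 'EPISODE_ONE';
const EPISODE_TWO = 'EPISODE_TWO';
const EPISODE_THREE = 'EPISODE_THREE';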

get_episode_option, which you’ll implement next, handles the event fired when the user selects a podcast from the list. The user may also tap one of the suggestion chips, which you’re required to provide after a list when the conversation stays open.

Note: In Dialogflow, be sure to add actions_intent_option as an Event on the get_episode_option intent and enable the webhook.

Add the following code to handle the get_episode_option intent:

app.intent('get_episode_option', (conv, input, option) => {
    let parser = new Parser()
    return parser.parseURL(
      'https://www.raywenderlich.com/feed/podcast').then((feed) => {
    let cast1 = feed.items[0];
    let cast2 = feed.items[1];
    let cast3 = feed.items[2];
    if(option === EPISODE_THREE){
      conv.ask('Here is the third episode from the list');
      conv.close(new MediaObject({
        name: cast3.title,
        url: cast3['enclosure']['url'],
        description: cast3.description,
        icon: new Image({
          url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
          alt: 'RW Logo',
        }),
      }));
    }
    else if(option === EPISODE_TWO){
      conv.ask('Here is the 2nd episode from the list');
      conv.close(new MediaObject({
        name: cast2.title,
        url: cast2['enclosure']['url'],
        description: cast2.description,
        icon: new Image({
          url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
          alt: 'RW Logo',
        }),
      }));
    }
    else{
      conv.close(new SimpleResponse({
          speech: "Here is the first episode from the list",
          text: "Here is the first episode from the list"
      }));
      conv.close(new MediaObject({
        name: cast1.title,
        url: cast1['enclosure']['url'],
        description: cast1.description,
        icon: new Image({
          url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
          alt: 'RW Logo',
        }),
      }));
     }
  }).catch((err) => {
     // handle errors
     console.log(err)
     conv.close("An Error Occurred Parsing the RSS Feed, Try Again Later");
  });
});

In the code above, the value of the option parameter is checked so the correct MediaObject can be created. The handler then passes the MediaObject to the conv object as the response.
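As a design note, the three branches build nearly identical responses. A small helper, hypothetical and not part of the starter project, could remove that duplication:

// Hypothetical helper that builds the media response for any episode.
function playEpisode(conv, cast, intro) {
  conv.ask(intro);
  conv.close(new MediaObject({
    name: cast.title,
    url: cast['enclosure']['url'],
    description: cast.description,
    icon: new Image({
      url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
      alt: 'RW Logo',
    }),
  }));
}

// Each branch of get_episode_option then reduces to a call like:
// playEpisode(conv, cast2, 'Here is the second episode from the list');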

Save and deploy the changes. Then, test the app on a phone to see the list.
List On Screen Surface

When the user selects a podcast from the list, it plays. Nice, right?
Selected Podcast From List

Entities and Parameters

A Training Phrase often contains useful data, such as words or phrases that specify a quantity or date. You use Parameters to represent this information, and each parameter has an Entity type. Entities identify and extract useful data from natural language input in Dialogflow.

Entities

Entities extract information such as date, time, color, ordinal number and unit.

The Entity Type defines the type of extracted data. Each parameter will have a corresponding Entity.

For each type of entity, there can be many Entity Entries. Entries are a set of equivalent words and phrases. Sample entity entries for the subject of a technical podcast might be iOS, iPhone, and iPad.

System Entities are built in to Dialogflow. Some system entities include date, time, airports, music artists and colors.

For a generic entity type, use sys.any, which represents any type of information. If the system entities are not specific enough, define your own Developer Entities.

Below is an example of a developer entity defined to represent the subject of a technical podcast.

Podcast Subject Entity

You can see the @Subject Entity created when you uploaded the sample project by selecting Entities in the left pane.

Action and Parameters

When part of a training phrase is recognized as an entity, Dialogflow highlights it and designates it as a parameter, which then appears in a table below the training phrases.

Dialogflow automatically identifies and tags some more common types of system entities as parameters.

In the example below, Dialogflow identifies the time and date parameters and assigns them the correct type.

Training Phrases with Actions and Parameters
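On the webhook side, those parameters arrive as the second argument to the intent handler, where you can destructure them by name. The handler below is purely hypothetical, for illustration; you’ll wire up a real handler that uses the Subject parameter shortly:

// Hypothetical handler for an intent with date and time parameters.
// The destructured names match the parameter names defined in Dialogflow.
app.intent('schedule_reminder', (conv, {date, time}) => {
  conv.ask(`OK, I'll remind you on ${date} at ${time}.`);
});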

You can see an example using podcast subjects by going to Intents ▸ play_an_episode_about.

In Dialogflow, the Action and Parameters section is below the training phrases of an intent.

Parameters have several attributes:

  • A Required checkbox to determine whether the parameter is necessary.
  • The Parameter Name identifies the parameter.
  • The Entity or type of the parameter.
  • The Value, which you use to reference the parameter in responses.

Actions and Parameters in Dialogflow

Parameters can appear as part of a text response:
Parameters in text response

Using an Entity and a Parameter

Time to try using the Entities and Parameters you learned about! In Dialogflow, enable the webhook on the play_an_episode_about intent. Then add an intent handler to index.js with the following code:

app.intent('play_an_episode_about', (conv, {Subject}) => {
  let parser = new Parser()
  return parser.parseURL(
    'https://www.raywenderlich.com/feed/podcast').then((feed) => {
    let latestCast = feed.items[0];
    console.log('subject ' + Subject);
    for(let i = 0; i < feed.items.length; i++){
      let entry = feed.items[i];
      if(entry.title.toUpperCase().indexOf(Subject.toUpperCase()) > -1){
        latestCast = entry;
        break;
      }
    }
    conv.ask(latestCast.title);
    conv.close(new MediaObject({
      name: latestCast.title,
      url: latestCast['enclosure']['url'],
      description: latestCast.description,
      icon: new Image({
        url: 'https://koenig-media.raywenderlich.com/uploads/2016/02/Logo.png',
        alt: 'RW Logo',
      }),
    }));
  }).catch((err) => {
    console.log(err)
    conv.close("An Error Occurred Parsing the RSS Feed, Try Again Later");
  });
});

The code above searches the feed for an episode matching the requested subject. Here’s how:

  • The handler receives the value of the Subject parameter as part of its second argument.
  • The parameter’s value is extracted from the user’s utterance by the Subject entity defined in Dialogflow.
  • The code walks the feed items and keeps the first episode whose title contains the Subject keyword, falling back to the latest episode if nothing matches.

The matching podcast is then passed into the MediaObject and played.

Selected Podcast By Subject Suggestion Chip

Where to Go From Here?

Wow, that was a lot of work! You’re awesome!

In this tutorial, you created a Google Action in the Actions Console and used Dialogflow to design a conversation for that action. You provided fulfillment for the action by implementing a locally developed webhook in Node.js, using Promises.

Get the final project by clicking the Download Materials button at the top or bottom of this tutorial. If you want to keep moving forward with this, here are some suggestions:

  • You can learn more about publishing your action here.
  • Within 24 hours of publishing your action, the Analytics section of the Actions Console will display the collected data. Use the Analytics section to perform a health check and get information about the usage, health and discovery of the action. Learn more here.
  • Read more about the NPM rss-parser package used in this tutorial here.
  • Want to go even deeper into Google Assistant? Google has an extensive Conversation Design Guide. Find Google Codelabs about the Assistant here and a few Google samples here. Google also has complete documentation.

If you have any questions or comments, please join the forum discussion below!
