Dialogue management#

ARI uses the RASA chatbot engine for natural language processing. It is an open-source chatbot framework based on machine learning.

This documentation focuses on how the chatbot works on ARI, how to start or stop different chatbots, and how to create new ones. Note that there are some differences and particularities with respect to the regular RASA platform.

At a high level, and as described in the Speech and Language overview, the chatbot subscribes to the /humans/voices/*/speech topic for speech input, sends the input to the RASA server, which analyses it using the currently active language, and generates outputs, either as ROS Intents or directly as speech by calling the /tts ROS action.

Note

As of pal-sdk-23.1, your robot does not perform speaker voice separation. As such, only one speech topic is available, named /humans/voices/anonymous_speaker/speech, and multi-party dialogue is not available off the shelf.

Voice separation might be added to future SDK releases.
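The topic naming pattern described above can be sketched as a small helper. This is only an illustration: speech_topic and ANONYMOUS_SPEAKER are hypothetical names, not part of the PAL SDK.

```python
# Hypothetical helper (not part of the PAL SDK): builds the speech topic
# name for a given voice ID, following the /humans/voices/<id>/speech pattern.
ANONYMOUS_SPEAKER = "anonymous_speaker"  # the only voice ID as of pal-sdk-23.1


def speech_topic(voice_id: str = ANONYMOUS_SPEAKER) -> str:
    """Return the ROS topic on which speech from `voice_id` is published."""
    if not voice_id or "/" in voice_id:
        raise ValueError(f"invalid voice ID: {voice_id!r}")
    return f"/humans/voices/{voice_id}/speech"


print(speech_topic())
```

Keeping the voice ID as a parameter means the same helper would still apply if voice separation, and therefore several voice IDs, became available.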

In this section we will take a brief tour of the ROS interfaces to manage the chatbots and how the RASA model is structured so you can then build your own chatbots for the robot.

More advanced usage such as using RASA actions is covered in triggering-actions-from-rasa.

Note

You can optionally use alternative chatbot engines, e.g. DialogFlow from Google. For such requests contact PAL Robotics.

Basic concepts of RASA#

There are 4 key concepts to know when talking about chatbot design with RASA:

  • Intent: What is the user intending to ask about?

Caution

Chatbot intents should not be confused with the system-wide robot Intents. Chatbot intents are extracted solely from input text. They are internal to the chatbot and are not visible outside of it.

The ‘system’ intents are published on the /intents topic, and are multi-modal (for instance, if a person tries to engage with the robot by getting closer, an ENGAGE_WITH intent will be published on the /intents topic).

Note however that a ‘system’ intent can be triggered from a ‘chatbot’ intent: for instance, if a user tells the robot “I want you to go to the kitchen”, the chatbot might recognise a go_to intent in the sentence, which will trigger the publication of a MOVE_TO intent on the /intents topic.

  • Entity: What are the important pieces of information in the user’s query?

  • Stories or Rules: What are the possible ways the conversation can go?

  • Action: What action should the robot take upon a specific request?
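The distinction between chatbot intents and system intents made in the caution above can be sketched as a lookup table. The mapping below is hypothetical and only contains the go_to / MOVE_TO pair mentioned in the text; on the robot, the association is made by the chatbot's actions, not by a table like this one.

```python
from typing import Optional

# Hypothetical mapping, for illustration only: on the robot, the link between
# chatbot intents and system-wide intents is made by the chatbot's actions.
CHATBOT_TO_SYSTEM_INTENT = {
    "go_to": "MOVE_TO",
}


def system_intent_for(chatbot_intent: str) -> Optional[str]:
    """Return the system intent to publish on /intents, or None.

    Chatbot intents without an entry stay internal to the chatbot
    (e.g. small talk that is answered directly through text-to-speech).
    """
    return CHATBOT_TO_SYSTEM_INTENT.get(chatbot_intent)
```

Here system_intent_for("go_to") yields a system intent name, whereas an intent such as greet has no entry and stays internal to the chatbot.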

RASA has two main modules:

  1. RASA NLU for understanding user messages. It detects Intents and Entities in your messages and may use different components to recognise them, such as spaCy for pretrained embeddings or TensorFlow for supervised embeddings.

  2. RASA Core for holding conversations and deciding what to do next. It can predict the next reply in the dialogue and can trigger actions as well.
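To make the role of the NLU module more concrete, here is a minimal sketch of how its output could be consumed. The dictionary layout (an intent with a name and a confidence, plus a list of entities) follows what the standard RASA platform returns when parsing a message; the sample itself is hand-written, and the location entity name, the confidence value and the summarise helper are assumptions for illustration.

```python
# Sample NLU output, hand-written for illustration (not produced by a real
# model). The "location" entity name and the confidence value are assumptions.
sample_parse = {
    "text": "move to the kitchen",
    "intent": {"name": "go_to", "confidence": 0.93},
    "entities": [
        {"entity": "location", "value": "kitchen", "start": 12, "end": 19},
    ],
}


def summarise(parse: dict, threshold: float = 0.5):
    """Return (intent_name, {entity: value}).

    The intent name is None when its confidence is below `threshold`.
    """
    intent = parse.get("intent") or {}
    confident = intent.get("confidence", 0.0) >= threshold
    name = intent.get("name") if confident else None
    entities = {e["entity"]: e["value"] for e in parse.get("entities", [])}
    return name, entities


print(summarise(sample_parse))
```

RASA Core would then use the detected intent and entities to decide on the next action; the threshold here is only a stand-in for that decision logic.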

Chatbot domain configuration#

Configuration files#

On the robot, the RASA chatbot engine accesses the following files, under ~/.pal/:

chatbot_cfg/
├── rasa-chatbots/en_GB/
│   ├── credentials.yml
│   ├── config.yml
│   ├── endpoints.yml
│   ├── data/ -> chatbots-enabled/en_GB
│   └── model/en_GB.tar.gz
└── rasa-chatbots/xx_XX
    └── ...

chatbots-enabled/
├── en_GB/
│   ├── basic_commands/
│   ├── chitchat/
│   ├── diagnostics/
│   ├── pretrained_model/
│   └── actions_i8n.yml
├── xx_XX/
│   └── ...
└── common
    ├── action_get_weather.py
    └── action_get_ip.py

Note

Most of these files correspond to standard RASA configuration files. The full description of each of these files is available in the RASA documentation.

PAL robots come with three pre-configured chatbot domains (see below for details):

  • chitchat: smalltalk with the robot;

  • diagnostics: diagnostics-related queries;

  • basic_commands: recognise basic commands and translate them into ROS intents, published on the /intents topic.

Depending on the languages available on your robot, each of these domains is visible under ~/.pal/chatbots-enabled/xx_XX, where xx_XX is the locale code (e.g. en_GB).
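As a quick way to check which locales and domains are present, the directory layout above can be walked with a few lines of Python. This is a sketch, not a PAL tool: list_chatbots is a hypothetical helper, and it simply assumes that every sub-directory of a locale directory is a sub-chatbot.

```python
from pathlib import Path


def list_chatbots(root: Path) -> dict:
    """Map each locale directory under `root` to its sorted sub-directory names.

    The `common` directory holds shared actions rather than a locale,
    so it is skipped.
    """
    result = {}
    for locale_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if locale_dir.name == "common":
            continue
        result[locale_dir.name] = sorted(
            d.name for d in locale_dir.iterdir() if d.is_dir()
        )
    return result


if __name__ == "__main__":
    root = Path.home() / ".pal" / "chatbots-enabled"
    if root.is_dir():  # only meaningful when run on the robot
        print(list_chatbots(root))
```

Note that every sub-directory is reported, so pretrained_model/ from the tree above would appear alongside the three domains.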

Note

If you cannot find the directory for the language you wish to use, that language is not available. Contact us if you need support for additional languages. You can also check Tutorial: create, translate or update a chatbot to learn how to modify or translate a chatbot yourself.

Additional (language-independent) RASA training parameters (like the choice of an intent classifier, etc.) are configured in ~/.pal/chatbot_cfg/rasa-chatbots/xx_XX/:

  • config.yml: configuration of your NLU and Core models. If you use TensorFlow or spaCy, you need to define the corresponding pipeline here. Editing this file requires some knowledge of machine learning and deep learning.

  • credentials.yml: details for connecting to other services. For example, if you wish to integrate with external software like Facebook Messenger, you can add its credentials here. The pre-trained chatbots on your robot only expose RASA as an endpoint here.

rasa:
  url: "http://localhost:5002/api"

  • endpoints.yml: details for the configuration settings that define how the different components of a Rasa chatbot interact with each other and with external systems. The pre-trained chatbots on the robot only require the action server endpoint to be configured.

action_endpoint:
  url: "http://localhost:5055/webhook"

These three files should generally be identical across chatbots. Modify config.yml only if you wish to apply more advanced chatbot techniques.
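To illustrate what the action_endpoint setting is used for, the sketch below builds (without sending) the kind of HTTP request that is posted to that URL when a custom action must run. The payload fields follow the action server protocol of the standard RASA platform; whether the robot's own action server accepts exactly this payload is an assumption, and build_action_request is a hypothetical helper.

```python
import json
import urllib.request

ACTION_ENDPOINT = "http://localhost:5055/webhook"  # value from endpoints.yml


def build_action_request(action_name: str, sender_id: str = "default"):
    """Build, without sending, a POST request asking to run `action_name`.

    The payload layout mirrors the standard RASA action server protocol;
    it is an assumption that the robot's action server expects the same.
    """
    payload = {
        "next_action": action_name,
        "sender_id": sender_id,
        "tracker": {"sender_id": sender_id, "slots": {}, "events": []},
        "domain": {},
    }
    return urllib.request.Request(
        ACTION_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_action_request("action_get_weather")
print(request.get_method(), request.full_url)
```

In normal operation RASA issues this request itself; building one by hand is mainly useful for understanding or debugging the link between the chatbot and its actions.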

Default chatbots#

On the other hand, in the ~/.pal/chatbots-enabled/en_GB directory, you will find one ‘sub-chatbot’ per skill or feature that has been implemented. By default:

  • chitchat: chatbot feature to have small talks with the user, e.g. “how are you doing?”, “what is your name?” (see Default chit-chat/smalltalk capabilities of ARI for the complete list);

  • diagnostics: chatbot feature to answer questions regarding the status of the robot, e.g. “are you charging?”, “what is your IP?”, “what languages can you speak?”;

  • basic_commands: chatbot feature to answer simple commands requested by the user, e.g. “bring me the bottle”, “move to the kitchen”, and “show me more about your language capabilities”. The commands are recognised as ROS intents and published on the /intents topic.

Note

As of pal-sdk-23.1, only the “show me some content…” family of basic commands is recognised by default by the chatbot. These commands are used, e.g., for verbal navigation in the robot welcome demo.

Chatbot actions#

An essential part of the chatbot is the definition of its actions. In RASA, it is possible to define custom actions as responses to specific intents (they are listed at the bottom of the chatbot’s domain.yml file).

We provide a set of custom actions, specific to the robot, in the folder: ~/.pal/chatbots-enabled/common. Each action is a short Python script, automatically discovered and loaded by the rasa_action_server node.

Note

We do not use RASA’s default action server. Our own rasa_action_server is lighter and allows dynamic loading of custom Python actions.
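The idea of dynamically discovering and loading Python action files can be sketched with the standard importlib module. This is not the implementation of rasa_action_server, whose discovery mechanism is not described here: load_action_modules is a hypothetical helper, and the action_*.py file name filter is an assumption based on the two file names shown in the directory tree.

```python
import importlib.util
from pathlib import Path


def load_action_modules(directory: Path) -> dict:
    """Import every action_*.py file in `directory`.

    Returns a dict mapping the module name (file name without .py)
    to the loaded module object.
    """
    modules = {}
    for path in sorted(directory.glob("action_*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file's top-level code
        modules[path.stem] = module
    return modules
```

With this approach, dropping a new action file into the directory is enough for it to be picked up the next time the loader runs, with no registration step.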

For a more in-depth explanation of how to trigger a custom action, please have a look at triggering-actions-from-rasa.

In each ~/.pal/chatbots-enabled/xx_XX directory, you will find an actions_i8n.yaml file. This file contains the translations of our custom actions for that language.

Chatbot example#

To get an idea of what a chatbot model looks like, let’s consider the chitchat chatbot as an example. It contains:

  • domain.yml: the chatbot’s domain. It declares, at the top, the intents that the chatbot can detect, followed by a list of replies. If you have created custom actions, they should also be declared here (e.g., action_get_weather).

intents:
- greet

responses:
  utter_greet:
  - text: Hi!
  - text: Hey, I am your robotic companion
  - text: what's up

actions:
  - action_get_weather

The action action_get_weather is implemented in action_get_weather.py, which you can find in chatbots-enabled/common.

  • nlu.yml: the NLU training data, where you define intents, as well as related sentences the robot should recognize to match the intent. For instance:

nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon

  • rules.yml: optionally, describes short pieces of conversation that should always follow the same path.

In this example, we want the robot to greet the user by triggering the utter_greet response defined in the domain every time it hears a greeting such as “Hello”, i.e. whenever the greet intent is detected.

rules:
- rule: Greeting Rule
  steps:
  - intent: greet
  - action: utter_greet

  • stories.yml: optionally, the chatbot may contain stories, which define the flow of the conversation. The robot’s default chatbots, for instance, do not contain such a file. See Stories for more information.

Note

A chatbot should always have at least a rules.yml or stories.yml file in order to train.
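How the domain.yml and rules.yml excerpts above work together can be sketched in plain Python. This is a deliberate simplification of what RASA Core does, not its implementation: the YAML content is transcribed by hand into dictionaries, and reply_for is a hypothetical helper.

```python
import random
from typing import Optional

# Data transcribed from the domain.yml and rules.yml excerpts above.
RESPONSES = {
    "utter_greet": ["Hi!", "Hey, I am your robotic companion", "what's up"],
}
RULES = {"greet": "utter_greet"}  # intent -> action, as in "Greeting Rule"


def reply_for(intent: str, rng: Optional[random.Random] = None) -> Optional[str]:
    """Apply the matching rule and pick one of the response texts at random."""
    action = RULES.get(intent)
    if action is None or action not in RESPONSES:
        return None  # no rule, or a custom action instead of a canned response
    return (rng or random).choice(RESPONSES[action])


print(reply_for("greet"))
```

Listing several texts under the same response, as utter_greet does, is what lets the robot vary its wording from one greeting to the next.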

Training models#

Chatbots must be trained before they can be used. All supported languages come with pre-trained models, so no action is required on your part.

If you want to modify or customize a chatbot, you will need to train the chatbot model yourself, which currently requires using the command line.

See Tutorial: create, translate or update a chatbot for a tutorial.

Once training completes, RASA creates a file xx_XX.tar.gz, stored in ~/.pal/chatbot_cfg/rasa-chatbots/xx_XX/models/.

Note

A single pre-trained model is produced that includes all the chatbots defined under chatbots-enabled/<lang>/.

For instance, if you have chatbots-enabled/en_GB/chitchat and chatbots-enabled/en_GB/my_custom_chatbot, the RASA training generates a single en_GB.tar.gz that includes both chatbots.
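After training, a simple sanity check is to confirm that the model archive can be opened and to look at what it contains. The sketch below uses only the Python standard library; list_model_contents is a hypothetical helper, and no assumption is made about the internal layout of the archive, which depends on the RASA version.

```python
import tarfile
from pathlib import Path


def list_model_contents(archive: Path) -> list:
    """Return the sorted member names of a trained model archive (.tar.gz)."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(tar.getnames())
```

Call it with the path to the generated xx_XX.tar.gz file; an exception from tarfile indicates a missing or corrupted archive.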

ROS interfaces to the chatbot#

Please refer to the Chatbot/Dialogue management API page for the list of ROS actions, services and topics.