Dialogue management#

ARI uses the RASA chatbot engine for natural language processing. It is an open-source chatbot framework based on machine learning.

This documentation focuses on how the chatbot works on ARI, and on how to start, stop and create chatbots. Note that there are some differences with respect to the regular RASA platform.

At a high level, and as described in the Speech and Language overview, the chatbot subscribes to the /humans/voices/*/speech topic for speech input, sends it to the RASA server, analyses it using the currently active language, and generates outputs, either as ROS Intents or directly through text-to-speech by calling the /tts ROS action.

Note

As of pal-sdk-23.1, ARI does not perform speaker voice separation. As such, only one speech topic is available, /humans/voices/anonymous_speaker/speech, and multi-party dialogue is not available off the shelf.

Voice separation might be added in future SDK releases.
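To make the topic naming concrete, here is a minimal Python sketch that builds and parses speech topic names following the /humans/voices/*/speech pattern described above. The helpers speech_topic and voice_id_from_topic are hypothetical names introduced for this example and are not part of the PAL SDK. The sketch uses only the standard library and does not connect to ROS.

```python
import re
from typing import Optional

# Pattern taken from the text: /humans/voices/<voice id>/speech
_SPEECH_TOPIC = re.compile(r"/humans/voices/([^/]+)/speech")


def speech_topic(voice_id: str = "anonymous_speaker") -> str:
    """Build the speech topic name for a voice id (hypothetical helper)."""
    return f"/humans/voices/{voice_id}/speech"


def voice_id_from_topic(topic: str) -> Optional[str]:
    """Return the voice id if `topic` is a speech topic, otherwise None."""
    match = _SPEECH_TOPIC.fullmatch(topic)
    return match.group(1) if match else None
```

Without voice separation, voice_id_from_topic would only ever return anonymous_speaker for the single available topic.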

In this section we will take a brief tour of the ROS interfaces to manage the chatbots and how the RASA model is structured so you can then build your own chatbots for the robot.

Important

Follow Create an application with pal_app to quickly generate a robot application template that also includes a custom chatbot template.

We highly encourage you to start from that template to create your own chatbots.

More advanced usage such as using RASA actions is covered in triggering-actions-from-rasa.

Note

Other chatbot modules, e.g. DialogFlow from Google, can be used instead. For such requests, contact PAL Robotics.

ROS interfaces of RASA#

Please refer to Chatbot/Dialogue management API page for the list of ROS actions, services and topics.

The robot comes with pre-configured chitchat chatbot models for a set of languages that are stored in ~/.pal/chatbots-enabled.

There is one sub-directory per language. For instance, the English model is located at ~/.pal/chatbots-enabled/en_US. You can modify or add new chatbots there.

The RASA-specific training parameters (such as the choice of intent classifier) are configured in ~/.pal/chatbot_cfg/rasa-chatbots/<lang tag>/config.yml.
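As a small illustration of this layout, the following Python sketch builds the two paths mentioned above for a given language tag. The function names are hypothetical, and the optional home argument is an assumption added so the helpers can be tried on any machine, not only on the robot.

```python
from pathlib import Path
from typing import Optional


def chatbot_dir(lang_tag: str, home: Optional[Path] = None) -> Path:
    """Directory holding the chatbots of one language, e.g. en_US."""
    base = home if home is not None else Path.home()
    return base / ".pal" / "chatbots-enabled" / lang_tag


def rasa_config_path(lang_tag: str, home: Optional[Path] = None) -> Path:
    """Location of the RASA training configuration for one language."""
    base = home if home is not None else Path.home()
    return base / ".pal" / "chatbot_cfg" / "rasa-chatbots" / lang_tag / "config.yml"
```

For example, chatbot_dir("en_US") resolves to the English chatbot directory of the current user.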

Basic concepts of RASA#

There are 4 key concepts to know when talking about chatbot design with RASA:

Intent: What is the user intending to ask about?

Caution

Chatbot intents should not be confused with the system-wide robot Intents. The chatbot intents are solely extracted from input text. They are internal to the chatbot, and are not visible outside of it.

The ‘system’ intents are published on the /intents topic, and are multi-modal (for instance, if a person tries to engage with the robot by getting closer, an ENGAGE_WITH intent will be published on the /intents topic).

Note however that a ‘system’ intent can be triggered from a ‘chatbot’ intent: for instance, if a user tells the robot “I want you to go to the kitchen”, the chatbot might recognise a go_to intent in the sentence, which will trigger the publication of a MOVE_TO intent on the /intents topic.

Entity: What are the important pieces of information in the user’s query?
Story: What are the possible ways the conversation can go?
Action: What action should the robot take upon a specific request?
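The relation between these concepts, and the caution above about chatbot versus system intents, can be sketched in a few lines of Python. Everything here is illustrative: the ParsedMessage class, the location entity name and the CHATBOT_TO_SYSTEM_INTENT table are assumptions made for the example, not RASA or PAL SDK APIs. Only the go_to and MOVE_TO names come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParsedMessage:
    """What a chatbot extracts from one sentence (hypothetical structure)."""
    text: str
    intent: str  # chatbot intent, internal to the chatbot
    entities: Dict[str, str] = field(default_factory=dict)


# Assumed mapping from chatbot intents to system-wide intents.
CHATBOT_TO_SYSTEM_INTENT = {"go_to": "MOVE_TO"}


def system_intent_for(message: ParsedMessage) -> Optional[str]:
    """Return the system intent that would be published on /intents,
    or None when the chatbot intent stays internal (e.g. a greeting)."""
    return CHATBOT_TO_SYSTEM_INTENT.get(message.intent)


parsed = ParsedMessage(
    text="I want you to go to the kitchen",
    intent="go_to",
    entities={"location": "kitchen"},  # hypothetical entity name
)
```

Here system_intent_for(parsed) yields MOVE_TO, while a message carrying only a greet intent yields None and never leaves the chatbot.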

RASA has two main modules:

RASA NLU for understanding user messages. It detects Intents and Entities in your messages, and can use different components to recognize them, such as spaCy or TensorFlow.
RASA Core for holding conversations and deciding what to do next. It can predict the next reply in the dialogue and can also trigger Actions.

RASA model structure#

The structure of a RASA chatbot is fully described in the RASA documentation.

Take the example English chatbot located at ~/.pal/chatbots-enabled/en_US/.

You will find one ‘sub-chatbot’ per skill or feature you want to implement:

ari_chitchat: chitchat questions such as Hello, What is your name?, What is the weather?
ari_diagnostics: questions regarding the robot status, such as What is your battery level?, Are you charging?

For a full list of what the robot understands by default refer to the last section of this chapter.

Users can create new sub-chatbots following the same data structure: when the robot is switched on, it automatically concatenates them all and trains a model for the default language.
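To check what would be picked up from a language directory, a short Python sketch can list its sub-chatbots and their YAML files. This only mirrors the directory layout described above. How the robot actually merges the sub-chatbots is not covered here, and list_sub_chatbots is a hypothetical helper. The demo runs on a throwaway directory with stand-in files.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def list_sub_chatbots(language_dir: Path) -> Dict[str, List[str]]:
    """Map each sub-chatbot directory name to the sorted .yml files it holds."""
    result: Dict[str, List[str]] = {}
    if not language_dir.is_dir():
        return result
    for entry in sorted(language_dir.iterdir()):
        if entry.is_dir():
            result[entry.name] = sorted(p.name for p in entry.glob("*.yml"))
    return result


# Demo on a temporary directory standing in for ~/.pal/chatbots-enabled/en_US.
with tempfile.TemporaryDirectory() as tmp:
    language_dir = Path(tmp) / "en_US"
    for name in ("ari_chitchat", "ari_diagnostics"):
        (language_dir / name).mkdir(parents=True)
        (language_dir / name / "nlu.yml").write_text("nlu: []\n", encoding="utf-8")
    (language_dir / "ari_chitchat" / "rules.yml").write_text("rules: []\n", encoding="utf-8")
    found = list_sub_chatbots(language_dir)

print(found)
```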

Taking ari_chitchat as example, we have:

actions.py: code for custom actions

nlu.yml: the NLU training data, where you define Intents, as well as related sentences the robot should recognize to match the Intent.

nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon

domain.yml: the chatbot’s domain. It lists the Intents the chatbot can detect, declared at the beginning of the file, together with the replies it can give. If you have created custom actions, they must also be declared here.

intents:
- greet

responses:
  utter_greet:
  - text: Hi!
  - text: Hey, I am ARI
  - text: what's up

actions:
  - action_get_weather

rules.yml: optionally, describes short pieces of conversations that should always follow the same path.

In this example, we want the robot to greet the user by triggering the utter_greet response defined in the domain every time it detects the greet Intent, for instance when it hears Hello.

rules:
- rule: Greeting Rule
  steps:
  - intent: greet
  - action: utter_greet

stories.yml: optionally, the chatbot may contain stories, which define the flow of the conversation. The robot’s default chatbots, for instance, do not contain such a file. See Stories for more information.

Note that the chatbot should always have at least a rules.yml or stories.yml file in order to train.
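The file layout above can be scaffolded with a short Python sketch that writes a minimal sub-chatbot and checks the rules.yml or stories.yml requirement. The file contents reuse the greet example from this page. The helpers create_sub_chatbot and is_trainable and the my_greeter name are hypothetical, and the demo writes to a temporary directory rather than to the robot.

```python
import tempfile
from pathlib import Path

NLU = """\
nlu:
- intent: greet
  examples: |
    - hello
    - hi
"""

DOMAIN = """\
intents:
- greet

responses:
  utter_greet:
  - text: Hi!
"""

RULES = """\
rules:
- rule: Greeting Rule
  steps:
  - intent: greet
  - action: utter_greet
"""


def create_sub_chatbot(language_dir: Path, name: str) -> Path:
    """Write a minimal sub-chatbot (nlu, domain, rules) and return its directory."""
    target = language_dir / name
    target.mkdir(parents=True, exist_ok=False)
    (target / "nlu.yml").write_text(NLU, encoding="utf-8")
    (target / "domain.yml").write_text(DOMAIN, encoding="utf-8")
    (target / "rules.yml").write_text(RULES, encoding="utf-8")
    return target


def is_trainable(sub_chatbot: Path) -> bool:
    """True if the directory has at least a rules.yml or a stories.yml file."""
    return any((sub_chatbot / f).is_file() for f in ("rules.yml", "stories.yml"))


# Demo in a throwaway directory; on the robot the parent would be the
# language directory, for example ~/.pal/chatbots-enabled/en_US.
with tempfile.TemporaryDirectory() as tmp:
    created = create_sub_chatbot(Path(tmp), "my_greeter")  # hypothetical name
    trainable = is_trainable(created)
    (created / "rules.yml").unlink()
    trainable_without_rules = is_trainable(created)
```

After the rules.yml file is removed, is_trainable reports False, which matches the requirement stated above.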

Trained models#

Before being available, chatbots must be trained. Use /train_chatbot to train a chatbot, or to re-train it after a modification.

Once trained, RASA creates a file models/<lang_code>.tar.gz, stored in ~/.pal/chatbot_cfg/rasa-chatbots/<lang_code>/models/.
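Since a chatbot has to be re-trained after a modification, it can be handy to check whether the model archive is older than the training data. The sketch below is one possible way to do that by comparing file modification times. needs_training is a hypothetical helper, not an SDK function, and the demo uses stand-in files in a temporary directory.

```python
import os
import tempfile
from pathlib import Path


def needs_training(model: Path, data_dir: Path) -> bool:
    """True if no model exists, or if any .yml or .py file under data_dir
    was modified after the model archive was written."""
    if not model.is_file():
        return True
    model_mtime = model.stat().st_mtime
    sources = [p for p in data_dir.rglob("*") if p.suffix in {".yml", ".py"}]
    return any(p.stat().st_mtime > model_mtime for p in sources)


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "en_US"
    data_dir.mkdir()
    nlu = data_dir / "nlu.yml"
    nlu.write_text("nlu: []\n", encoding="utf-8")
    model = Path(tmp) / "en_US.tar.gz"  # stand-in for <lang_code>.tar.gz

    before_training = needs_training(model, data_dir)  # no model yet
    model.write_bytes(b"")                             # pretend training happened
    os.utime(nlu, (1_000, 1_000))                      # data older than the model
    os.utime(model, (2_000, 2_000))
    after_training = needs_training(model, data_dir)
    os.utime(nlu, (3_000, 3_000))                      # data edited afterwards
    after_edit = needs_training(model, data_dir)
```

On the robot, the model argument would point at the archive in the models directory given above and data_dir at the language directory of the chatbot.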