Dialogue management#
ARI uses the RASA chatbot engine for natural language processing. It is an open-source chatbot framework based on machine learning.
This documentation focuses on how the chatbot works on ARI, how to start or stop the different chatbots, and how to create new ones. Note that there are some differences with respect to the regular RASA platform.
At a high level, and as described in the Speech and Language overview, the chatbot subscribes to the /humans/voices/*/speech topic for speech input and sends the recognised text to the RASA server. The server analyses it using the currently active language and generates outputs, either as ROS Intents or directly as speech by calling the /tts ROS action.
Note
As of pal-sdk-23.1, ARI does not perform speaker voice separation. As such, only one speech topic is available, named /humans/voices/anonymous_speaker/speech, and multi-party dialogue is not available off-the-shelf.
Voice separation might be added in future SDK releases.
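To illustrate the "sends it to the RASA server" step, here is a minimal sketch using the REST channel of a stock RASA server. It is an assumption that this channel is enabled and reachable at the URL below; ARI's own bridge between the speech topic and RASA may work differently. The function names build_payload and ask_rasa are hypothetical.

```python
import json
from typing import List
from urllib import request

# Assumption: a RASA server with the standard REST channel enabled,
# listening on RASA's default port. Not confirmed for ARI.
RASA_REST_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_payload(sender: str, text: str) -> bytes:
    """Encode one user utterance in the format expected by the REST channel."""
    return json.dumps({"sender": sender, "message": text}).encode("utf-8")


def ask_rasa(sender: str, text: str, url: str = RASA_REST_URL) -> List[str]:
    """Send an utterance to RASA and return the text of each reply."""
    req = request.Request(
        url,
        data=build_payload(sender, text),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=5) as resp:
        replies = json.loads(resp.read().decode("utf-8"))
    # Replies may also carry images or buttons; keep only the text ones.
    return [reply["text"] for reply in replies if "text" in reply]
```

With a server running, ask_rasa("anonymous_speaker", "hello") would return the list of text replies chosen by the chatbot.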
In this section we will take a brief tour of the ROS interfaces to manage the chatbots and how the RASA model is structured so you can then build your own chatbots for the robot.
Important
Follow Create an application with pal_app to quickly generate a robot application template that also includes a custom chatbot template.
We highly encourage you to start from that template to create your own chatbots.
More advanced usage such as using RASA actions is covered in triggering-actions-from-rasa.
Note
Other chatbot modules, e.g. DialogFlow from Google, can be used instead. For such requests, contact PAL Robotics.
ROS interfaces of RASA#
Please refer to the Chatbot/Dialogue management API page for the list of ROS actions, services and topics.
The robot comes with pre-configured chitchat chatbot models for a set of languages that are stored in ~/.pal/chatbots-enabled.
There is one sub-directory per language. For instance, the English model is located at ~/.pal/chatbots-enabled/en_US.
You can modify or add new chatbots there.
The RASA-specific training parameters (like the choice of an intent classifier, etc.) are configured in ~/.pal/chatbot_cfg/rasa-chatbots/<lang tag>/config.yml.
Basic concepts of RASA#
There are 4 key concepts to know when talking about chatbot design with RASA:
Intent: What is the user intending to ask about?
Caution
The chatbot’s intents should not be confused with the system-wide robot Intents. The chatbot intents are extracted solely from input text. They are internal to the chatbot and are not visible outside of it.
The ‘system’ intents are published on the /intents topic, and are multi-modal (for instance, if a person tries to engage with the robot by getting closer, an ENGAGE_WITH intent will be published on the /intents topic).
Note, however, that a ‘system’ intent can be triggered from a ‘chatbot’ intent: for instance, if a user tells the robot “I want you to go to the kitchen”, the chatbot might recognise a go_to intent in the sentence, which will trigger the publication of a MOVE_TO intent on the /intents topic.
Entity: What are the important pieces of information in the user’s query?
Story: What is the possible way the conversation can go?
Action: What action should the robot take upon a specific request?
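The relationship between chatbot intents and ‘system’ intents described in the caution above can be sketched as a simple lookup. This is only an illustration: the table and the to_system_intent helper are hypothetical, and only the go_to to MOVE_TO pair comes from the text. How ARI actually performs this mapping is not shown here.

```python
from typing import Dict, Optional

# Hypothetical mapping table. Only the go_to -> MOVE_TO pair is taken
# from the documentation; a real chatbot may define more entries.
CHATBOT_TO_SYSTEM_INTENT: Dict[str, str] = {
    "go_to": "MOVE_TO",
}


def to_system_intent(chatbot_intent: str) -> Optional[str]:
    """Return the system intent to publish on /intents, or None when
    the chatbot intent stays internal to the chatbot."""
    return CHATBOT_TO_SYSTEM_INTENT.get(chatbot_intent)


print(to_system_intent("go_to"))
print(to_system_intent("greet"))
```

Chatbot intents without an entry, such as greet, are handled inside the chatbot (for example with a spoken reply) and never reach the /intents topic.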
RASA has two main modules:
RASA NLU for understanding user messages. It detects Intents and Entities in your messages and can rely on different components to recognize them, such as spaCy or TensorFlow.
RASA Core for holding conversations and deciding what to do next. It predicts the next reply in the dialogue and can also trigger Actions.
RASA model structure#
The structure of a RASA chatbot is fully described in the RASA documentation.
Take the example English chatbot located at ~/.pal/chatbots-enabled/en_US/.
You will find one ‘sub-chatbot’ per skill or feature you want to implement:
ari_chitchat: chitchat questions such as Hello, What is your name?, What is the weather?
ari_diagnostics: questions regarding the robot status, such as What is your battery level?, Are you charging?
For a full list of what the robot understands by default, refer to the last section of this chapter.
Users can create new sub-chatbots following the same data structure: when the robot is switched on, it automatically concatenates them all and trains a model for the default language.
Taking ari_chitchat as an example, we have:
actions.py: code for custom actions
nlu.yml: the NLU training data, where you define Intents, as well as related sentences the robot should recognize to match the Intent.
nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon
domain.yml: the chatbot’s domain. It declares, at the beginning, the Intents that the chatbot can detect, followed by a list of replies.
If you have created custom actions, they should be declared here.
intents:
  - greet
responses:
  utter_greet:
  - text: Hi!
  - text: Hey, I am ARI
  - text: what's up
actions:
  - action_get_weather
rules.yml: optionally, describes short pieces of conversation that should always follow the same path.
In other words, in this example, we want the robot to greet the user by triggering the utter_greet response defined in the domain every time it detects a greet Intent (for instance, when it hears Hello).
rules:
- rule: Greeting Rule
  steps:
  - intent: greet
  - action: utter_greet
stories.yml: optionally, the chatbot may contain stories, which define the flow of the conversation. The default chatbots of the robot, for instance, do not contain such a file. See Stories for more information.
Note that the chatbot should always have at least a rules.yml or stories.yml file in order to train.
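The requirement above can be checked before training with a small script. The following is a minimal sketch, assuming each sub-chatbot is a directory directly under the language directory (as in ~/.pal/chatbots-enabled/en_US/). The helper name find_untrainable_subchatbots is hypothetical and not part of the PAL SDK.

```python
from pathlib import Path
from typing import List

# File names taken from the list above. A sub-chatbot needs at least
# one of them in order to train.
DIALOGUE_FILES = ("rules.yml", "stories.yml")


def find_untrainable_subchatbots(language_dir: Path) -> List[str]:
    """Return the names of sub-chatbot directories that contain
    neither rules.yml nor stories.yml."""
    missing = []
    for sub in sorted(p for p in language_dir.iterdir() if p.is_dir()):
        if not any((sub / name).is_file() for name in DIALOGUE_FILES):
            missing.append(sub.name)
    return missing
```

For example, find_untrainable_subchatbots(Path.home() / ".pal" / "chatbots-enabled" / "en_US") returns an empty list when every sub-chatbot satisfies the requirement.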
Trained models#
Before being available, chatbots must be trained. Use /train_chatbot to train (or re-train after a modification) a chatbot.
Once training completes, RASA creates a file models/<lang_code>.tar.gz, stored in ~/.pal/chatbot_cfg/rasa-chatbots/<lang_code>/models/.
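As a quick way to confirm that training produced a model, the expected location can be computed and tested for existence. This is a sketch under two assumptions: that the archive sits directly inside the models/ directory given above, and that <lang_code> has the same form as the language directory names (e.g. en_US). The helper trained_model_path is hypothetical.

```python
from pathlib import Path


def trained_model_path(lang_code: str, home: Path = Path.home()) -> Path:
    """Build the assumed location of the trained model archive
    for the given language code."""
    return (
        home / ".pal" / "chatbot_cfg" / "rasa-chatbots"
        / lang_code / "models" / f"{lang_code}.tar.gz"
    )


model = trained_model_path("en_US")
print(model, "exists" if model.is_file() else "not found (train the chatbot first)")
```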