Intents are the general mechanism used on ARI to aggregate user commands and present them to the robot’s application controller.

They are published on the /intents topic.

What are intents?#

An intent is an abstract description of an operation to be performed by the robot. Intents are represented as ROS messages of type hri_actions_msgs/Intent, and published on the /intents topic.

While inspired by Android intents [android-intents], ROS intents are primarily designed to capture user-initiated intents: for instance, a button click on a touchscreen, the result of a chatbot-based verbal interaction, or a command issued from a remote user interface.

Intents are emitted (published) by nodes that track the user’s activities (eg, the touchscreen, the dialogue manager), and are consumed by the application controller.

You can learn more about how to program applications for the robot here: ARI app development.

Structure of an intent#

Intents comprise four mandatory fields:

  • the intent, which should be one of the available predefined intents,

  • the data, which must be a JSON object containing the data required to fully instantiate the intent,

  • the source of the intent (for instance, a user),

  • and the modality by which the intent was conveyed to the robot.

Optionally, you can also specify a priority and a level of confidence.

Intent name and data#

Intents are primarily composed of an intent name and data to parametrise the intent.

The intent field is a string describing the action intended by this intent.

Where suitable, the intent name SHOULD be one of the constants defined in the table below. However, we recognise that the list of possible intents is large. Therefore, custom strings are also permissible.


Possible terminology confusion

Even though an intent describes a desired action, the intent field is unrelated to ROS actions. Here, the intent is the intended action to be performed (going somewhere, picking something…), while ROS actions are a low-level asynchronous remote procedure call (RPC) technique.

The intent’s data is a JSON object containing the data required to fully specify the intent. Each key of the object should be one of the following thematic roles, or the generic other_data:

  • agent: the agent expected to perform the intent (if omitted, the robot itself is assumed)

  • object (also named theme or patient in the linguistics literature): entity undergoing the effect of the intent

  • goal: entity towards which the intent is directed or moves

  • recipient: entity that receives the object


For instance:

  • “I want you to go to the kitchen”:

    • intent: MOVE_TO

    • data: {"goal":"kitchen_1"}

  • “Can you take the groceries to Luke in the kitchen?”:

    • intent: BRING_OBJECT

    • data: {"object": "groceries", "goal":"kitchen_1", "recipient": "person_luke"}
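The two data objects above can be produced programmatically. The following is a minimal sketch in plain Python: make_intent_data is a hypothetical helper (it is not part of any ROS package) that only accepts the thematic roles listed in this section and returns the JSON string to place in the data field.

```python
import json

# Thematic roles named in this section, plus the generic "other_data" key.
THEMATIC_ROLES = {"agent", "object", "goal", "recipient", "other_data"}


def make_intent_data(**roles):
    """Return the JSON string for an intent's data field.

    Hypothetical helper, for illustration only.
    """
    unknown = set(roles) - THEMATIC_ROLES
    if unknown:
        raise ValueError("unknown thematic roles: %s" % sorted(unknown))
    return json.dumps(roles, sort_keys=True)


# "I want you to go to the kitchen"
move_to_data = make_intent_data(goal="kitchen_1")

# "Can you take the groceries to Luke in the kitchen?"
bring_data = make_intent_data(
    object="groceries", goal="kitchen_1", recipient="person_luke"
)

print(move_to_data)
print(bring_data)
```

Rejecting unknown keys early is a design choice of this sketch; anything that does not fit a thematic role can still be passed under other_data.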


Additional complete examples of intents are provided below: Examples of intents.

Each intent defines a specific set of required and optional thematic roles, listed in the following table (note that the agent role can be optionally added to all intents, and is omitted from the table for clarity):



Required thematic roles

Optional thematic roles


an agent wants to engage with another one

  • recipient


navigates to a specific location

  • goal


guides someone somewhere

  • goal

  • recipient


pick-up a specific object

  • object


bring a specific object to a specific place

  • object

  • recipient


put an object on a support (eg a table)

  • recipient

  • object (only required if more than one object could be placed)


greet an agent

  • recipient


says some text, optionally annotated with gestures or expressions

  • object (the text to say)

  • recipient


present (via a screen, pre-recorded text…) predefined content

  • object (the content identifier)

  • recipient


performs a motion (eg, a dance or a specific gesture like pointing, waving)

  • object (the system-specific name of the motion/gesture)

  • recipient


start a scripted behaviour/activity

  • object (the name of the activity)

  • any additional parameter required to start the activity


request cancellation of an activity

  • object (the name of the activity)

  • object (the name of the activity. If unset, current main activity)
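A consumer of intents may want to check that the roles listed for an intent are present in its data before acting on it. The sketch below assumes a mapping from intent name to listed roles for the three intent names that appear in this section’s examples (MOVE_TO, GUIDE, BRING_OBJECT); the mapping and the missing_roles helper are illustrative assumptions, not part of hri_actions_msgs.

```python
import json

# Assumed mapping, inferred from the table rows and the examples in this
# section. Only intents whose names appear in the examples are included.
LISTED_ROLES = {
    "MOVE_TO": {"goal"},
    "GUIDE": {"goal", "recipient"},
    "BRING_OBJECT": {"object", "recipient"},
}


def missing_roles(intent, data_json):
    """Return the sorted list of listed roles absent from the data object.

    Custom intents (not in LISTED_ROLES) are never reported as incomplete.
    """
    data = json.loads(data_json)
    return sorted(LISTED_ROLES.get(intent, set()) - set(data))


print(missing_roles("MOVE_TO", '{"goal": "kitchen_1"}'))
print(missing_roles("GUIDE", '{"goal": "place_X"}'))
```

The optional agent role is deliberately not checked, since it can be added to any intent.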


If you believe your intent should be standardised and added to the list of pre-defined intents, fill in the corresponding entry in the “thematic roles” table above and submit a pull request on the hri_actions_msgs repository.

Source of the intent#

The source of the intent is a string describing who created this intent. This is not the node which published the intent, but the actual agent who expressed the intent/command/desire. source can be either one of the constants below, or the specific ID of the person/agent expressing the intent. In a REP-155 compliant system, this ID must be the person ID of the agent.

# for intents originating from the robot itself
string ROBOT_ITSELF = "__myself__"
# for intents originating from an external robot control system (for instance, a remote control tablet)
string REMOTE_SUPERVISOR = "__remote_supervisor__"
# for intents coming from an agent interacting with the robot, but not uniquely
# identified
string UNKNOWN_AGENT = "__unknown_agent__"
# for unknown sources
string UNKNOWN = "__unknown__"
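A node consuming intents usually needs to tell the special source constants apart from actual person IDs. The following plain-Python sketch copies the constant values from the definition above; source_kind and the labels it returns are hypothetical names chosen for illustration.

```python
# Values copied from the message constants above.
ROBOT_ITSELF = "__myself__"
REMOTE_SUPERVISOR = "__remote_supervisor__"
UNKNOWN_AGENT = "__unknown_agent__"
UNKNOWN = "__unknown__"

# Hypothetical human-readable labels for each special source.
_SPECIAL_SOURCES = {
    ROBOT_ITSELF: "robot",
    REMOTE_SUPERVISOR: "remote supervisor",
    UNKNOWN_AGENT: "unidentified agent",
    UNKNOWN: "unknown",
}


def source_kind(source):
    """Classify a source string; anything else is treated as a person ID."""
    return _SPECIAL_SOURCES.get(source, "identified person")


print(source_kind("__remote_supervisor__"))
print(source_kind("person_55dc"))
```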

Modality of the intent#

The intent’s modality conveys how the intent was expressed: verbally, via the touchscreen, via a gesture, etc.

The special modality MODALITY_INTERNAL must be used for intents coming from the robot’s internal processes, when applicable.

The modality field MUST be one of the MODALITY_ constants below.

string MODALITY_SPEECH = "speech"
string MODALITY_MOTION = "motion"
string MODALITY_TOUCHSCREEN = "touchscreen"
string MODALITY_INTERNAL = "internal"
string MODALITY_OTHER = "other"

Intent priority#

The priority field MIGHT be used as a hint by the robot’s application controller to prioritise the intent appropriately. The application controller is, however, not required to respect this priority level.

0 is the lowest priority, 128 is the default priority, 255 is the highest priority.
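One way an application controller could use this hint is to queue pending intents and serve the highest priority first. The sketch below uses Python’s heapq module; the push and pop helpers and the string labels are hypothetical, and a real controller is free to ignore priorities altogether.

```python
import heapq
import itertools

DEFAULT_PRIORITY = 128  # default priority stated above

# Monotonic counter so that intents with equal priority keep arrival order.
_arrival = itertools.count()


def push(queue, label, priority=DEFAULT_PRIORITY):
    """Queue an intent label with a priority in the range 0..255."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must be between 0 and 255")
    # heapq is a min-heap, so the priority is negated to pop the highest first.
    heapq.heappush(queue, (-priority, next(_arrival), label))


def pop(queue):
    """Remove and return the label with the highest priority."""
    return heapq.heappop(queue)[2]


queue = []
push(queue, "MOVE_TO room_X", 200)
push(queue, "GUIDE place_X")          # default priority
push(queue, "MOVE_TO poi_docking", 255)

order = [pop(queue) for _ in range(3)]
print(order)
```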

Intent confidence#

The intent’s confidence is a value between 0.0 (no confidence) and 1.0 (full confidence) that the intent was correctly perceived and interpreted.

For instance, a ‘waving’ gesture could be interpreted as an implicit request from a user for the robot to greet back or engage. As this interpretation is not certain, the confidence of the intent may be below 1.0.
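The fields described so far can be gathered into a single record with basic range checks. This is a sketch under stated assumptions: IntentRecord is a hypothetical plain-Python class that mirrors the fields of the message rather than the message type itself, the default confidence of 1.0 is an assumption (only the default priority of 128 is stated above), and the speech modality used in the example is assumed for a chatbot interaction.

```python
import json
from dataclasses import dataclass

# Values of the MODALITY_ constants listed above.
MODALITIES = {"speech", "motion", "touchscreen", "internal", "other"}


@dataclass
class IntentRecord:
    """Hypothetical plain-Python mirror of the intent fields."""

    intent: str
    data: str          # JSON object, serialised as a string
    source: str
    modality: str
    priority: int = 128        # default stated in this document
    confidence: float = 1.0    # assumed default, not stated in this document

    def __post_init__(self):
        if not isinstance(json.loads(self.data), dict):
            raise ValueError("data must be a JSON object")
        if self.modality not in MODALITIES:
            raise ValueError("unknown modality: %r" % self.modality)
        if not 0 <= self.priority <= 255:
            raise ValueError("priority must be between 0 and 255")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


guide = IntentRecord(
    intent="GUIDE",
    data='{"goal": "place_X", "recipient": "person_55dc"}',
    source="person_55dc",
    modality="speech",  # assumed modality for a chatbot interaction
    confidence=0.8,
)
print(guide)
```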

Examples of intents#

User approaches the robot#

  • Possible intent trigger: user less than 2 meters away, looking at the robot

  • Possible published intent:

data: {"recipient": "anonymous_person_a2f5"}
source: "anonymous_person_a2f5"
priority: 128
confidence: 0.6

User presses a button on the touchscreen to navigate#

  • Possible intent trigger: button press on “Go to room X”

  • Possible published intent:

intent: MOVE_TO
data: {"goal": "room_X"}
priority: 200
confidence: 1.0

User presses a button on the touchscreen to play game#

  • Possible intent trigger: button press: “Play memory game”

  • Possible published intent:

data: {"object": "games_memory_game"}
priority: 100
confidence: 1.0

User presses a button on the touchscreen to display page#

  • Possible intent trigger: button press: “Go to page_X”

The result depends on what page_X is about:

  • if page_X is a purely informational page that does not require any additional robot capability (eg, does not require the robot to speak or to move), no intent needs to be generated. As this action is ‘read-only’ with no impact on the robot, it can be handled directly by the touchscreen.

  • if page_X requires additional robot resources, an intent needs to be published:

  • Possible published intent:

data: {"object": "page_X"}
priority: 100
confidence: 1.0

User asks the robot to display a specific page#

  • Possible intent trigger: the chatbot recognises the command ‘display page_X’

  • Possible published intent:

data: {"object": "page_X",
       "recipient": "anonymous_person_e4da"}
source: "anonymous_person_e4da"
priority: 128
confidence: 0.4

User asks the robot to go somewhere#

  • Possible intent trigger: the chatbot recognises the command ‘take me to place_X’

  • Possible published intent:

intent: GUIDE
data: {"goal": "place_X",
       "recipient": "person_55dc"}
source: "person_55dc"
priority: 128
confidence: 0.8

User wants to cancel a task#

  • Possible intent trigger: the user presses a ‘cancel’ button on the touchscreen

  • Possible published intent:

data: {"object": "<current activity>"}
priority: 255
confidence: 1.0

Supervisor sends command for the robot to dock via tablet#

  • Possible intent trigger: a button press on a remote control tablet

  • Possible published intent:

intent: MOVE_TO
data: {"goal": "poi_docking"}
priority: 255
confidence: 1.0

How are intents used by the robot?#

Intents published on the /intents topic represent each of the user’s desires or commands understood by the robot.

These intents need to be acted upon by a dedicated node (or a group of nodes, depending on the architecture design) called the robot’s application controller. The general role of the application controller is to schedule and run the robot’s different capabilities based on received intents, and to allocate the robot’s resources (eg, to ensure that no two actions use the arms or navigation at the same time).

You can use any supervision technique to implement your own application controller: simple Python scripts, finite state machines, behaviour trees, or symbolic task planners. The ARI SDK does not enforce any particular paradigm.
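As an illustration of the simplest of these options, the sketch below dispatches intents to handler functions through a lookup table. It is plain Python with no ROS dependency: the handler functions, the on_intent name and the 0.5 confidence threshold are all hypothetical. In a ROS node, a function like on_intent would typically be called from the callback of a subscriber on the /intents topic.

```python
import json


def handle_move_to(data):
    # Placeholder for a call to the navigation capability.
    return "navigating to %s" % data["goal"]


def handle_guide(data):
    # Placeholder for a guiding behaviour.
    return "guiding %s to %s" % (data["recipient"], data["goal"])


# Lookup table from intent name to handler (hypothetical handlers).
HANDLERS = {
    "MOVE_TO": handle_move_to,
    "GUIDE": handle_guide,
}


def on_intent(intent, data_json, confidence=1.0, min_confidence=0.5):
    """Dispatch one intent; the confidence threshold is an assumed policy."""
    if confidence < min_confidence:
        return "ignored (low confidence)"
    handler = HANDLERS.get(intent)
    if handler is None:
        return "no handler for %s" % intent
    return handler(json.loads(data_json))


print(on_intent("MOVE_TO", '{"goal": "room_X"}'))
print(on_intent("GUIDE", '{"goal": "place_X", "recipient": "person_55dc"}', 0.8))
```

A table-based dispatcher keeps each capability in its own function; resource allocation and scheduling, which a full controller also has to handle, are left out of this sketch.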

You can learn more about how to program applications for the robot here: ARI app development.

To get started, the basic-interaction tutorial explains how to create your own simple Python controller.

ARI comes with a default application controller (the one underpinning the landing demo page, see First start-up) that reacts to different types of intents. You can have a look at its source code and use it as a reference.

When not to use intents?#

There are two interaction situations where intents should not be used: interactions with no side-effects on the robot (instead, use the chatbot actions directly), and short confirmation

Interactions with no side-effects#

User interactions do not always have to generate intents. In particular, during a chatbot interaction, the chatbot engine might need to perform simple actions to answer the user’s questions which do not impact the robot state. In this case, it is unnecessary to generate an intent, as no complex action scheduling is necessary.

For instance, if the user asks the robot about the weather, the chatbot can generate an answer by querying an online weather forecast API. This does not require any specific robot resources. Similarly, checking the battery level of the robot has no impact on the robot state or resources.

In these cases, instead of publishing an intent, the chatbot engine can directly perform the API requests or ROS service calls to answer the user’s question.
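The distinction can be sketched as follows. This is plain Python with hypothetical names: handle_user_request stands for a chatbot action handler, the request strings are invented for the example, and read_battery_level is a placeholder callable standing in for whatever ROS service call or API request provides the value.

```python
import json


def handle_user_request(request, read_battery_level):
    """Return ("answer", text) when no robot resource is needed,
    or ("intent", fields) when an intent should be published instead."""
    if request == "battery_level":
        # Read-only query: answer directly, no intent is generated.
        return ("answer", "My battery is at %d%%." % read_battery_level())
    if request.startswith("go_to:"):
        # Navigation needs robot resources: describe the intent to publish.
        goal = request.split(":", 1)[1]
        return ("intent", {"intent": "MOVE_TO",
                           "data": json.dumps({"goal": goal})})
    return ("answer", "Sorry, I did not understand.")


print(handle_user_request("battery_level", lambda: 80))
print(handle_user_request("go_to:kitchen_1", lambda: 80))
```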


Dialogue management explains how to create and customise your own chatbots. You can also specifically refer to triggering-actions-from-rasa.

See also#

  • For a general introduction to application development for the robot, check ARI app development.

  • basic-interaction explains how to subscribe to intents and create your own simple Python controller.