Getting started with ARI - Combine voice, gesture and eyes

🏁 Goal of this tutorial

By the end of this tutorial, you will be able to create a Python script that combines text-to-speech, gestures and eye expressions of the robot in a synchronised way to enhance the interactions with it.

The TTS module lets us embed tags in the text to be spoken by the robot. These tags can be used to synchronise additional actions of the robot with the speech. More specifically, we can include:

  • gestures

  • eye expressions

To do so, we use the following syntax:

<mark name='ACTION_TAG'/>

where ACTION_TAG can be one of the following types:

  • gestures: doTrick trickName=MOTION_ID

  • expressive eyes: eyes(EYES_EXPRESSION)
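
As a quick illustration of this syntax, the tags can be assembled with two small helpers. This is only a sketch: the function names eyes_mark and gesture_mark are hypothetical and not part of any robot API; they simply format the strings described above. The values "happy" and "bow" are the ones used in the example of this tutorial.

```python
def eyes_mark(expression):
    """Return a TTS mark that requests the given eye expression."""
    return "<mark name='eyes({})'/>".format(expression)


def gesture_mark(motion_id):
    """Return a TTS mark that requests the given motion."""
    return "<mark name='doTrick trickName={}'/>".format(motion_id)


# Each mark is placed at the point in the text where its action is wanted.
text = eyes_mark("happy") + " Hello, " + gesture_mark("bow") + "I'm ARI, nice to meet you!"
print(text)
```

Keeping the tag formatting in one place makes it harder to mistype a quote or a closing slash when writing longer utterances.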

Let’s have a look at an example that combines these features.

The code

getting_started_expressive_tts.py
#! /usr/bin/python
# -*- coding: utf-8 -*-

import rospy
from actionlib import SimpleActionClient
from pal_interaction_msgs.msg import TtsAction, TtsGoal

rospy.init_node("expressive_tts_demo")
tts_client = SimpleActionClient('/tts', TtsAction)
tts_client.wait_for_server()

goal = TtsGoal()
goal.rawtext.text = "<mark name='eyes(happy)'/> Hello, <mark name='doTrick trickName=bow'/>" \
                    "I'm ARI, nice to meet you!"
goal.rawtext.lang_id = "en_GB"

rospy.loginfo("sending goal")
tts_client.send_goal_and_wait(goal)
rospy.loginfo("done")

The code explained

First we need to import the necessary packages and messages. In this case, we import the SimpleActionClient to define the TTS client and its corresponding messages.

from actionlib import SimpleActionClient
from pal_interaction_msgs.msg import TtsAction, TtsGoal

Next, we initialize the ROS node, create a SimpleActionClient for the /tts action and wait until the action server is available.

rospy.init_node("expressive_tts_demo")
tts_client = SimpleActionClient('/tts', TtsAction)
tts_client.wait_for_server()
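
Called without arguments, wait_for_server() blocks until the server appears. A possible variant, sketched below, passes a timeout and fails with a clear error instead. The helper name connect_or_fail is hypothetical; it assumes only that the client behaves like actionlib's SimpleActionClient, whose wait_for_server(timeout) returns True once the server is connected and False if the timeout expires.

```python
def connect_or_fail(client, timeout, server_name="/tts"):
    """Wait for an action server and raise if it does not appear in time.

    `client` is assumed to offer wait_for_server(timeout) -> bool,
    as actionlib's SimpleActionClient does.
    """
    if not client.wait_for_server(timeout):
        raise RuntimeError("action server %s not available" % server_name)
    return client


# On the robot this could be used as follows (not executed here):
#   tts_client = connect_or_fail(SimpleActionClient('/tts', TtsAction),
#                                rospy.Duration(5.0))
```

The five-second timeout in the comment is an arbitrary illustrative value.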

Once the client is set, we can send marked text to the TTS server. In the provided example, we use two different tags: one sets the desired eye expression and the other triggers a bowing gesture, both synchronised with the speech:

goal = TtsGoal()
goal.rawtext.text = "<mark name='eyes(happy)'/> Hello, <mark name='doTrick trickName=bow'/>" \
                    "I'm ARI, nice to meet you!"
goal.rawtext.lang_id = "en_GB"

tts_client.send_goal_and_wait(goal)
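
Before sending a longer utterance, it can help to check which action tags it carries and what will actually be spoken. The sketch below does this with a regular expression. It assumes that tags are written exactly as in this tutorial, with single quotes around the name; the helper names list_action_tags and spoken_text are hypothetical.

```python
import re

# Matches tags written as <mark name='...'/>, the form used in this tutorial.
MARK_PATTERN = re.compile(r"<mark name='([^']*)'/>")


def list_action_tags(marked_text):
    """Return the ACTION_TAG values in order of appearance."""
    return MARK_PATTERN.findall(marked_text)


def spoken_text(marked_text):
    """Return the text with all marks removed and whitespace collapsed."""
    return " ".join(MARK_PATTERN.sub(" ", marked_text).split())


marked = ("<mark name='eyes(happy)'/> Hello, <mark name='doTrick trickName=bow'/>"
          "I'm ARI, nice to meet you!")
print(list_action_tags(marked))
print(spoken_text(marked))
```

A check like this only catches formatting mistakes in the tags; it does not tell us whether the robot knows the requested expression or motion.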

Next steps