Tutorial: detect people around the robot (C++)#

🏁 Goal of this tutorial

Identify which persons are oriented toward the robot using libhri and its API.

This is useful when you want your robot to be socially proactive and try to attract those people that might be looking at it from a distance, unsure whether getting closer to the robot or not.


  • This tutorial requires you to be familiar with the concepts of reference frames, 3D rotation transformations and 3D translation transformations.

  • This tutorial assumes that you already have an up and running body detection pipeline, like the one running on ARI.

  • This tutorial also assumes that you have installed libhri. If you have not, apt install ros-noetic-hri or refer to the official repo to install from source.


ARI already comes with hri_fullbody and libhri installed.

If you want to configure a similar pipeline on your machine, you can install and use hri_fullbody.

The code#

One possible implementation of a node detecting the bodies oriented toward the robot is the following:

 1#include <hri/hri.h>
 2#include <hri/body.h>
 3#include <ros/ros.h>
 4#include <string>
 6class BodyOrientationListener{
 8   // HRIListener object to handle information regarding
 9   // the humans perceived by the robot
10   hri::HRIListener hri_listener_;
12   // attention cone semi-amplitude
13   double threshold_;
15   // vector storing the ids of the bodies detected as oriented
16   // toward the robot
17   std::vector<std::string> bodies_facing_robot_;
19 public:
21   BodyOrientationListener(const std::string& base_frame, const double& threshold);
22   ~BodyOrientationListener();
24   /** this class, through the run method,
25   * detects which bodies are oriented toward the robot,
26   * expressed as the base frame. Every second, it
27   * prints the detected bodies oriented toward the robot.
28   */
29   void run();
  1#include <iostream>
  2#include <cmath>
  4#include <ros/ros.h>
  6#include "body_orientation_listener.h"
  7#include <ros/ros.h>
  8#include <geometry_msgs/TransformStamped.h>
 10#include <tf2_geometry_msgs/tf2_geometry_msgs.h>
 13  const std::string& base_frame, const double& threshold){
 14  // threshold, or attention cone semi-amplitude,
 15  // expressed in radians
 16  threshold_ = threshold/180*M_PI;
 18  // setting the reference frame for the
 19  // HRIListener object hri_listener_
 20  hri_listener_.setReferenceFrame(base_frame);
 26void BodyOrientationListener::run(){
 28  ros::Rate rate(1); // sec
 30  while(ros::ok()){
 32    ros::spinOnce();
 34    // a std::map representing the bodies
 35    // detected through the body detection pipeline.
 36    // For each pair in the map, the first objects
 37    // is the body id, while the second
 38    // represents information regarding the body
 39    // in the form of a weak pointer to a hri::Body
 40    // object
 41    auto bodies = hri_listener_.getBodies();
 43    bodies_facing_robot_.clear();
 45    for (auto& body: bodies){
 46      if (auto body_ptr = body.second.lock()){
 47        // Checking whether the transform from
 48        // the current body is actually available
 49        if (auto bodyTransform = body_ptr->transform()){
 51          // tf2::Transform objects declaration
 52          // for both the base frame to body frame
 53          // transformation and its inverse.
 54          // r2b = robot to body
 55          // b2r = body to robot
 56          tf2::Transform r2b_transform, b2r_transform;
 58          // transform, obtained from the hri::Body object
 59          // representing the body managed during this iteration,
 60          // is actually a geometry_msgs::TransformStamped object.
 61          // Here, the geometry_msgs::Transform object is extracted
 62          // and then transformed into a tf2::Transform object, that is
 63          // r2b_transform
 64          geometry_msgs::Transform r2bGM_transform = bodyTransform->transform;
 65          tf2::fromMsg(r2bGM_transform, r2b_transform);
 67          // To understand if a body is oriented toward the robot
 68          // the body frame to base frame
 69          // (that is, the robot frame) transformation is required.
 70          // Since the hri API provides the base frame to
 71          // body frame transformation, it requires to be inverted
 72          b2r_transform = r2b_transform.inverse();
 74          // to understand whether a body is oriented toward
 75          // the robot or not, the only required information is
 76          // the coordinates of the origin of the robot frame
 77          // expressed in body frame. This information
 78          // is the translation component of the body frame
 79          // to base frame transformation
 80          tf2::Vector3 translation = b2r_transform.getOrigin();
 82          // The length of the projection of the translation
 83          // vector on the XY plane of the body frame
 84          double translationXY_length =
 85            std::sqrt(std::pow(translation.x(), 2)+std::pow(translation.y(), 2));
 87          // The orientation of the body with respect to
 88          // the robot can be expressed as the angle
 89          // between the x axis of the body frame
 90          // and the projection on the XY plane of the
 91          // body frame to base frame translation vector
 92          // in body frame coordinates
 93          double body_orientation = std::acos(translation.x()/translationXY_length);
 95          // This if statement checks that:
 96          //  - the robot is actually in front of the person,
 97          //  not behind
 98          //  - the body angle previously computed is smaller
 99          //  than the initially set threshold
100          // If this is the case, then the body id gets added
101          // to the list of those oriented toward the robot
102          if ((translation.x() > 0) && (body_orientation < threshold_))
103            bodies_facing_robot_.push_back(body.first);
104        }
105      }
106    }
108    for(auto& body: bodies_facing_robot_)
109      ROS_INFO_STREAM(body << " oriented toward the robot");
111    rate.sleep();
112  }
115int main(int argc, char** argv){
117  ros::init(argc, argv, "body_orientation_listener");
119  // Defining the values for the Body Orientation
120  // Listener initialisation. In this case, the
121  // reference frame for the robot, or base frame,
122  // is "camera_link", and the attention
123  // cone semi-amplitude for each body
124  // is 30 degrees
125  std::string base_frame = "camera_link";
126  double threshold = 30;
128  BodyOrientationListener bol(base_frame, threshold);
130  bol.run();
132  return 0;

The code explained#

 6class BodyOrientationListener{
 7  hri::HRIListener hri_listener_;
 9  double threshold_;
11  std::vector<std::string> bodies_facing_robot_;
15  BodyOrientationListener(const std::string& base_frame, const double& threshold);
16  ~BodyOrientationListener();
18  void run();

This is the declaration of the BodyOrientationListener class. Here, hri_listener_ is a HRIListener object. HRIListener abstracts some of the ROS4HRI aspects: for instance, it manages the callbacks reading the lists of detected bodies, faces, voices and persons. This way, you don’t have to define the same callbacks over and over again in different ROS nodes. Later, you will discover more about the HRIListener and libhri API capabilities.

There are two more private objects declared:

  • threshold_ defines the semi-amplitude of the interaction cone, i.e. the cone with origin in the body frame within which the base frame can be to consider the body oriented toward the robot.

  • bodies_facing_robot_ will contain the ids of the bodies oriented toward the robot.

There are three public functions declared:

  • BodyOrientationListener(const std::string& base_frame, const double& threshold) is the class constructor.

  • ~BodyOrientationListener() is the class destructor.

  • void run() will implement the whole evaluation process for the bodies orientation.

The definition of the class members happens in body_orientation_listener.cpp.

13  const std::string& base_frame, const double& threshold){
14  // threshold, or attention cone semi-amplitude,
15  // expressed in radians
16  threshold_ = threshold/180*M_PI;
18  // setting the reference frame for the
19  // HRIListener object hri_listener_void BodyOrientationListener::run()
21  auto bodies = hri_listener_.getBodies();
22  hri_listener_.setReferenceFrame(base_frame);

In the constructor, the threshold_ value is set to threshold, one of the constructor arguments, and converted in radians, as the trigonometric functions from the cmath library use radians. Additionally, hri_listener_ reference frame is set through the setReferenceFrame method using base_frame, one of the constructor arguments.

The whole evaluation process regarding bodies orientation takes place in the void BodyOrientationListener::run() method.

void BodyOrientationListener::run()#
29ros::Rate rate(1);

Here, we set the rate for the evaluation process execution. In this case, it will be run once per second.

void BodyOrientationListener::run()#

Once set the process rate, we can start a loop only stopping once the node receives the shutdown signal. ros::ok() returns the OK flag value, which is true until the node receives the shutdown signal (and properly manages it).

void BodyOrientationListener::run()#

As we will see later, the run method evaluates the body orientations and then sleeps for one second. However, during that second ROS messages keep being published by the other nodes in the network. To update the class members about the latest ROS messages that were published while the node was sleeping, we run the ros::spinOnce() method.

void BodyOrientationListener::run()#
47auto bodies = hri_listener_.getBodies();

Once spinned, hri_listener_ has updated information regarding the bodies. We can retrieve this information, that is, a std::map object containing pairs where the first element is the body id (std::string) and the second element is a pointer to an object describing some body-related aspects (hri::Body). In this case, you can see how the libhri API simplifies accessing information regarding the bodies detected by the robot (i.e., its body detection pipeline). The abstraction level introduced by the HRIListener class avoids direct interaction between this code and the ROS4HRI topics, and it is possible to directly obtain objects containing information and utility functions to easily develop HRI applications.

void BodyOrientationListener::run()#

Before starting the geometric operations aimed to establish the bodies oriented toward the robot, we remove all the body ids detected during the previous loop from the bodies_facing_robot_ vector.

void BodyOrientationListener::run()#
57for (auto& body: bodies){
58  if (auto body_ptr = body.second.lock()){

We can now start iterating over the bodies that the robot is currently aware of, that is those contained in the bodies object. Since the pointer to the Body object is a std::weak_ptr, then we have to assume (temporary) ownership before accessing it. The lock() function allows us to assume ownership, returning a std::shared_ptr object.

void BodyOrientationListener::run()#
61if (auto bodyTransform = body_ptr->transform()){

One additional check is the one regarding the existence of the transform. Dealing with tf can be complicated some times, and even if we are expecting a transformation to exist, it is not always the case. To manage this situation, the Body::transform function returns a boost::optional object, that is the returned transform might be empty as well. The if statement ensures that the transform from the base frame to the body frame actually exists.

This is another example of how the libhri API eases the coding experience. You could have expected needing tf objects to handle transformations. Instead, libhri manages the tf aspects for body frames and directly returns the transformation between the base frame and the body frame.

void BodyOrientationListener::run()#
67tf2::Transform r2b_transform, b2r_transform;

Now it’s time to declare the objects we are going to use to represent the transforms involved in the evaluation process. Here, r2b stands for robot to body, while b2r stands for body to robot.

void BodyOrientationListener::run()#
75 geometry_msgs::Transform r2bGM_transform = bodyTransform->transform;
76 tf2::fromMsg(r2bGM_transform, r2b_transform);

bodyTransform is a stamped message, i.e. it contains information regarding the geometric transformation time, frame and child frame. This information is contained in bodyTransform.header. We are only interested in bodyTransform.transform, which contains the geometric information regarding the transformation and can easily be transformed into a tf::Transform object and, from there, inverted.

void BodyOrientationListener::run()#
83b2r_transform = r2b_transform.inverse();

Now, the idea is to start working from a body perspective, not a robot one (i.e., inverting the previously obtained geometric transformation). In fact, to understand whether a person is facing a robot or not, you are not interested in the position of the person with respect to the robot, but more in the position of the robot with respect to the body frame. If the body is oriented toward the robot, with a body frame following the REP 155 frame definition, then the robot frame (i.e., the base frame) origin expressed in body frame coordinates will have a positive \(x\) component and a relatively small \(y\) component.


Fig 1. Human oriented toward the robot.#


Fig 2. Human not oriented toward the robot.#

In Fig. 1, you can see how the \(d_b2r\) vector, expressed in body frame coordinates, has a relatively small \(y\) component when compared to the \(x\) one. Differently, in Fig 2. you can see that \(d_b2r\) component has a greater \(y\) component in body frame coordinates, suggesting that the human is not oriented toward the robot.

void BodyOrientationListener::run()#
91tf2::Vector3 translation = b2r_transform.getOrigin();

As we showed before, once inverted the original transform, the only information required to understand whether a body is oriented toward a robot or not is the translation vector. Here, we extract it.

void BodyOrientationListener::run()#
95double translationXY_length =
96  std::sqrt(std::pow(translation.x(), 2)+std::pow(translation.y(), 2));

In order to establish how much the \(x\) component contributes to the translation vector, we can compare these two’s length. However, we are not interested in the \(z\) component of the body frame to base frame transformation, as this is not providing any information regarding the body orientation (in a scenario where humans are assumed to be always standing). For this reason, we compute the length of the projection of the translation vector on the \(xy\) plane.

void BodyOrientationListener::run()#
104double body_orientation = std::acos(translation.x()/translationXY_length);

Here, we compute the angle between the \(d_b2r\) \(xy\) plane projection and the :x: axis. Now, this needs to be compared with threshold_ to establish whether the body is oriented toward the robot or not.

void BodyOrientationListener::run()#
113if ((translation.x() > 0) && (body_orientation < threshold_))
114  bodies_facing_robot_.push_back(body.first);

This if statement filters the bodies based on their orientation, where the filtering parameter is threshold_. One additional check is required on the \(x\) component, to verify that the robot is actually in front of the body. Once passed this two checks, the body id is inserted in the bodies_facing_robot_ vector.

void BodyOrientationListener::run()#
120for(auto& body: bodies_facing_robot_)
121  ROS_INFO_STREAM(body << " oriented toward the robot");

These lines iterate over the bodies_facing_robot_ elements, printing them. This is a basic instruction, and we invite you to extend or completely substitute these lines to integrate code that actually makes use of the extracted information.

void BodyOrientationListener::run()#

Finally, the process is stopped for 1 second, according to the previously defined rate.

Then, you can find the node and BodyOrientationListener object initialization in the main function.

int main(int argc, char** argv)#
133ros::init(argc, argv, "body_orientation_listener");

This is the node initialization command, where we set the name and pass the command line arguments as received by the main function itself. In this case, there are no command line arguments expected.

int main(int argc, char** argv)#
141std::string base_frame = "camera_link";
142double threshold = 30;

The two arguments required by BodyOrientationListener are initialized. In this case, we set the threshold to 30 degrees. You can play with this value and change it according to your application. The base_frame is set to a generic camera_link. The frame needs to be part of your robot’s tf frames tree.

Finally, a BodyOrientationListener object is initialized and started.

int main(int argc, char** argv)#
144// BodyOrientationListener object initialisation
145BodyOrientationListener bol(base_frame, threshold);
147// Starting the detection process

Next steps#

  • The node you have developed through this tutorial does not really use the information about the bodies oriented toward the robot. You might think about a cool behaviour exploiting this information and implement it!

  • You might argue that face and gaze orientations could tell us more about a person’s engagement with the robot… and you would be right! Check this tutorial for a possible implementation of an engagement detection node based on face and gaze orientation.

  • If you’re interested in the python version of this tutorial, check the pyhri tutorial.