Tutorial: detect people around the robot (C++)#
🏁 Goal of this tutorial
Identify which persons are oriented toward the robot using libhri and its API. This is useful when you want your robot to be socially proactive and try to attract people that might be looking at it from a distance, unsure whether or not to come closer.
Pre-requisites#
This tutorial requires you to be familiar with the concepts of reference frames, 3D rotation transformations and 3D translation transformations.
This tutorial assumes that you already have an up and running body detection pipeline, like the one running on PAL’s robots.
This tutorial also assumes that you have installed libhri. If you have not, install it with apt install ros-noetic-hri, or refer to the official repository to install it from source.
Note
PAL’s robots already come with hri_fullbody and libhri installed.
If you want to configure a similar pipeline on your machine, you can install and use hri_fullbody.
The code#
One possible implementation of a node detecting the bodies oriented toward the robot is the following, split between a header file (body_orientation_listener.h) and its implementation (body_orientation_listener.cpp):
// body_orientation_listener.h
#pragma once

#include <hri/hri.h>
#include <hri/body.h>
#include <ros/ros.h>
#include <string>
#include <vector>

class BodyOrientationListener{

  // HRIListener object to handle information regarding
  // the humans perceived by the robot
  hri::HRIListener hri_listener_;

  // attention cone semi-amplitude
  double threshold_;

  // vector storing the ids of the bodies detected as oriented
  // toward the robot
  std::vector<std::string> bodies_facing_robot_;

  public:

    BodyOrientationListener(const std::string& base_frame, const double& threshold);
    ~BodyOrientationListener();

    /** this class, through the run method,
     * detects which bodies are oriented toward the robot
     * (i.e., toward the base frame). Every second, it
     * prints the ids of the bodies oriented toward the robot.
     */
    void run();
};
// body_orientation_listener.cpp
#include <cmath>

#include <ros/ros.h>
#include <geometry_msgs/TransformStamped.h>
#include <tf2_geometry_msgs/tf2_geometry_msgs.h>

#include "body_orientation_listener.h"

BodyOrientationListener::BodyOrientationListener(
    const std::string& base_frame, const double& threshold){
  // threshold, or attention cone semi-amplitude,
  // expressed in radians
  threshold_ = threshold/180*M_PI;

  // setting the reference frame for the
  // HRIListener object hri_listener_
  hri_listener_.setReferenceFrame(base_frame);
}

BodyOrientationListener::~BodyOrientationListener(){
}

void BodyOrientationListener::run(){

  ros::Rate rate(1); // Hz: run the evaluation once per second

  while(ros::ok()){

    ros::spinOnce();

    // a std::map representing the bodies
    // detected through the body detection pipeline.
    // For each pair in the map, the first object
    // is the body id, while the second
    // represents information regarding the body
    // in the form of a weak pointer to a hri::Body
    // object
    auto bodies = hri_listener_.getBodies();

    bodies_facing_robot_.clear();

    for (auto& body: bodies){
      if (auto body_ptr = body.second.lock()){
        // Checking whether the transform from
        // the current body is actually available
        if (auto bodyTransform = body_ptr->transform()){

          // tf2::Transform objects declaration
          // for both the base frame to body frame
          // transformation and its inverse.
          // r2b = robot to body
          // b2r = body to robot
          tf2::Transform r2b_transform, b2r_transform;

          // transform, obtained from the hri::Body object
          // representing the body managed during this iteration,
          // is actually a geometry_msgs::TransformStamped object.
          // Here, the geometry_msgs::Transform object is extracted
          // and then converted into a tf2::Transform object, that is
          // r2b_transform
          geometry_msgs::Transform r2bGM_transform = bodyTransform->transform;
          tf2::fromMsg(r2bGM_transform, r2b_transform);

          // To understand if a body is oriented toward the robot,
          // the body frame to base frame
          // (that is, the robot frame) transformation is required.
          // Since the hri API provides the base frame to
          // body frame transformation, it needs to be inverted
          b2r_transform = r2b_transform.inverse();

          // to understand whether a body is oriented toward
          // the robot or not, the only required information is
          // the coordinates of the origin of the robot frame
          // expressed in body frame. This information
          // is the translation component of the body frame
          // to base frame transformation
          tf2::Vector3 translation = b2r_transform.getOrigin();

          // The length of the projection of the translation
          // vector on the XY plane of the body frame
          double translationXY_length =
            std::sqrt(std::pow(translation.x(), 2)+std::pow(translation.y(), 2));

          // The orientation of the body with respect to
          // the robot can be expressed as the angle
          // between the x axis of the body frame
          // and the projection on the XY plane of the
          // body frame to base frame translation vector
          // in body frame coordinates
          double body_orientation = std::acos(translation.x()/translationXY_length);

          // This if statement checks that:
          // - the robot is actually in front of the person,
          //   not behind
          // - the body angle previously computed is smaller
          //   than the initially set threshold
          // If this is the case, then the body id gets added
          // to the list of those oriented toward the robot
          if ((translation.x() > 0) && (body_orientation < threshold_))
            bodies_facing_robot_.push_back(body.first);
        }
      }
    }

    for(auto& body: bodies_facing_robot_)
      ROS_INFO_STREAM(body << " oriented toward the robot");

    rate.sleep();
  }
}

int main(int argc, char** argv){

  ros::init(argc, argv, "body_orientation_listener");

  // Defining the values for the Body Orientation
  // Listener initialisation. In this case, the
  // reference frame for the robot, or base frame,
  // is "camera_link", and the attention
  // cone semi-amplitude for each body
  // is 30 degrees
  std::string base_frame = "camera_link";
  double threshold = 30;

  BodyOrientationListener bol(base_frame, threshold);

  bol.run();

  return 0;
}
The code explained#
class BodyOrientationListener{

  hri::HRIListener hri_listener_;

  double threshold_;

  std::vector<std::string> bodies_facing_robot_;

  public:

    BodyOrientationListener(const std::string& base_frame, const double& threshold);
    ~BodyOrientationListener();

    void run();
};
This is the declaration of the BodyOrientationListener
class.
Here, hri_listener_ is an HRIListener object.
HRIListener
abstracts some of the ROS4HRI aspects:
for instance, it manages the callbacks reading the lists
of detected bodies, faces, voices and persons. This way,
you don’t have to define the same callbacks over
and over again in different ROS nodes. Later, you will
discover more about the HRIListener
and libhri
API capabilities.
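As a quick illustration, here is a minimal sketch of a node that only counts the faces and bodies currently tracked (it assumes the libhri accessors getFaces() and getBodies(), which return maps analogous to the one used later in this tutorial):

#include <hri/hri.h>
#include <ros/ros.h>

int main(int argc, char** argv){
  ros::init(argc, argv, "hri_listener_example");

  // the listener subscribes to the ROS4HRI topics on our behalf
  hri::HRIListener hri_listener;

  ros::Rate rate(1);
  while (ros::ok()){
    ros::spinOnce();

    // each getter returns a map from id to a weak pointer
    // to the corresponding object
    auto faces = hri_listener.getFaces();
    auto bodies = hri_listener.getBodies();

    ROS_INFO_STREAM("tracking " << faces.size() << " face(s) and "
                                << bodies.size() << " body(ies)");
    rate.sleep();
  }
  return 0;
}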
There are two more private members declared:
threshold_ defines the semi-amplitude of the attention cone, i.e. the cone with origin in the body frame within which the base frame must lie for the body to be considered oriented toward the robot.
bodies_facing_robot_ will contain the ids of the bodies oriented toward the robot.
There are three public functions declared:
BodyOrientationListener(const std::string& base_frame, const double& threshold) is the class constructor.
~BodyOrientationListener() is the class destructor.
void run() implements the whole evaluation process for the bodies’ orientation.
The definition of the class members happens in body_orientation_listener.cpp.
BodyOrientationListener::BodyOrientationListener(
    const std::string& base_frame, const double& threshold){
  // threshold, or attention cone semi-amplitude,
  // expressed in radians
  threshold_ = threshold/180*M_PI;

  // setting the reference frame for the
  // HRIListener object hri_listener_
  hri_listener_.setReferenceFrame(base_frame);
}
In the constructor, the threshold_ value is set from threshold, one of the constructor arguments, and converted to radians, as the trigonometric functions from the cmath library work in radians. Additionally, the hri_listener_ reference frame is set through the setReferenceFrame method using base_frame, the other constructor argument.
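For instance, with the 30 degree semi-amplitude used later in the main function, the value stored in threshold_ is:

\[
\theta_{threshold} = \frac{30}{180}\,\pi \approx 0.524\ \text{rad}
\]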
The whole evaluation process regarding
bodies orientation takes place in the
void BodyOrientationListener::run()
method.
ros::Rate rate(1);
Here, we set the rate for the evaluation process execution. In this case, it will be run once per second.
while(ros::ok()){
Once the process rate is set, we can start a loop that only stops when the node receives the shutdown signal. ros::ok() returns the OK flag value, which is true until the node receives the shutdown signal (and properly handles it).
ros::spinOnce();
As we will see later, the run method evaluates the body orientations and then sleeps for one second. However, during that second ROS messages keep being published by the other nodes in the network. To update the class members with the latest ROS messages published while the node was sleeping, we call ros::spinOnce().
auto bodies = hri_listener_.getBodies();
Once the node has spun, hri_listener_ holds up-to-date information regarding the bodies. We can retrieve this information as a std::map object containing pairs where the first element is the body id (std::string) and the second element is a weak pointer to an object describing some body-related aspects (hri::Body). Here you can see how the libhri API simplifies accessing the information regarding the bodies detected by the robot (i.e., by its body detection pipeline). The abstraction level introduced by the HRIListener class avoids any direct interaction between this code and the ROS4HRI topics: you directly obtain objects exposing information and utility functions to easily develop HRI applications.
bodies_facing_robot_.clear();
Before starting the geometric operations aimed at establishing which bodies are oriented toward the robot, we remove from the bodies_facing_robot_ vector all the body ids detected during the previous loop iteration.
for (auto& body: bodies){
  if (auto body_ptr = body.second.lock()){
We can now start iterating over the bodies that the robot is currently aware of, that is, those contained in the bodies object. Since the pointer to the Body object is a std::weak_ptr, we have to assume (temporary) ownership of it before accessing it. The lock() function allows us to do so, returning a std::shared_ptr object.
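If you are not familiar with this pattern, here is a minimal, libhri-independent sketch of how std::weak_ptr::lock() behaves:

#include <iostream>
#include <memory>

int main(){
  auto shared = std::make_shared<int>(42);
  std::weak_ptr<int> weak = shared;   // non-owning observer

  if (auto locked = weak.lock()){     // lock() returns a shared_ptr
    std::cout << "value: " << *locked << std::endl;
  }

  shared.reset();                     // the managed object is destroyed

  if (auto locked = weak.lock()){
    std::cout << "still alive" << std::endl;
  } else {
    std::cout << "expired: lock() returned an empty pointer" << std::endl;
  }
  return 0;
}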
if (auto bodyTransform = body_ptr->transform()){
One additional check regards the existence of the transform. Dealing with tf can sometimes be complicated, and even if we expect a transformation to exist, that is not always the case. To handle this situation, the Body::transform function returns a boost::optional object, i.e. the returned transform might be empty. The if statement ensures that the transform from the base frame to the body frame actually exists.
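If you have never used boost::optional, here is a minimal, self-contained sketch of the pattern (the half_if_even function is just a made-up example):

#include <iostream>
#include <boost/optional.hpp>

// a function that may or may not return a value
boost::optional<double> half_if_even(int n){
  if (n % 2 == 0)
    return n / 2.0;
  return boost::none;  // empty optional: no value available
}

int main(){
  // the if statement both checks that a value is present and gives access to it
  if (auto result = half_if_even(4))
    std::cout << "got " << *result << std::endl;

  if (auto result = half_if_even(3))
    std::cout << "this is not printed" << std::endl;
  else
    std::cout << "no value returned" << std::endl;
  return 0;
}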
This is another example of how the libhri API eases the coding experience. You might have expected to need tf objects to handle the transformations. Instead, libhri manages the tf aspects for the body frames and directly returns the transformation between the base frame and the body frame.
tf2::Transform r2b_transform, b2r_transform;
Now it’s time to declare the objects we are going
to use to represent the transforms involved in
the evaluation process. Here, r2b
stands
for robot to body, while b2r
stands for
body to robot.
geometry_msgs::Transform r2bGM_transform = bodyTransform->transform;
tf2::fromMsg(r2bGM_transform, r2b_transform);
bodyTransform is a stamped message, i.e. it contains information regarding the geometric transformation’s time, frame and child frame. This information is contained in bodyTransform->header. We are only interested in bodyTransform->transform, which contains the geometric information regarding the transformation and can easily be converted into a tf2::Transform object and, from there, inverted.
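For reference, this is a small sketch of the fields involved (the frame names and values here are placeholders, not what libhri actually returns):

#include <ros/ros.h>
#include <geometry_msgs/TransformStamped.h>
#include <tf2_geometry_msgs/tf2_geometry_msgs.h>

void transform_stamped_sketch(){
  geometry_msgs::TransformStamped stamped;
  stamped.header.stamp = ros::Time(0);      // when the transform was computed
  stamped.header.frame_id = "camera_link";  // parent frame (the base frame)
  stamped.child_frame_id = "body_xyz";      // hypothetical body frame name
  stamped.transform.translation.x = 1.0;    // geometric part: translation...
  stamped.transform.rotation.w = 1.0;       // ...and rotation (identity here)

  // only the geometric part is converted into a tf2::Transform
  tf2::Transform r2b_transform;
  tf2::fromMsg(stamped.transform, r2b_transform);
}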
b2r_transform = r2b_transform.inverse();
Now, the idea is to start working from the body’s perspective rather than the robot’s (i.e., inverting the previously obtained geometric transformation). In fact, to understand whether a person is facing the robot or not, you are not interested in the position of the person with respect to the robot, but rather in the position of the robot with respect to the body frame. If the body is oriented toward the robot, with a body frame following the REP 155 frame definition, then the origin of the robot frame (i.e., the base frame) expressed in body frame coordinates will have a positive \(x\) component and a relatively small \(y\) component.
In Fig. 1, you can see how the \(d_{b2r}\) vector, expressed in body frame coordinates, has a relatively small \(y\) component when compared to the \(x\) one. In Fig. 2, instead, you can see that \(d_{b2r}\) has a larger \(y\) component in body frame coordinates, suggesting that the human is not oriented toward the robot.
tf2::Vector3 translation = b2r_transform.getOrigin();
As shown before, once the original transform has been inverted, the only information required to understand whether a body is oriented toward the robot or not is the translation vector. Here, we extract it.
double translationXY_length =
  std::sqrt(std::pow(translation.x(), 2)+std::pow(translation.y(), 2));
In order to establish how much the \(x\) component contributes to the translation vector, we can compare their lengths. However, we are not interested in the \(z\) component of the body frame to base frame transformation, as it does not provide any information regarding the body orientation (in a scenario where humans are assumed to always be standing). For this reason, we compute the length of the projection of the translation vector on the \(xy\) plane.
double body_orientation = std::acos(translation.x()/translationXY_length);
Here, we compute the angle between the \(xy\) plane projection of \(d_{b2r}\) and the \(x\) axis. Now, this needs to be compared with threshold_ to establish whether the body is oriented toward the robot or not.
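In other words, denoting as \(x\) and \(y\) the components of the translation vector in body frame coordinates, the computed angle is

\[
\theta = \arccos\left(\frac{x}{\sqrt{x^{2}+y^{2}}}\right)
\]

and the body is considered oriented toward the robot when \(\theta < \theta_{threshold}\).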
if ((translation.x() > 0) && (body_orientation < threshold_))
  bodies_facing_robot_.push_back(body.first);
This if statement filters the bodies based on their orientation, where the filtering parameter is threshold_. One additional check is required on the \(x\) component, to verify that the robot is actually in front of the body. Once these two checks pass, the body id is inserted into the bodies_facing_robot_ vector.
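As a quick numeric check: if the base frame origin, expressed in body frame coordinates, is at \((x, y) = (2.0, 0.6)\) metres, then \(\theta = \arccos\left(2.0/\sqrt{2.0^{2}+0.6^{2}}\right) \approx 16.7°\). With the 30 degree threshold used below, this body would be added to bodies_facing_robot_; had it been at \((2.0, 1.6)\), the angle would be about \(38.7°\) and the body would be discarded.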
for(auto& body: bodies_facing_robot_)
  ROS_INFO_STREAM(body << " oriented toward the robot");
These lines iterate over the bodies_facing_robot_ elements, printing them. This is a basic instruction, and we invite you to extend or completely replace these lines with code that actually makes use of the extracted information.
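As one possible extension (just a sketch: the extra members, the std_msgs::String message type and the topic name are arbitrary choices, not part of the tutorial code), the ids could be published on a topic so that other nodes can react to them:

// in body_orientation_listener.h, two extra members would be needed:
//   ros::NodeHandle nh_;
//   ros::Publisher facing_pub_;
// and, in body_orientation_listener.cpp:
//   #include <std_msgs/String.h>

// in the constructor:
facing_pub_ = nh_.advertise<std_msgs::String>("bodies_facing_robot", 10);

// in run(), complementing (or replacing) the ROS_INFO_STREAM loop:
for (const auto& body : bodies_facing_robot_){
  std_msgs::String msg;
  msg.data = body;
  facing_pub_.publish(msg);
}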
rate.sleep();
Finally, the process sleeps for the remainder of the one-second cycle, according to the previously defined rate.
Then, you can find the node and BodyOrientationListener
object initialization in the main
function.
ros::init(argc, argv, "body_orientation_listener");
This is the node initialization command, where we set the name
and pass the command line arguments as received by the
main
function itself. In this case, there
are no command line arguments expected.
std::string base_frame = "camera_link";
double threshold = 30;
The two arguments required by BodyOrientationListener are initialized. In this case, we set the threshold to 30 degrees. You can play with this value and change it according to your application. The base_frame is set to a generic camera_link. The frame needs to be part of your robot’s tf frame tree.
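As a possible refinement (just a sketch, using the standard ros::NodeHandle::param API; the parameter names base_frame and threshold are arbitrary choices), these two values could be read from the parameter server instead of being hard-coded:

// in main(), after ros::init():
ros::NodeHandle private_nh("~");

std::string base_frame;
double threshold;

// fall back to the tutorial defaults when the parameters are not set
private_nh.param<std::string>("base_frame", base_frame, "camera_link");
private_nh.param<double>("threshold", threshold, 30.0);

BodyOrientationListener bol(base_frame, threshold);

The values could then be overridden at launch time, for instance with a private parameter such as _threshold:=45.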
Finally, a BodyOrientationListener
object
is initialized and started.
// BodyOrientationListener object initialisation
BodyOrientationListener bol(base_frame, threshold);

// Starting the detection process
bol.run();
Next steps#
The node you have developed through this tutorial does not really use the information about the bodies oriented toward the robot. You might think about a cool behaviour exploiting this information and implement it!
You might argue that face and gaze orientations could tell us more about a person’s engagement with the robot… and you would be right! Check this tutorial for a possible implementation of an engagement detection node based on face and gaze orientation.
If you’re interested in the Python version of this tutorial, check the pyhri tutorial.