Tutorial: Detect people around the robot (C++)#
🏁 Goal of this tutorial
Identify which persons are oriented toward the robot using libhri and its API. This is useful when you want your robot to be socially proactive and try to attract people who might be looking at it from a distance, unsure whether or not to come closer.
Pre-requisites#
This tutorial requires you to be familiar with the concepts of reference frames and 3D rotation and translation transformations.
This tutorial assumes that you already have an up and running body detection pipeline. The pipeline must be ROS4HRI compatible. If you do not have one available, you can start using hri_fullbody.
This tutorial also assumes that you have installed libhri. If you do not have it yet, please refer to the official repo.
Note
Your robot already comes with hri_fullbody and libhri installed.
The code#
We next provide an example of a node that detects bodies oriented towards the robot: first the header file declaring the node class, then its implementation.
1 #include <vector>
2 #include <string>
3
4 #include <rclcpp/rclcpp.hpp>
5
6 #include "hri/hri.hpp"
7
8 // This class detects and prints which bodies are oriented towards
9 // the robot, whose pose is represented by the base frame.
10 class BodyOrientationListener : public rclcpp::Node
11 {
12
13 // HRIListener object to handle information regarding
14 // the humans perceived by the robot
15 std::shared_ptr<hri::HRIListener> hri_listener_;
16
17 // timer to trigger the detections
18 rclcpp::TimerBase::SharedPtr timer_;
19
20 // attention cone semi-amplitude
21 double threshold_;
22
23 // vector storing the ids of the bodies detected as oriented
24 // toward the robot
25 std::vector<std::string> bodies_facing_robot_;
26
27 public:
28
29 BodyOrientationListener();
30 ~BodyOrientationListener();
31
32 // Function initialising the members
33 void init(const std::string & base_frame, const double & threshold);
34
35 // Function performing the core operations
36 // - body ids retrieval
37 // - body pose retrieval
38 // - computation of the bodies oriented toward the robot
39 void run();
40 };
1 #include <iostream>
2 #include <cmath>
3
4 #include "geometry_msgs/msg/transform_stamped.hpp"
5 #include "tf2_geometry_msgs/tf2_geometry_msgs.hpp"
6
7 #include "person_facing_robot/person_facing_robot.hpp"
8
9 BodyOrientationListener::BodyOrientationListener():
10 rclcpp::Node("body_orientation_listener")
11 {}
12
13 BodyOrientationListener::~BodyOrientationListener()
14 {}
15
16 void BodyOrientationListener::init(const std::string & base_frame, const double & threshold)
17 {
18 hri_listener_ = hri::HRIListener::create(shared_from_this());
19 timer_ = create_wall_timer(std::chrono::milliseconds(1000), std::bind(&BodyOrientationListener::run, this));
20 threshold_ = threshold/180*M_PI;
21 hri_listener_->setReferenceFrame(base_frame);
22 }
23
24 void BodyOrientationListener::run()
25 {
26 // a std::map containing the bodies currently detected
27 // through the body detection pipeline.
28 // For each pair in the map, the first element corresponds
29 // to the body id, while the second one holds the body-related
30 // information, as a shared pointer to a hri::Body object.
31 auto bodies = hri_listener_->getBodies();
32 bodies_facing_robot_.clear();
33
34 for (auto const & [body_id, body]: bodies)
35 {
36 std::cout << "body detected = " << body_id << std::endl;
37
38 // Checking whether the transform from the current body is actually
39 // available
40 if (auto bodyTransform = body->transform())
41 {
42 // geometry_msgs::Transform, obtained from the hri::Body object
43 // representing the body managed during this iteration.
44 geometry_msgs::msg::Transform r2bGM_transform = bodyTransform->transform;
45
46 // tf2::Transform objects declaration for both, the base frame to body
47 // frame transformation and its inverse:
48 // r2b = robot to body
49 // b2r = body to robot
50 tf2::Transform r2b_transform, b2r_transform;
51
52 // Here, the geometry_msgs::Transform object is extracted
53 // and then transformed into a tf2::Transform object
54 tf2::fromMsg(r2bGM_transform, r2b_transform);
55
56 // We need the body frame to base frame (i.e., the robot frame)
57 // transformation.
58 // Given that the hri API provides the base frame to
59 // body frame transformation, we invert it.
60 b2r_transform = r2b_transform.inverse();
61
62 // To compute whether a body is oriented towards the robot, we use
63 // the origin coordinates of the robot frame expressed in body frame.
64 // This information corresponds to the translation component of the
65 // body frame to base frame transformation.
66 tf2::Vector3 translation = b2r_transform.getOrigin();
67
68 // The length of the projection of the translation
69 // vector on the XY plane of the body frame
70 double translationXY_length =
71 std::sqrt(std::pow(translation.x(), 2)
72 + std::pow(translation.y(), 2));
73
74 // The orientation of the body with respect to the robot can be
75 // expressed as the angle between the x axis of the body frame
76 // and the projection on the XY plane of the body frame to base
77 // frame translation vector in body frame coordinates
78 double body_orientation =
79 std::acos(translation.x() / translationXY_length);
80
81 // If the robot is in front of the person (not behind) and the angle is
82 // below a given threshold, then we consider that the person is facing
83 // the robot
84 if ((translation.x() > 0) && (body_orientation < threshold_))
85 bodies_facing_robot_.push_back(body_id);
86 }
87 }
88
89 for (auto& body: bodies_facing_robot_)
90 std::cout << body << " oriented towards the robot\n";
91 }
92
93 int main(int argc, char * argv[])
94 {
95 rclcpp::init(argc, argv);
96 auto node = std::make_shared<BodyOrientationListener>();
97 node->init("default_cam", 20);
98 std::cout << "Node initialised!\n";
99 rclcpp::spin(node->get_node_base_interface());
100 rclcpp::shutdown();
101 return 0;
102 }
The code explained#
10 class BodyOrientationListener : public rclcpp::Node
11 {
12
13 // HRIListener object to handle information regarding
14 // the humans perceived by the robot
15 std::shared_ptr<hri::HRIListener> hri_listener_;
16
17 // timer to trigger the detections
18 rclcpp::TimerBase::SharedPtr timer_;
19
20 // attention cone semi-amplitude
21 double threshold_;
22
23 // vector storing the ids of the bodies detected as oriented
24 // toward the robot
25 std::vector<std::string> bodies_facing_robot_;
26
27 public:
28
29 BodyOrientationListener();
30 ~BodyOrientationListener();
31
32 // Function initialising the members
33 void init(const std::string & base_frame, const double & threshold);
34
35 // Function performing the core operations
36 // - body ids retrieval
37 // - body pose retrieval
38 // - computation of the bodies oriented toward the robot
39 void run();
40 };
This is the declaration of the BodyOrientationListener class.
Here, hri_listener_ is a HRIListener object, which abstracts some of the ROS4HRI aspects. For instance, it manages the callbacks reading the lists of detected bodies, faces, voices and persons. As such, you don’t have to define the same callbacks over and over again in different ROS nodes.
There are three more private objects declared:
- timer_ defines the running rate of the run function, i.e. how often the run function will be called.
- threshold_ defines the semi-amplitude of the attention cone, i.e. the cone, with origin in the body frame, within which the base frame must lie for the body to be considered oriented towards the robot.
- bodies_facing_robot_ will contain the ids of the bodies oriented toward the robot.
There are four public functions declared:
- BodyOrientationListener() is the class constructor.
- ~BodyOrientationListener() is the class destructor.
- void init(const std::string & base_frame, const double & threshold) initialises the different parameters of the class.
- void run() implements the evaluation process for the bodies’ orientation.
The definition of the class members happens in the body_orientation_listener.cpp file. Let’s have a look!
9 BodyOrientationListener::BodyOrientationListener():
10 rclcpp::Node("body_orientation_listener")
11 {}
The constructor only sets the name of the node.
16 void BodyOrientationListener::init(const std::string & base_frame, const double & threshold)
17 {
18 hri_listener_ = hri::HRIListener::create(shared_from_this());
19 timer_ = create_wall_timer(std::chrono::milliseconds(1000), std::bind(&BodyOrientationListener::run, this));
20 threshold_ = threshold/180*M_PI;
21 hri_listener_->setReferenceFrame(base_frame);
22 }
The init function initialises the different parameters:
- the threshold_ value is set to threshold, after converting it into radians.
- the hri_listener_ reference frame is set through the setReferenceFrame method, using the base_frame argument.
- a timer is set to call the main run function at a fixed rate (once per second).
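Note that hri::HRIListener::create() takes a shared pointer to the node, obtained with shared_from_this(). This call is only valid once the node is already owned by a std::shared_ptr (inside the constructor it would throw std::bad_weak_ptr), which is why the listener is created in a separate init function rather than in the constructor. A minimal sketch of the resulting two-stage pattern, mirrored by the main function at the end of this tutorial:

auto node = std::make_shared<BodyOrientationListener>();
// The node is now owned by a shared_ptr, so calling
// shared_from_this() inside init() is valid.
node->init("default_cam", 20.0);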
The whole evaluation process regarding the bodies’ orientation takes place in the void BodyOrientationListener::run() method.
31 auto bodies = hri_listener_->getBodies();
The hri_listener_ updates the information regarding the bodies. We can access this data through a std::map object, which contains pairs of body id (std::string) and pointer to a hri::Body object describing the body-related information.
You can see how the libhri API simplifies accessing information regarding the bodies detected by the robot (i.e., by its body detection pipeline). The abstraction level introduced by the HRIListener class avoids direct interaction between the code and the ROS4HRI topics. Instead, it provides an easy interface with direct access to objects containing the information, plus utility functions that simplify the development of HRI applications.
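To appreciate the difference, here is a sketch of what retrieving the body ids would look like without libhri, by subscribing directly to the ROS4HRI /humans/bodies/tracked topic (message type hri_msgs/IdsList):

#include <rclcpp/rclcpp.hpp>
#include <hri_msgs/msg/ids_list.hpp>

// Without libhri: subscribe to the tracked-bodies id list yourself.
// You would still need further subscriptions (and tf lookups) per body
// to access its actual data.
class RawBodyIds : public rclcpp::Node
{
public:
  RawBodyIds() : Node("raw_body_ids")
  {
    sub_ = create_subscription<hri_msgs::msg::IdsList>(
      "/humans/bodies/tracked", 10,
      [this](const hri_msgs::msg::IdsList::SharedPtr msg) {
        for (const auto & id : msg->ids)
          RCLCPP_INFO(get_logger(), "tracked body: %s", id.c_str());
      });
  }

private:
  rclcpp::Subscription<hri_msgs::msg::IdsList>::SharedPtr sub_;
};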
32 bodies_facing_robot_.clear();
We clear from the bodies_facing_robot_ vector the body ids detected in the previous iteration.
34 for (auto const & [body_id, body]: bodies)
We can now start iterating over the bodies the robot is currently aware of,
that is those contained in the bodies
object.
40 if (auto bodyTransform = body->transform())
We check whether a body transform exists for the current body. Dealing with tf can sometimes be complicated: even when we expect a transformation to exist, it is not always available. Since transform() returns an optional value, the if statement ensures that the transform between the base frame and the body frame actually exists before we use it.
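Written out explicitly, the check above is equivalent to the following sketch (assuming, as the if-with-initializer pattern suggests, that transform() returns a std::optional):

// Inside the loop of run(), an explicit equivalent of
// `if (auto bodyTransform = body->transform())`:
std::optional<geometry_msgs::msg::TransformStamped> maybe_transform =
  body->transform();
if (maybe_transform.has_value())
{
  // safe to access the stamped transform here
  const geometry_msgs::msg::TransformStamped & transform_msg = *maybe_transform;
}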
This is another example of how the libhri API eases the coding experience. You might have expected to need tf objects to handle the transformations. Instead, libhri manages the tf aspects for the body frames and directly returns the transformation between the base frame and the body frame.
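For comparison, here is a minimal sketch of the manual lookup that libhri performs for you. The base_link frame name is illustrative; per REP 155, body frames follow the body_<id> naming convention:

#include <optional>
#include <string>

#include <geometry_msgs/msg/transform_stamped.hpp>
#include <tf2_ros/buffer.h>

// Manual equivalent of body->transform(): looking up the base frame
// to body frame transform ourselves.
std::optional<geometry_msgs::msg::TransformStamped> lookupBodyTransform(
  const tf2_ros::Buffer & buffer, const std::string & body_id)
{
  try {
    return buffer.lookupTransform(
      "base_link", "body_" + body_id, tf2::TimePointZero);
  } catch (const tf2::TransformException &) {
    // the transform is not (yet) available
    return std::nullopt;
  }
}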
44 geometry_msgs::msg::Transform r2bGM_transform = bodyTransform->transform;
45
46 // tf2::Transform objects declaration for both, the base frame to body
47 // frame transformation and its inverse:
48 // r2b = robot to body
49 // b2r = body to robot
50 tf2::Transform r2b_transform, b2r_transform;
51
52 // Here, the geometry_msgs::Transform object is extracted
53 // and then transformed into a tf2::Transform object
54 tf2::fromMsg(r2bGM_transform, r2b_transform);
bodyTransform holds a stamped message, i.e. it contains information regarding the geometric transformation’s time, frame and child frame. This information is stored in bodyTransform->header. We are only interested in bodyTransform->transform, which contains the geometric information of the transformation.
We declare the objects we are going to use to represent the transforms involved
in the evaluation process. Here, r2b
stands for robot to body, while
b2r
stands for body to robot.
We extract the transformation information from bodyTransform->transform and convert it into a tf2::Transform object.
60 b2r_transform = r2b_transform.inverse();
Now the idea is to start working from the body’s perspective rather than the robot’s (i.e., we invert the previously obtained geometric transformation). In fact, to understand whether a person is facing the robot, you are not interested in the position of the person with respect to the robot, but rather in the position of the robot with respect to the body frame. If the body is oriented towards the robot, with a body frame following the REP 155 frame definition, then the origin of the robot frame (i.e., the base frame), expressed in body frame coordinates, will have a positive \(x\) component and a relatively small \(y\) component.
In Fig. 1, you can see how the \(d_{b2r}\) vector, expressed in body frame coordinates, has a relatively small \(y\) component when compared to the \(x\) one. In contrast, in Fig. 2 you can see that \(d_{b2r}\) has a greater \(y\) component in body frame coordinates, suggesting that the human is not oriented toward the robot.
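To make the geometry concrete, here is a small, self-contained sketch (with illustrative values) that builds a robot-to-body transform for a person standing 2 m in front of the robot with a 170 degree yaw (i.e., almost facing back towards the robot), and prints the robot origin in body frame coordinates:

#include <cmath>
#include <cstdio>

#include <tf2/LinearMath/Quaternion.h>
#include <tf2/LinearMath/Transform.h>

int main()
{
  // Illustrative robot-to-body transform: the body is 2 m in front
  // of the robot, rotated by 170 degrees around the vertical axis.
  tf2::Quaternion q;
  q.setRPY(0.0, 0.0, 170.0 / 180.0 * M_PI);
  tf2::Transform r2b(q, tf2::Vector3(2.0, 0.0, 0.0));

  // Robot origin expressed in body frame coordinates.
  tf2::Vector3 t = r2b.inverse().getOrigin();
  std::printf("robot in body frame: x=%.2f y=%.2f\n", t.x(), t.y());
  // Prints roughly x=1.97 y=0.35: a positive x and a comparatively
  // small y, i.e. this body counts as oriented towards the robot.
  return 0;
}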
66 tf2::Vector3 translation = b2r_transform.getOrigin();
As shown above, once the original transform has been inverted, the only information required to establish whether a body is oriented towards the robot is the translation vector. The line above extracts this information.
70 double translationXY_length =
71 std::sqrt(std::pow(translation.x(), 2)
72 + std::pow(translation.y(), 2));
To establish how much the \(x\) component contributes to the translation vector, we compare their lengths. However, we are not interested in the \(z\) component of the body frame to base frame transformation, as it provides no information about the body orientation (in a scenario where humans are assumed to be standing). For this reason, we compute the length of the projection of the translation vector onto the \(xy\) plane.
78 double body_orientation =
79 std::acos(translation.x()/translationXY_length);
We compute the angle between the \(xy\)-plane projection of \(d_{b2r}\) and the \(x\) axis. This angle is then compared with the threshold_ value to establish whether the body is oriented toward the robot or not.
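As a quick numerical check, using the illustrative values from the sketch above: with a translation of \((1.97, 0.35)\), the projection length is \(\sqrt{1.97^2 + 0.35^2} \approx 2.0\), and the resulting angle is \(\arccos(1.97 / 2.0) \approx 0.18\) rad, about 10 degrees, well within the 20 degree threshold used later in this tutorial.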
84 if ((translation.x() > 0) && (body_orientation < threshold_))
85 bodies_facing_robot_.push_back(body_id);
This statement filters the bodies based on their orientation, where the
filtering parameter is threshold_
. One additional check is required on the
\(x\) component, to verify that the robot is actually in front of the body
(not at the back). If both conditions are fulfilled, the body id is added to the
bodies_facing_robot_
vector.
89 for (auto& body: bodies_facing_robot_)
90 std::cout << body << " oriented towards the robot\n";
We finally print out the body ids considered to be facing the robot.
93 int main(int argc, char * argv[])
94 {
95 rclcpp::init(argc, argv);
96 auto node = std::make_shared<BodyOrientationListener>();
97 node->init("default_cam", 20);
98 std::cout << "Node initialised!\n";
99 rclcpp::spin(node->get_node_base_interface());
100 rclcpp::shutdown();
101 return 0;
102 }
The main function first creates the node and then initialises it, providing the two required arguments. In this example, we set the base_frame to a generic default_cam value; the frame needs to be part of your robot’s tf frame tree. The threshold is set to 20 degrees: you can play with this value and change it according to your application’s needs. Finally, we spin the node.
Next steps#
The node you have developed through this tutorial does not really use the information about the bodies oriented toward the robot. You might think about a cool behaviour exploiting this information and implement it!
You might argue that face and gaze orientations could tell us more about a person’s engagement with the robot… and you would be right! Check this tutorial for a possible implementation of an engagement detection node based on face and gaze orientation.
If you’re interested in the Python version of this tutorial, check the pyhri tutorial.