Tutorial: Detect people oriented toward the robot (Python)#
Goal of this tutorial
By the end of this tutorial, you will know how to establish which bodies are oriented towards the robot using pyhri and its API. This information can be useful when you want the robot to be socially proactive and try to attract people that might be looking at it from a distance, unsure whether or not to get closer to the robot.
Prerequisites#
This tutorial requires you to be familiar with the concepts of reference frames, 3D rotation transformations and 3D translation transformations.
This tutorial assumes that you already have an up and running body detection pipeline. The pipeline must be ROS4HRI compatible. If you do not have one available, you can start using hri_fullbody.
This tutorial assumes that you have already installed pyhri. If you do not have it yet, please refer to the official repo.
Note
Your robot already comes with hri_fullbody and pyhri installed.
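Before diving into the code, you can quickly check that the body detection pipeline and pyhri are working together. The following minimal node is only a sketch, relying solely on the pyhri calls used later in this tutorial: it prints the ids of the currently detected bodies once per second. If the list stays empty while someone stands in front of the camera, check your pipeline first.

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
import hri


class PipelineCheck(Node):
    """Minimal sanity check: prints the detected body ids every second."""

    def __init__(self):
        super().__init__("pipeline_check")
        # the HRIListener subscribes to the ROS4HRI topics for us
        self.hri_listener = hri.HRIListener("pipeline_check_listener")
        self.timer = self.create_timer(1.0, self.check)

    def check(self):
        # bodies is a dictionary mapping body ids to hri.Body objects
        print("Detected bodies: ", list(self.hri_listener.bodies.keys()))


if __name__ == "__main__":
    rclpy.init(args=None)
    rclpy.spin(PipelineCheck())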
The code#
We next provide an example of a node that detects bodies oriented towards the robot.
#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
import tf_transformations as t

import numpy as np
import hri


class BodyOrientationListener(Node):

    def __init__(self, base_frame="camera_link", threshold=30):
        """ Constructor, defines some of the class
        objects that will be used later in the code
        """

        super().__init__('body_orientation_listener')
        # HRIListener objects help manage
        # some ROS4HRI related aspects.
        # Refer to the official API documentation
        # for a better understanding
        self.hri_listener = hri.HRIListener("hri_listener_node")
        self.hri_listener.set_reference_frame(base_frame)

        # half of the amplitude of the attention cone
        self.threshold = threshold

        # the period (in seconds) between two
        # executions of the run function
        timer_period = 1  # seconds
        self.timer = self.create_timer(timer_period, self.run)

    def run(self):
        """ The run function implements the main functionality of the
        BodyOrientationListener object: computing which bodies
        are oriented toward the robot. The base_frame specified
        during the object initialisation represents the robot.
        """

        # an iterable object representing the bodies
        # detected through the body detection pipeline
        bodies = self.hri_listener.bodies

        # the list of the ids representing
        # the bodies oriented toward the robot
        bodies_facing_robot = []

        # this loop iterates over every single body
        # detected by the body detection pipeline
        for body in bodies:
            print("Evaluating body id: %s" % body)
            # the body frame to base frame (that is, the robot frame)
            # transformation is required. Since the pyhri API provides
            # the base frame to body frame transformation (as a
            # TransformStamped object), we have to invert it
            if bodies[body].valid:  # make sure it still has valid info
                transform = bodies[body].transform
            else:
                print("Body %s information not valid anymore" % body)
                continue

            # the translational and rotational components
            # of the base frame to body frame transformation,
            # expressed as a 3D vector and a quaternion
            trans = transform.transform.translation
            rot = transform.transform.rotation

            # the homogeneous transformation matrix representing
            # the base frame to body frame transformation
            # (translation only)
            translation_matrix = t.translation_matrix((trans.x,
                                                       trans.y,
                                                       trans.z))

            # the homogeneous transformation matrix representing
            # the base frame to body frame transformation
            # (rotation only)
            quaternion_matrix = t.quaternion_matrix((rot.x,
                                                     rot.y,
                                                     rot.z,
                                                     rot.w))

            # the homogeneous transformation matrix representing
            # the base frame to body frame transformation
            transform = t.concatenate_matrices(translation_matrix,
                                               quaternion_matrix)

            # the inverse of the transform matrix,
            # that is, the homogeneous transformation matrix
            # for the body frame to base frame transformation
            inv_transform = t.inverse_matrix(transform)

            # b2r = body to robot
            # the x and y components of the body frame
            # to base frame translation transformation
            b2r_translation_x = inv_transform[0, 3]
            b2r_translation_y = inv_transform[1, 3]

            # the norm of the projection on the xy plane
            # of the body frame to base frame translation
            b2r_xy_norm = np.linalg.norm([b2r_translation_x,
                                          b2r_translation_y],
                                         ord=2)

            # this if statement checks whether the base frame
            # lies inside the body frame-based cone of attention
            # with 2*threshold amplitude or not. When it does,
            # the body id is appended to the list of the bodies
            # oriented toward the robot
            if ((np.arccos(b2r_translation_x/b2r_xy_norm) <
                 (self.threshold/180*np.pi)) and
                    b2r_translation_x > 0):
                bodies_facing_robot.append(body)

        # print the list of bodies oriented towards the robot
        print("Bodies facing the robot: ", bodies_facing_robot)


if __name__ == "__main__":
    # initialise rclpy library
    rclpy.init(args=None)

    # instantiate a BodyOrientationListener
    bol = BodyOrientationListener(
        base_frame="default_cam",  # If using webcam
        threshold=20)

    # spin BodyOrientationListener so its callback (the run function) is
    # called
    rclpy.spin(bol)
The code explained#
Note
In this section, we will use the pyhri and tf APIs. If you are not familiar with them, or want to take a deeper dive, refer to the pyhri API documentation and the tf API documentation.
As a first step, the code defines a new class, BodyOrientationListener. This class processes the bodies' orientation to establish which ones are oriented towards the robot.
class BodyOrientationListener(Node):

    def __init__(self, base_frame="camera_link", threshold=30):
        """
        Constructor, defines some of the class
        objects that will be used later in the code
        """
        super().__init__('body_orientation_listener')
        # HRIListener objects help manage
        # some ROS4HRI related aspects.
        # Refer to the official API documentation
        # for a better understanding
        self.hri_listener = hri.HRIListener("hri_listener_node")
        self.hri_listener.set_reference_frame(base_frame)

        # half of the amplitude of the attention cone
        self.threshold = threshold

        # the period (in seconds) between two
        # executions of the run function
        timer_period = 1  # seconds
        self.timer = self.create_timer(timer_period, self.run)
Here, self.hri_listener is an HRIListener object. HRIListener abstracts away some of the ROS4HRI aspects: for instance, it manages the callbacks reading the lists of detected bodies, faces, voices and persons. This way, you don't have to define the same callbacks over and over again in different ROS nodes. Later, you will discover more about the HRIListener and pyhri API capabilities.
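As a quick taste, the same listener gives access to the other ROS4HRI entities in the same way as the bodies. The snippet below is only a sketch: faces, voices and tracked_persons are assumed attribute names mirroring bodies, so double-check them against the pyhri API documentation.

# inside a method of the node, assuming self.hri_listener exists;
# `faces`, `voices` and `tracked_persons` are assumed attribute names,
# mirroring `bodies` -- check the pyhri API documentation
print("body ids:  ", list(self.hri_listener.bodies.keys()))
print("face ids:  ", list(self.hri_listener.faces.keys()))
print("voice ids: ", list(self.hri_listener.voices.keys()))
print("person ids:", list(self.hri_listener.tracked_persons.keys()))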
We also initialise two more objects:

- self.threshold defines half of the amplitude of the attention cone, i.e. the cone with origin in the body frame within which the base frame must lie for the body to be considered oriented toward the robot.
- self.timer defines a timer that calls the run function every timer_period seconds (in this case, 1 second).
The core of the process happens entirely in the run function, which implements the geometric reasoning used to evaluate which bodies are oriented towards the robot.
bodies = self.hri_listener.bodies

bodies_facing_robot = []

for body in bodies:
The HRIListener object provides a dictionary of ids and objects representing the detected bodies. bodies_facing_robot is initialised as an empty list, which will be used to keep track of the bodies oriented towards the robot. We then loop over the bodies.
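Since bodies is a plain dictionary mapping ids to hri.Body objects, the loop could equivalently be written with .items(), retrieving the id and the body object at once and avoiding the repeated bodies[body] look-ups:

# equivalent iteration over the bodies dictionary
for body_id, body in self.hri_listener.bodies.items():
    print("Evaluating body id: %s" % body_id)
    # ... use `body` directly instead of bodies[body]

Now, let's see what happens inside the loop!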
if bodies[body].valid:  # make sure it still has valid info
    transform = bodies[body].transform
else:
    print("Body %s information not valid anymore" % body)
    continue
Here you can see another example where the pyhri API simplifies the management of human-related information. Since the body detection pipeline is ROS4HRI compatible, it publishes a frame for each body, representing it in space. The frame naming convention is body_<body_id>. There is no need to create a tf2_ros.TransformListener object, since the hri.Body object already provides the transformation between the base frame and the aforementioned body frame. Note that, before accessing the transformation, we make sure the body still carries valid data.
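For comparison, without pyhri you would have to set up the tf2 machinery yourself, along the lines of the following sketch (the body_id variable is illustrative, and the argument order and frame direction should be double-checked against the tf2 documentation):

import rclpy.time
from tf2_ros import Buffer, TransformListener

# inside the node constructor
self.tf_buffer = Buffer()
self.tf_listener = TransformListener(self.tf_buffer, self)

# later, for a given body id, look up the base frame to body frame
# transform manually (this may raise if the frame is not available yet)
transform = self.tf_buffer.lookup_transform(
    "camera_link",          # target frame: the base frame
    "body_%s" % body_id,    # source frame: the body frame
    rclpy.time.Time())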
Now, the idea is to reason from the body's perspective, not the robot's. In fact, to understand whether a person is facing the robot, you are not interested in the position of the person with respect to the robot, but rather in the position of the robot with respect to the body frame. If the body is oriented towards the robot, with a body frame following the REP 155 frame definition, then the robot frame (i.e., the base frame) origin expressed in body frame coordinates will have a positive \(x\) component and a relatively small \(y\) component.
In Fig. 1, you can see how \(d_{b2r}\) (where \(b2r\) stands for body-to-robot), expressed in body frame coordinates, has a relatively small \(y\) component when compared to the \(x\) one. In contrast, in Fig. 2 you can see that \(d_{b2r}\) has a larger \(y\) component in body frame coordinates, suggesting that the human is not oriented toward the robot.
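To make this reasoning concrete, here is a small standalone sketch with made-up numbers: a robot located at \(d_{b2r} = (1.2, 0.3)\) metres in body frame coordinates lies about 14 degrees away from the body \(x\) axis, well inside a 30-degree half-amplitude attention cone.

import numpy as np

# hypothetical robot position, expressed in body frame coordinates (metres)
b2r_translation_x, b2r_translation_y = 1.2, 0.3
threshold = 30  # degrees, half of the attention cone amplitude

# angle between the body x axis and the body-to-robot vector
angle = np.degrees(np.arctan2(abs(b2r_translation_y), b2r_translation_x))

print("angle: %.1f degrees" % angle)            # ~14.0
print("facing the robot:", angle < threshold)   # True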
The transform
previously highlighted in the code provides
the base frame to body frame transformation. Therefore, to
execute the geometric reasoning we have previously described,
the transformation has to be inverted.
trans = transform.transform.translation
rot = transform.transform.rotation

translation_matrix = t.translation_matrix((trans.x,
                                           trans.y,
                                           trans.z))

quaternion_matrix = t.quaternion_matrix((rot.x,
                                         rot.y,
                                         rot.z,
                                         rot.w))

transform = t.concatenate_matrices(translation_matrix,
                                   quaternion_matrix)

inv_transform = t.inverse_matrix(transform)
You obtain the translation and rotation homogeneous matrices, multiply them (t.concatenate_matrices) and finally invert the result, to obtain the homogeneous matrix representing the transformation from the body frame to the robot frame in body frame coordinates. The fourth column of this matrix represents the base frame origin expressed in body frame coordinates, that is, \(d_{b2r}\) from Fig. 1 and Fig. 2.
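If you want to convince yourself of this, you can run a small standalone check with a made-up transform (the numbers below are arbitrary, chosen so that the body is 2 metres in front of the robot and rotated 180 degrees around \(z\), i.e. facing it):

import numpy as np
import tf_transformations as t

# made-up base frame to body frame transform
translation_matrix = t.translation_matrix((2.0, 0.0, 0.0))
quaternion_matrix = t.quaternion_matrix(
    t.quaternion_from_euler(0.0, 0.0, np.pi))

transform = t.concatenate_matrices(translation_matrix, quaternion_matrix)
inv_transform = t.inverse_matrix(transform)

# the fourth column holds d_b2r: the robot sits 2 metres in front of
# the body (positive x in body frame coordinates), so the body faces it
print(inv_transform[0:3, 3])  # ~[2. 0. 0.]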
At this point, all you have to do is check whether the base frame origin lies inside the attention cone defined by the threshold parameter set earlier.
# b2r = body to robot
b2r_translation_x = inv_transform[0, 3]
b2r_translation_y = inv_transform[1, 3]

b2r_xy_norm = np.linalg.norm([b2r_translation_x,
                              b2r_translation_y],
                             ord=2)

if ((np.arccos(b2r_translation_x/b2r_xy_norm) <
     (self.threshold/180*np.pi)) and
        b2r_translation_x > 0):
    bodies_facing_robot.append(body)
We first compute the norm of the projection of \(d_{b2r}\) onto the body frame \(xy\) plane. Then, we use it to compute the angle between the body frame \(x\) axis and the \(d_{b2r}\) vector. If this angle is smaller than the threshold, the body id is appended to bodies_facing_robot.
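As a side note, an equivalent way of writing this test (not what the tutorial code uses) relies on np.arctan2, which takes care of both the normalisation and the b2r_translation_x > 0 condition at once:

# equivalent check: the angle between the body x axis and d_b2r,
# computed with arctan2, must be below the threshold
angle = np.arctan2(abs(b2r_translation_y), b2r_translation_x)
if angle < np.radians(self.threshold):
    bodies_facing_robot.append(body)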
1 print("Bodies facing the robot: ", bodies_facing_robot)
The last line of the run
function prints the ids of the bodies oriented
towards the robot.
In main
, you can find the instructions to initialise the context and the
node.
if __name__ == "__main__":
    # initialise rclpy library
    rclpy.init(args=None)

    # instantiate a BodyOrientationListener
    bol = BodyOrientationListener(
        base_frame="default_cam",  # If using webcam
        threshold=20)

    # spin BodyOrientationListener so its callback (the run function) is
    # called
    rclpy.spin(bol)
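A slightly more complete entry point would also release the node and shut the rclpy context down when the program is interrupted; a possible variant, functionally equivalent to the listing above:

if __name__ == "__main__":
    rclpy.init(args=None)

    bol = BodyOrientationListener(base_frame="default_cam", threshold=20)

    try:
        # spin the node so the timer callback (run) keeps being called
        rclpy.spin(bol)
    except KeyboardInterrupt:
        pass
    finally:
        # release the node and the ROS 2 context on exit
        bol.destroy_node()
        rclpy.shutdown()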
Next steps#
The node you have developed through this tutorial does not really use the information about the bodies oriented toward the robot. You might think about a cool behaviour exploiting this information and implement it!
You might argue that face and gaze orientations could tell us more about a person's engagement with the robot… and you would be right! Check this tutorial for a possible implementation of an engagement detection node based on face and gaze orientation.
If you're interested in the C++ version of this tutorial, check the libhri tutorial.