Social perception with ROS4HRI#

ROS for Human-Robot Interaction (or ROS4HRI [ros4hri]) is the main API that ARI implements to represent information about the humans surrounding and interacting with the robot.

ROS4HRI refers to a set of conventions and tools that help develop Human-Robot Interaction capabilities. The specification (originally developed by PAL Robotics) is available online as the ROS REP-155.

Figure: The various IDs of ROS4HRI. The ROS4HRI API defines several types of identifiers (IDs).

On ARI, we have implemented the following main parts of the specification:

  1. we follow the ROS4HRI human model representation, as a combination of a permanent identity (person) and transient parts that are intermittently detected (e.g. face, skeleton, voice);

  2. in pal-sdk-23.1, we specifically support:

    • face detection and recognition (including extraction of facial landmarks);

    • single body detection, and 2D and 3D skeleton tracking;

    • speech recognition (without support for voice separation or voice identification);

    • probabilistic fusion of faces, bodies and voices;

  3. we follow the ROS4HRI topic naming conventions, with all human-related messages published under the /humans/ topic namespace;

  4. we follow the ROS4HRI kinematic model of the human and 3D tf frame conventions (naming, orientation), as specified in REP-155.
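As a concrete illustration of points 3 and 4, the snippet below builds the topic and TF frame names that REP-155 associates with a detected face. The `/humans/faces/tracked` list topic, the `roi` and `landmarks` sub-topics, and the `face_<id>`/`gaze_<id>` frames follow the specification's naming conventions; treat this as an indicative sketch and check REP-155 for the full list of sub-topics:

```python
def face_channels(face_id: str) -> dict:
    """Topic and TF frame names associated with one transient face ID,
    following the REP-155 naming conventions."""
    ns = f"/humans/faces/{face_id}"
    return {
        "list_topic": "/humans/faces/tracked",  # IDs of all currently tracked faces
        "roi": f"{ns}/roi",                     # region of interest in the camera image
        "landmarks": f"{ns}/landmarks",         # facial landmarks, if extracted
        "face_frame": f"face_{face_id}",        # TF frame attached to the face
        "gaze_frame": f"gaze_{face_id}",        # TF frame oriented along the gaze
    }

# Example: the face detector assigns the transient ID 'bf3d' to a new face.
channels = face_channels("bf3d")
```

Note that the face ID is transient: if the same person leaves and re-enters the field of view, a new face ID (and hence new topics and frames) is created, while the permanent person ID remains stable.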

In addition, ARI also provides:

  • gaze estimation

  • automatic engagement detection
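Engagement is reported per person. The sketch below shows how a client could react to the engagement level; the numeric constants are assumptions mirroring the `hri_msgs/EngagementLevel` message, so verify them against the `hri_msgs` package installed on your robot:

```python
# Assumed values mirroring hri_msgs/EngagementLevel; verify against the
# installed hri_msgs package before relying on them.
UNKNOWN, DISENGAGED, ENGAGING, ENGAGED, DISENGAGING = 0, 1, 2, 3, 4

def should_greet(level: int, already_greeted: bool) -> bool:
    """Greet a person once, as soon as they start engaging with the robot."""
    return level in (ENGAGING, ENGAGED) and not already_greeted

# The engagement level itself would be read from the per-person topic
# /humans/persons/<person_id>/engagement_status (subscription code omitted;
# see the Social perception topics page for the topic list).
```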

Limitations

These are the main limitations of the pal-sdk-23.1 social perception capabilities:

  • Person detection and face identification rely on external tools (Google MediaPipe and dlib). Like all vision-based algorithms, these tools do not always provide accurate estimates, and might mis-detect or mis-classify people.

  • Body detection is currently single-body only;

  • Faces need to be within a ~2 m range of the robot to be detected;

  • No voice separation, identification or localisation is currently available: from the robot's point of view, it always hears a single voice;

  • ARI does not yet implement the ‘group interactions’ part of the specification (e.g. no automatic group detection).

How to use ROS4HRI?#

The ROS4HRI topics are documented on Social perception topics.

To ease access to these (many!) topics, you can use pyhri (Python) or libhri (C++). These two open-source libraries are developed by PAL Robotics and documented on the ROS wiki.
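As an illustration of the kind of logic these libraries enable, the pure helper below pairs permanent person IDs with transient face IDs, which is the core of the ROS4HRI person/face model. The commented lines show roughly how the same data would come from pyhri's listener; the exact class and attribute names are assumptions, so check the pyhri documentation for the authoritative API:

```python
def visible_persons(tracked_face_ids, person_face_map):
    """Return the person IDs whose associated face is currently tracked.

    tracked_face_ids: set of transient face IDs currently detected.
    person_face_map: permanent person ID -> face ID (None if not seen).
    """
    return sorted(p for p, f in person_face_map.items()
                  if f is not None and f in tracked_face_ids)

# With pyhri, the equivalent data would come from a listener, roughly
# (assumption: names may differ between pyhri versions):
#
#   import rospy
#   from hri import HRIListener
#   rospy.init_node("social_perception_demo")
#   hri = HRIListener()
#   for person_id, person in hri.tracked_persons.items():
#       if person.face is not None:
#           print(f"{person_id} is currently visible")

print(visible_persons({"bf3d"},
                      {"anonymous_person_43": "bf3d", "skh75": None}))
```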

Next steps#

More tutorials and how-tos are available on the Social capabilities main page.

See also#

  • The REP-155 (aka ROS4HRI) specification, on the ROS website

  • The ROS wiki contains useful resources about ROS4HRI.

References#

[ros4hri]

Y. Mohamed and S. Lemaignan, “ROS for Human-Robot Interaction”, IROS 2021, doi: 10.1109/IROS51168.2021.9636816