Social perception
The robot can detect and identify faces, detect 2D and 3D skeletons, perform speech and intent recognition, and fuse these social signals together to track persons across modalities.
The robot’s social perception pipeline is compliant with the ROS4HRI standard (REP-155).
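As an illustration of what REP-155 compliance means in practice, the sketch below builds the topic names under which a ROS4HRI pipeline is expected to publish its detections. The `/humans/<feature>/tracked` and `/humans/<feature>/<id>/<name>` layout is taken from our reading of REP-155; the helper names (`tracked_topic`, `feature_topic`) and the example ID are hypothetical and are not part of any PAL or ROS API. The code only formats strings and does not require a ROS installation.

```python
# Sketch only: topic layout assumed from REP-155 (ROS4HRI).
# Helper names and the example ID below are hypothetical.

HUMANS_NS = "/humans"
FEATURES = ("faces", "bodies", "voices", "persons")


def tracked_topic(feature: str) -> str:
    """Return the topic listing the IDs currently tracked for a feature type."""
    if feature not in FEATURES:
        raise ValueError(f"unknown feature type: {feature!r}")
    return f"{HUMANS_NS}/{feature}/tracked"


def feature_topic(feature: str, feature_id: str, name: str) -> str:
    """Return the topic of one sub-feature (e.g. 'roi') of a given face, body, voice or person."""
    if feature not in FEATURES:
        raise ValueError(f"unknown feature type: {feature!r}")
    return f"{HUMANS_NS}/{feature}/{feature_id}/{name}"


# List of currently tracked faces
print(tracked_topic("faces"))
# Region of interest of one (hypothetical) face ID
print(feature_topic("faces", "abcde", "roi"))
# Face currently associated with one (hypothetical) person ID
print(feature_topic("persons", "abcde", "face_id"))
```

Group-related topics are deliberately left out of `FEATURES`, since the group interactions part of the specification is not implemented on PAL’s robots.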
Note that the entire pipeline runs on-board; no cloud-based services are used (and consequently, no Internet connection is required).
The following figure provides an overview of the pipeline:
Attention
Limitations
These are the main limitations of the pal-sdk-23.12 social perception capabilities:
Person detection and face identification rely on external tools (Google MediaPipe and dlib). Like all vision-based algorithms, these tools do not always provide accurate estimates, and may mis-detect or mis-classify people;
Body detection is currently single-body only;
Faces need to be within a ~2m range of the robot to be detected;
No voice separation, identification or localisation is currently available: from the robot’s point of view, all speech comes from one and the same voice;
PAL’s robots do not yet implement the ‘group interactions’ part of the specification (e.g. no automatic group detection).
General documentation
Tutorials and how-tos
API reference