👥 Social perception#
The robot can detect and identify faces, detect 2D and 3D skeletons, perform speech and intent recognition, and fuse together these various social signals to track multi-modal persons.
The robot's social perception pipeline is compliant with the ROS4HRI standard (ROS REP-155).
Note that the entire pipeline runs on-board; no cloud-based services are used (and consequently, no Internet connection is required).
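Because the pipeline follows REP-155, its output can be consumed with standard ROS 2 tooling. The following is a minimal sketch (not PAL-provided code) of a node that lists the persons currently tracked by the pipeline, assuming the REP-155 topic /humans/persons/tracked and the hri_msgs package (IdsList message) are available on the robot:

```python
# Minimal sketch: list the persons currently tracked by the social
# perception pipeline. Assumes the REP-155 topic /humans/persons/tracked
# and the hri_msgs package (IdsList message) are available on the robot.
import rclpy
from rclpy.node import Node
from hri_msgs.msg import IdsList


class TrackedPersonsListener(Node):
    def __init__(self):
        super().__init__("tracked_persons_listener")
        # Per REP-155, the IDs of all currently tracked persons are
        # published on this topic.
        self.create_subscription(
            IdsList, "/humans/persons/tracked", self.on_persons, 10)

    def on_persons(self, msg: IdsList):
        if msg.ids:
            self.get_logger().info("Tracked persons: " + ", ".join(msg.ids))
        else:
            self.get_logger().info("No person currently tracked")


def main():
    rclpy.init()
    rclpy.spin(TrackedPersonsListener())


if __name__ == "__main__":
    main()
```

The same pattern applies to the other REP-155 topics (e.g. /humans/faces/tracked or /humans/bodies/tracked), which also publish IdsList messages.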
The following figure provides an overview of the pipeline:
Attention
Limitations
These are the main limitations of the PAL OS 25.01 social perception capabilities:
Person detection and face identification rely on external tools (Google MediaPipe and dlib). Like all vision-based algorithms, these tools do not always provide accurate estimates and may mis-detect or mis-classify people.
Body detection is limited to 5 simultaneously detected bodies (see the sketch after this list for a way to monitor how many bodies are currently tracked);
No voice separation, identification or localisation is currently available: from the robot's point of view, it always hears a single voice;
The current robot system does not yet implement the “group interactions” part of the specification (e.g. no automatic group detection).
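As an illustration of the body-detection limit, the following hedged sketch (again assuming the hri_msgs package and the REP-155 topic /humans/bodies/tracked are available) logs how many bodies the pipeline is currently tracking; with PAL OS 25.01 this count is expected to stay at 5 or below:

```python
# Sketch: monitor how many bodies the pipeline is currently tracking.
# Assumes the REP-155 topic /humans/bodies/tracked and hri_msgs are available.
import rclpy
from rclpy.node import Node
from hri_msgs.msg import IdsList


def main():
    rclpy.init()
    node = Node("tracked_bodies_monitor")
    node.create_subscription(
        IdsList, "/humans/bodies/tracked",
        # With PAL OS 25.01, len(msg.ids) is expected to stay <= 5.
        lambda msg: node.get_logger().info(f"{len(msg.ids)} bodies tracked"),
        10)
    rclpy.spin(node)


if __name__ == "__main__":
    main()
```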
General documentation#
Tutorials and how-tos#
API reference#