[‼️ROS1] Microphone array and audio recording#
TIAGo Pro and ARI have a ReSpeaker Mic Array V2.0 consisting of 4 microphones, positioned in the head (TIAGo Pro) or in the torso, just below the touch-screen (ARI).
Caution
TIAGo (v1) features a simple microphone. Some of the features available on robots equipped with the ReSpeaker microphone array (like sound source localisation or noise cancellation) are therefore not available on this robot.
![../_images/respeaker.png](../_images/respeaker.png)
On ARI, the microphone is connected via USB to the main PC. The PC then outputs audio through the robot’s two speakers, located on each side of the torso, which include a 30 W amplifier.
![../_images/audio_flow.png](../_images/audio_flow.png)
Main hardware features:
Supports USB Audio Class 1.0 (UAC 1.0)
Four-microphone array
Sensitivity: -26 dBFS (omnidirectional)
Signal-to-Noise Ratio: 63 dB
12 programmable RGB LED indicators
Note
By default, the microphone LEDs are configured to turn blue when the microphone hears something. In addition, a light blue LED indicates the current sound source direction.
The parameter enable_leds can be set to False in the respeaker_ros launch file to disable this behaviour.
The ReSpeaker microphone also implements several audio processing functions directly on the hardware:
far-field Voice Activity Detection (up to 5 m away);
Direction of Arrival (DoA) estimation;
Beamforming (BF) to focus on sound coming from a specific source;
noise suppression;
de-reverberation;
acoustic echo cancellation, enabling the robot to ignore its own voice.
ROS API#
PAL’s robots rely on a heavily modified version of the open-source respeaker_ros driver.
It exposes the following topics:
/audio_in/raw: Merged audio channel of the 4 microphones (alias for /audio_in/channel0).
/audio_in/channel0: Merged audio channel of the 4 microphones
/audio_in/channel1: Audio stream from the first microphone.
/audio_in/channel2: Audio stream from the second microphone.
/audio_in/channel3: Audio stream from the third microphone.
/audio_in/channel4: Audio stream from the fourth microphone.
/audio_in/channel5: Monitor audio stream from the audio input (used for self-echo cancellation).
/audio_in/voice_detected: Publishes a boolean indicating whether a voice is currently detected (i.e. whether someone is currently speaking).
/audio_in/speech: Raw audio data of detected speech (published once the person has finished speaking).
/audio_in/sound_direction: The estimated Direction of Arrival of the detected sound.
/audio_in/sound_localization: The estimated sound source location.
Audio can then be recorded as a rosbag, for example:
ssh pal@<robot>-0c
rosbag record -O audio_sample.bag /audio_in/channel0
Regardless of how the audio is captured, it can be processed later.
Recording audio directly from ALSA#
You can also access the ReSpeaker as a regular Linux ALSA recording device: log onto the robot and it should appear when running the command arecord -l (and arecord -L to get the list of ALSA device names).
To record, you first need to stop the ROS driver and find the correct device name in the arecord -L list. The example below records all 6 channels of the ReSpeaker at a 16 kHz sampling rate for 10 seconds.
> pal-stop respeaker_ros
> arecord -D "hw:CARD=ArrayUAC10,DEV=0" -fS16_LE -c6 -d 10 -r16000 > audio.wav
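As a quick sanity check on the result, the expected file size follows from the recording parameters: sample rate × channels × bytes per sample × duration, plus the WAV header. The sketch below is illustrative only; the helper name `expected_wav_size` is hypothetical, and the 44-byte header is an assumption (the size of a canonical PCM WAV header), so the actual file may differ by a few bytes.

```python
def expected_wav_size(rate_hz, channels, seconds,
                      bytes_per_sample=2, header_bytes=44):
    """Return the expected size in bytes of an uncompressed PCM WAV file.

    header_bytes=44 assumes a canonical PCM WAV header (an assumption,
    not something this documentation specifies).
    """
    return header_bytes + rate_hz * channels * bytes_per_sample * seconds


# 16 kHz, 6 channels, 10 s, S16_LE (2 bytes per sample)
print(expected_wav_size(16000, 6, 10))  # 1920044
```

A file much smaller than this (roughly 1.9 MB) suggests the recording was interrupted or that a different device or channel count was used.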
Play back the recorded sound:
> aplay audio.wav
You can then re-enable the ROS interface:
> pal-restart respeaker_ros
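The 6-channel file recorded above stores its samples interleaved, so processing a single microphone usually starts by splitting the channels. The following is a minimal sketch using only the Python standard library; the function names `split_channels` and `write_mono` are hypothetical, and it assumes 16-bit samples as produced by the -fS16_LE option. It also assumes that the ALSA channel order mirrors the ROS topic numbering (channel 0 merged, channels 1 to 4 individual microphones, channel 5 monitor), which should be verified on your robot.

```python
import array
import sys
import wave


def split_channels(path):
    """Read a 16-bit PCM WAV file.

    Returns (sample_rate, channels), where channels is a list holding
    one array of signed 16-bit samples per audio channel.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples (S16_LE)")
        n_channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Samples are interleaved: ch0, ch1, ..., chN-1, ch0, ch1, ...
    return rate, [samples[ch::n_channels] for ch in range(n_channels)]


def write_mono(path, rate, samples):
    """Write a sequence of signed 16-bit samples as a mono WAV file."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(data.tobytes())


# Example usage (assumes audio.wav from the arecord command above):
#   rate, channels = split_channels("audio.wav")
#   write_mono("channel0.wav", rate, channels[0])
```

The resulting mono file can be played back with aplay in the same way as the original recording.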