ASR, TTS and dialogue management APIs#
ASR API#
The vosk_asr node processes the /audio/channel0 input from the ReSpeaker microphone. It runs entirely on the CPU (GPU acceleration is not currently available).
Once the language is selected, the node starts processing the audio. The recognized text is published on the /humans/voices/*/speech topic corresponding to the current voice ID.
Warning
As of pal-sdk-23.12, automatic voice separation and identification are not
available. Therefore, all detected speech is published on the topic
/humans/voices/anonymous_speaker/speech.
The available ROS interfaces to process speech are:
ASR ROS actions#
- /asr/set_locale ROS action (type - i18n_msgs/SetLocaleAction): change the ASR language
ASR published topics#
- /humans/voices/*/speech ROS topic (type - hri_msgs/LiveSpeech): publishes the incremental and final text recognized
- /humans/voices/*/is_speaking ROS topic (type - std_msgs/Bool): publishes a boolean indicating whether a person is speaking or not
- /humans/voices/*/audio ROS topic (type - audio_common_msgs/AudioData): republishes the /audio/channel0 processed audio topic coming from the ReSpeaker array
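As an illustration of consuming the speech topic, the following is a minimal sketch assuming a ROS 1 (rospy) environment and assuming that hri_msgs/LiveSpeech exposes two string fields named incremental and final. The node name and the helper names are hypothetical. The ROS imports are placed inside run_listener() so that the text-selection helper can be tried without a robot.

```python
SPEECH_TOPIC = "/humans/voices/anonymous_speaker/speech"


def extract_utterance(msg):
    """Return ("final", text) or ("incremental", text), or None if both are empty."""
    # Assumption: LiveSpeech carries `final` and `incremental` string fields.
    final = getattr(msg, "final", "").strip()
    if final:
        return ("final", final)
    incremental = getattr(msg, "incremental", "").strip()
    if incremental:
        return ("incremental", incremental)
    return None


def run_listener():
    """Subscribe to the speech topic and log what is heard (requires ROS)."""
    import rospy  # imported here so the helper above works without ROS
    from hri_msgs.msg import LiveSpeech

    def on_speech(msg):
        utterance = extract_utterance(msg)
        if utterance is not None:
            kind, text = utterance
            rospy.loginfo("[%s] %s", kind, text)

    rospy.init_node("speech_listener_example")  # hypothetical node name
    rospy.Subscriber(SPEECH_TOPIC, LiveSpeech, on_speech)
    rospy.spin()
```

Calling run_listener() on the robot would log each partial and final result; acting only on "final" results avoids reacting to text that the recognizer may still revise.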
Wake-up word#
Wake-up word ROS services#
- /wakeup_monitor/enable (type - soft_wakeup_word/Enable): enable/disable the monitoring
- /wakeup_monitor/set_wakeup_pattern (type - soft_wakeup_word/SetWakeupPattern): set a custom ‘wakeup’ pattern (C++ regular expression)
- /wakeup_monitor/get_wakeup_pattern (type - soft_wakeup_word/GetWakeupPattern): get the current wake-up pattern
- /wakeup_monitor/set_sleep_pattern (type - soft_wakeup_word/SetSleepPattern): set a custom ‘sleep’ pattern (C++ regular expression)
- /wakeup_monitor/get_sleep_pattern (type - soft_wakeup_word/GetSleepPattern): get the current ‘sleep’ pattern
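Before setting a custom pattern through the services above, it can help to try it against a few sample utterances. The sketch below uses Python's re module as a stand-in: for simple patterns its syntax is close to the ECMAScript grammar that C++ std::regex uses by default, but the two are not identical. The example patterns are hypothetical, and the search-anywhere, case-insensitive matching is an assumption, not a description of how the wake-up monitor applies its patterns.

```python
import re


def matches_pattern(pattern, utterance):
    """Return True if `pattern` occurs anywhere in `utterance`, ignoring case."""
    # Assumption: search-anywhere, case-insensitive matching; the monitor may differ.
    return re.search(pattern, utterance, flags=re.IGNORECASE) is not None


# Hypothetical patterns, for illustration only.
WAKEUP_PATTERN = r"\b(?:hi|hello|hey) robot\b"
SLEEP_PATTERN = r"\b(?:goodbye|stop listening)\b"

SAMPLES = ["Hey robot, can you help me?", "this robot is tall", "OK, goodbye!"]

for sample in SAMPLES:
    print(sample,
          "| wakeup:", matches_pattern(WAKEUP_PATTERN, sample),
          "| sleep:", matches_pattern(SLEEP_PATTERN, sample))
```

The word boundaries (\b) keep the wake-up pattern from firing on "this robot", where "hi" only appears inside another word.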
Wake-up word published topics#
- /active_listening (type - std_msgs/Bool): whether or not the robot is ‘awake’ and should actively process incoming speech. In particular, the dialogue manager (chatbot) uses this topic to decide whether to process incoming speech.
  - Note that you can publish true or false on this topic to manually activate or deactivate the processing of incoming speech by the chatbot.
Chatbot/Dialogue management#
The chatbot engine comes with the following set of ROS interfaces:
Chatbot ROS actions#
- /chatbot/set_locale (type - i18n_msgs/SetLocaleAction): changes the current language of the RASA chatbot engine.
Warning
Switching RASA language takes about 30 seconds, depending on the size of the model.
Chatbot ROS services#
- /active_chatbot (type - chatbot_msgs/GetActiveChatbot): returns the chatbot that is currently running on the robot.
Topics subscribed to by the chatbot#
- /humans/voices/*/speech: the main input of the chatbot is the text recognised from people speaking around the robot, published on this family of topics.
  - You can manually publish text on one of these topics to test the chatbot behaviour.
- /active_listening: the chatbot only processes text input if the last value published on /active_listening is true. Otherwise, the text input is ignored.
  - You can use this topic to temporarily disable the chatbot (for instance, if you want to process the ASR speech yourself).
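If you process the ASR output yourself, the same gating rule can be mirrored in your own node. The following is a minimal sketch assuming a ROS 1 (rospy) environment; ListeningGate, the node name, and the choice to treat "no value received yet" as inactive are assumptions made for illustration, and the final field of LiveSpeech is assumed as well.

```python
class ListeningGate:
    """Remember the last /active_listening value and gate text accordingly."""

    def __init__(self):
        # Assumption: treat "nothing received yet" as not listening.
        self.active = False

    def on_active_listening(self, msg):
        # std_msgs/Bool stores its value in the `data` field.
        self.active = bool(msg.data)

    def filter(self, text):
        """Return the text if listening is active, otherwise None."""
        return text if self.active else None


def run_gated_listener():
    """Wire the gate to the ROS topics (requires ROS)."""
    import rospy  # imported here so ListeningGate can be used without ROS
    from std_msgs.msg import Bool
    from hri_msgs.msg import LiveSpeech

    gate = ListeningGate()

    def on_speech(msg):
        text = gate.filter(msg.final)  # assumption: `final` string field
        if text:
            rospy.loginfo("processing: %s", text)

    rospy.init_node("gated_speech_example")  # hypothetical node name
    rospy.Subscriber("/active_listening", Bool, gate.on_active_listening)
    rospy.Subscriber("/humans/voices/anonymous_speaker/speech", LiveSpeech, on_speech)
    rospy.spin()
```

Keeping the gate as a plain class makes the rule easy to check in isolation, for example by feeding it stand-in messages before connecting it to the robot.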
Topics published by the chatbot#
- /intents (type - hri_actions_msgs/Intent): the chatbot publishes the recognised ROS intents on this topic (for instance, BRING something somewhere). See Intents for details on intents.
Note
Some interactions do not lead to intents being published, as they are fully handled within the chatbot engine. For instance, if you say ‘Hi’ to the robot, it will directly reply with a greeting, without emitting a dedicated intent.
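To react to the intents published on /intents, a subscriber along the following lines could be used. This is a sketch only: it assumes a ROS 1 (rospy) environment, and it assumes that hri_actions_msgs/Intent carries a string field intent and a JSON-encoded string field data holding the intent parameters. The helper and node names are hypothetical.

```python
import json


def describe_intent(msg):
    """Return (intent_name, parameters) from an Intent-like message."""
    # Assumption: `data` is a JSON-encoded string, possibly empty.
    raw = getattr(msg, "data", "") or "{}"
    try:
        params = json.loads(raw)
    except ValueError:
        # Keep the undecodable payload instead of dropping it.
        params = {"raw": raw}
    return msg.intent, params


def run_intent_listener():
    """Log every intent published by the chatbot (requires ROS)."""
    import rospy  # imported here so describe_intent works without ROS
    from hri_actions_msgs.msg import Intent

    def on_intent(msg):
        name, params = describe_intent(msg)
        rospy.loginfo("intent %s with parameters %s", name, params)

    rospy.init_node("intent_listener_example")  # hypothetical node name
    rospy.Subscriber("/intents", Intent, on_intent)
    rospy.spin()
```

Decoding the parameters in a separate helper keeps malformed payloads from raising inside the subscriber callback.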
Text-to-speech (TTS)#
TTS ROS actions#
- /tts (documentation)