How-to: LLM chatbot
Overview
chatbot_ollama is an alternative chatbot engine to the default one described in How-to: RASA chatbot. It is based on the Ollama framework, which provides a set of large language models (LLMs) that can be used for various tasks, including conversation, intent recognition and planning.
ROS interface
chatbot_ollama integrates with communication_hub through the
/chatbot_ollama/start_dialogue
and /chatbot_ollama/dialogue_interaction
interfaces, as described in Chatbot interaction.
If a dialogue is configured for it, the semantic_state_aggregator will be used to ground the dialogue in the current context, including the robot's state and knowledge.
Additionally, the parameter temperature
may be tuned to control the randomness of the responses.
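For example, assuming the parameter is exposed on a node named /chatbot_ollama (the node name and the value below are illustrative assumptions, not prescriptions from this package):

# lower values make responses more deterministic, higher values more varied
$ ros2 param set /chatbot_ollama temperature 0.4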
Perhaps unintuitively, chatbot_ollama is NOT a localized node, since generally LLMs are multi-lingual and do not require a specific language model to be loaded.
Dialogues support
Currently, the only supported role.name values are:

* default: a general-purpose dialogue for chit-chatting. Its role.configuration supports only one role-specific item, prompt, whose value is a string containing the prompt to be used by the LLM.
* ask: a specialized dialogue for asking questions and retrieving information. This is the role used by the /ask action and is already configured through that interface, as described in Communication skills.
One role.configuration
item that is common to all roles is the semantic_state_aggregator
one.
If present, its value must be a serialized JSON object for a
semantic_state_msgs/srv/ConfigureSemanticStateAggregator
request.
It is used to configure the semantic_state_aggregator for this dialogue,
fetching semantic state updates any time a response is required from chatbot_ollama.
See How-to: Using the knowledge base with the chatbot for more information about the semantic state of the robot.
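Purely as an illustration, a role.configuration combining a prompt with the semantic_state_aggregator item could look roughly like the fragment below; the prompt text is invented and the semantic_state_aggregator value is only a placeholder for a real serialized ConfigureSemanticStateAggregator request:

{
  "prompt": "You are a friendly service robot. Keep your answers short and polite.",
  "semantic_state_aggregator": "<serialized ConfigureSemanticStateAggregator request>"
}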
Web interface
The Web User Interface provides the status of chatbot_ollama under Diagnostics > Communication > Chatbot > chatbot_ollama.
There you can check, among other things:

* State: the current state of the bridge to the Ollama engine;
* Current dialogues: the number of open dialogues.
Configure a remote server
Generally speaking, LLMs are more capable than the models running on the RASA engine, but they are also more resource-intensive. For this reason, the Ollama engine is currently not integrated on the robot; it is advised to run it on a separate machine instead. chatbot_ollama uses a REST API to communicate with the Ollama engine, so the engine can run on any machine that is reachable from the robot.
To run the Ollama engine on your remote machine of choice, you may:

* configure the machine as a server, allowing remote clients like the robot to connect to it;
* download and test the model you want to use (a minimal sketch follows this list).
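On the remote machine, a minimal setup could look like the following sketch (llama3 is only an example model name; refer to the Ollama documentation for the server configuration that best suits your network):

# make the server reachable from other machines (by default it only listens on localhost)
$ OLLAMA_HOST=0.0.0.0 ollama serve
# in another shell: download and try out the model you intend to use
$ ollama pull llama3
$ ollama run llama3 "Introduce yourself in one sentence."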
Once you have a remote Ollama server running with your preferred model, you need to configure the robot and chatbot_ollama to connect to it:
* test that the server is reachable from the robot (a sketch is shown after this list);
* set the server_url parameter to http://<remote_machine_IP>:11434 and the model_name parameter to the selected model (you may use the configuration files to set them persistently);
* using the Web User Interface, restart the chatbot_ollama module and check in its log that it successfully activates;
* follow the instructions to configure the communication_hub to use chatbot_ollama.
Create or modify a dialogue template
chatbot_ollama offers customizable, template-based dialogue support.
Each dialogue role is associated with two Jinja2 templates:

* a prompt template, which is used to generate the initial system prompt for the LLM;
* a result schema template, which constrains the LLM responses to a machine-readable JSON schema, separating the uttered response, intents and other role-specific data.
chatbot_ollama expects the templates in the folder specified by the template_dir
parameter.
The parameter value can be an absolute path or a relative one, with the package shared folder as base.
This folder must be structured as follows:
templates/
├── <role_name_1>/
│   ├── prompt.txt.j2
│   └── result_schema.json.j2
├── <role_name_2>/
│   ├── prompt.txt.j2
│   └── result_schema.json.j2
└── ...
* <role_name_N> must exactly match the role.name of the dialogue;
* prompt.txt.j2 is the Jinja2 template for the system prompt for that role;
* result_schema.json.j2 is the Jinja2 template for the result schema for that role.
It generally takes a combination of prompt engineering (including directives and examples) and result schema specification to achieve the desired results.
You can see the default template folder at:
$ tree $(ros2 pkg prefix chatbot_ollama)/share/chatbot_ollama/templates
Note
You may notice that there are a couple of additional files in the templates folder root:
base_prompt.txt.j2
base_result_schema.json.j2
These are templates that provide the basic structure of inputs and responses
that chatbot_ollama expects in interfacing with any model.
Both default
and ask
roles extend those templates,
and it is recommended to use them as a base for your custom templates.
Inspecting the default templates, you can see that they make use of Jinja variables, in particular the configuration variable.
This variable is the exact dictionary obtained from deserializing the
role.configuration
field of the dialogue goal.
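As a purely illustrative example, a fragment of a hypothetical prompt.txt.j2 reading this variable could look as follows (the real templates extend base_prompt.txt.j2 and are more elaborate):

You are a helpful robot assistant.
{% if configuration.prompt %}
{{ configuration.prompt }}
{% endif %}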
This structure allows two different degrees of customization:

1. A new role can be created, adding new templates specifically for this role, and at runtime the user can select one or the other depending on the situation. This is useful when the dialogue purpose or definition is significantly different from all the other ones and can be defined offline. The difference between the default and ask roles is an example of this.
2. A template can be adapted at runtime using the configuration variable. This mechanism can be used to adapt a template to the current context, based on information that is only available at runtime. The ask role is an example of this: configuration.result_schema_properties (typically coming from the answers_schema of /ask) is used to define what information should be asked for, while most of the directives and examples stay the same regardless of the specific information to be retrieved.
The most configurable part of the templates is the results object of the result_schema.
This object should be filled with the final result of the dialogue.
As long as the results
object is empty, the dialogue is not successfully closed by chatbot_ollama.
When the results
object is filled, the dialogue is closed and the result is returned to the client.
For example, the default
role forces the object to always be empty,
which is why the dialogue is never closed by chatbot_ollama.
The ask role, on the other hand, uses configuration.result_schema_properties to specify the expected format of the results object (a dictionary with the information requested).
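For instance, a hypothetical result_schema_properties for an ask dialogue that collects a drink order might look like the fragment below (the field names and types are invented for illustration; in practice this value usually comes from the answers_schema of /ask):

{
  "drink": { "type": "string", "description": "The drink requested by the user" },
  "size": { "type": "string", "enum": ["small", "medium", "large"] }
}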
To modify the templates, whether editing an existing one or creating a new one, it is recommended to copy the entire folder to the robot's pal home, modify it there and set the template_dir parameter to point to it.
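In practice, that could look like the following sketch (the destination path and the node name are assumptions; adapt them to your robot and use the configuration files to make the parameter persistent):

# copy the default templates to a writable location in the pal home
$ cp -r $(ros2 pkg prefix chatbot_ollama)/share/chatbot_ollama/templates /home/pal/chatbot_templates
# point chatbot_ollama at the copy
$ ros2 param set /chatbot_ollama template_dir /home/pal/chatbot_templates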