How-to: LLM chatbot

Overview

chatbot_ollama is an alternative chatbot engine to the default one, described in How-to: RASA chatbot. It is based on the Ollama framework, which provides a set of large language models (LLMs) that can be used for various tasks, including conversation, intent recognition and planning.

ROS interface

chatbot_ollama integrates with communication_hub through the /chatbot_ollama/start_dialogue and /chatbot_ollama/dialogue_interaction interfaces, as described in Chatbot interaction.

If a dialogue is configured for it, the semantic_state_aggregator will be used to ground the dialogue to the current context, including the robot's state and knowledge.

Additionally, the temperature parameter may be tuned to control the randomness of the responses.
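For example, the node's interfaces and parameters can be inspected and adjusted with the standard ros2 CLI. A minimal sketch, assuming the node runs as /chatbot_ollama (the node name and the temperature value are examples):

# List the interfaces exposed by the chatbot node
$ ros2 node info /chatbot_ollama

# Lower the temperature for more deterministic responses
$ ros2 param set /chatbot_ollama temperature 0.2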

Perhaps unintuitively, chatbot_ollama is NOT a localized node, since LLMs are generally multi-lingual and do not require a language-specific model to be loaded.

Dialogues support

Currently, the only supported role.name values are:

  • default: a general-purpose dialogue for chit-chatting. Its role.configuration supports only one role-specific item: prompt, whose value is a string containing the prompt to be used by the LLM.

  • ask: a specialized dialogue for asking questions and retrieving information. This is the role used by the /ask action and is already configured using that interface, as described in Communication skills.

One role.configuration item that is common to all roles is semantic_state_aggregator. If present, its value must be a serialized JSON object for a semantic_state_msgs/srv/ConfigureSemanticStateAggregator request. It is used to configure the semantic_state_aggregator for this dialogue, fetching semantic state updates any time a response is required from chatbot_ollama. See 🚧 How-to: Using the knowledge base with the chatbot for more information about the semantic state of the robot.
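To check which fields the serialized request must contain, you can inspect the service definition with the standard ros2 CLI:

# Show the request and response fields of the configuration service
$ ros2 interface show semantic_state_msgs/srv/ConfigureSemanticStateAggregator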

Web interface

The Web User Interface provides the status of the chatbot_ollama under Diagnostics > Communication > Chatbot > chatbot_ollama.

There you can check, among other things:

  • State: the current state of the bridge to the Ollama engine;

  • Current dialogues: the number of open dialogues.

Configure a remote server

Generally speaking, LLMs are more capable than the models running on the RASA engine, but they are also more resource-intensive. For this reason, the Ollama engine is currently not integrated on the robot; it is advised to run it on a separate machine instead. chatbot_ollama uses a REST API to communicate with the Ollama engine, so the engine can run on any machine that is accessible from the robot.

To run the Ollama engine on your remote machine of choice, you may (see the example commands after this list):

  1. install it;

  2. configure the machine as a server, allowing remote clients like the robot to connect to it;

  3. download and test the model you want to use.
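On a Linux machine, this could look like the following sketch (the model name is only an example, and the exact steps depend on your platform; refer to the Ollama documentation):

# 1. Install Ollama using the official Linux install script
$ curl -fsSL https://ollama.com/install.sh | sh

# 2. Serve on all network interfaces so that remote clients such as the robot can connect
#    (by default Ollama only listens on localhost, port 11434)
$ OLLAMA_HOST=0.0.0.0 ollama serve

# 3. Download and test the model you want to use (llama3 is an example)
$ ollama pull llama3
$ ollama run llama3 "Hello, who are you?"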

Once you have a remote Ollama server running with your preferred model, you need to configure the robot and chatbot_ollama to connect to it:

  1. test that the server is reachable from the robot (see the example commands after this list);

  2. set the server_url parameter to http://<remote_machine_IP>:11434 and the model_name parameter to the selected model (you may use the configuration files to set them persistently);

  3. using the Web User Interface, restart the chatbot_ollama module and check in its log that it successfully activates;

  4. follow the instructions to configure the communication_hub to use the chatbot_ollama.
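A minimal sketch of steps 1 and 2, assuming the node runs as /chatbot_ollama and the server uses the default Ollama port (replace <remote_machine_IP> and the model name with your own values):

# 1. The Ollama server replies with "Ollama is running" if it is reachable
$ curl http://<remote_machine_IP>:11434

# 2. Point chatbot_ollama to the remote server and the selected model
$ ros2 param set /chatbot_ollama server_url http://<remote_machine_IP>:11434
$ ros2 param set /chatbot_ollama model_name llama3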

Create or modify a dialogue template

chatbot_ollama offers a customizable, template-based dialogue support. Each dialogue role is associated with two Jinja2 templates:

  • a prompt template, which is used to generate the initial system prompt for the LLM;

  • a result schema template, which constrains the LLM responses to a machine-readable JSON schema, separating the uttered response, intents and other role-specific data.

chatbot_ollama expects the templates in the folder specified by the template_dir parameter. The parameter value can be an absolute path or a path relative to the package share folder.

This folder must be structured as follows:

templates/
├── <role_name_1>/
│   ├── prompt.txt.j2
│   └── result_schema.json.j2
├── <role_name_2>/
│   ├── prompt.txt.j2
│   └── result_schema.json.j2
└── ...
  • <role_name_N> must exactly match the role.name of the dialogue;

  • prompt.txt.j2 is the Jinja2 template for the system prompt for that role;

  • result_schema.json.j2 is the Jinja2 template for the result schema for that role.

It generally takes a combination of prompt engineering (including directives and examples) and result schema specification to achieve the desired results.

You can see the default template folder at:

$ tree $(ros2 pkg prefix chatbot_ollama)/share/chatbot_ollama/templates

Note

You may notice that there are a couple of additional files in the templates folder root:

  • base_prompt.txt.j2

  • base_result_schema.json.j2

These templates provide the basic structure of inputs and responses that chatbot_ollama expects when interfacing with any model. Both the default and ask roles extend them, and it is recommended to use them as a base for your custom templates.

Inspecting the default templates, you can see that they make use of Jinja variables, in particular configuration. This variable is the exact dictionary obtained by deserializing the role.configuration field of the dialogue goal.

This structure allows two different degrees of customization:

  1. A new role can be created, adding new templates specifically for this role, and at runtime the user can select one or the other depending on the situation. This is useful when the dialogue purpose or definition is significantly different from all the other ones and can be defined offline. The difference between default and ask roles is an example of this.

  2. A template can be adapted at runtime using the configuration variable. This mechanism can be used to adapt a template to the current context, based on information that is only available at runtime. The ask role is an example of this, where configuration.result_schema_properties (typically coming from the answers_schema of /ask) is used to define what information should be asked for, while most of the directives and examples are common regardless of the specific information to be retrieved.

The most configurable part of the templates is the results object of the result_schema. This object should be filled with the final result of the dialogue. As long as the results object is empty, the dialogue is not successfully closed by chatbot_ollama. When the results object is filled, the dialogue is closed and the result is returned to the client.

For example, the default role forces the object to always be empty, which is why the dialogue is never closed by chatbot_ollama. The ask role, on the other hand, uses configuration.result_schema_properties to specify the expected format of the results object (a dictionary with the requested information).

To modify the templates, whether changing an existing one or creating a new one, it is recommended to copy the entire folder to the robot's pal home, modify it there and set the template_dir parameter to point to it, as in the example below.
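A minimal sketch of this workflow, assuming the node runs as /chatbot_ollama and using an example destination path in the pal user's home (adjust both to your robot):

# Copy the default templates to a writable location on the robot
$ cp -r $(ros2 pkg prefix chatbot_ollama)/share/chatbot_ollama/templates /home/pal/chatbot_ollama_templates

# Point chatbot_ollama to the copied folder (use the configuration files to make this persistent)
$ ros2 param set /chatbot_ollama template_dir /home/pal/chatbot_ollama_templates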