Usage patterns for tool use (function calling)

The tool use feature of the Chat endpoint comes with a set of capabilities that enable developers to implement a variety of tool use scenarios. This section describes the different patterns of tool use implementation supported by these capabilities. Each pattern can be implemented on its own or in combination with the others.

Setup

First, import the Cohere library and create a client.

PYTHON
# ! pip install -U cohere
import cohere

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys

We’ll use the same search_docs tool as in the previous example.

PYTHON
def search_docs(query, top_k=3):
    # Implement any retrieval logic here (vector DB, keyword search, etc.)
    # Return a string or a list of objects. In Step 3, we'll wrap each object into a `document` content block.
    return [
        {
            "title": "Tool use (function calling) overview",
            "url": "https://docs.cohere.com/v2/docs/tool-use-overview",
            "text": "Tool use connects models to external tools like search engines and APIs.",
        },
        {
            "title": "Structured outputs",
            "url": "https://docs.cohere.com/docs/structured-outputs",
            "text": "Use JSON schema to define structured inputs/outputs for tools and responses.",
        },
        {
            "title": "Chat API reference (v2)",
            "url": "https://docs.cohere.com/reference/chat",
            "text": "Use the Chat endpoint to generate responses and optionally call tools.",
        },
    ][:top_k]


functions_map = {"search_docs": search_docs}

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_docs",
            "description": "Search documentation and return relevant snippets as documents.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query to look up in the docs.",
                    },
                    "top_k": {
                        "type": "integer",
                        "description": "How many documents to return.",
                    },
                },
                "required": ["query"],
            },
        },
    },
]

Parallel tool calling

The model can determine that more than one tool call is required and generate multiple tool calls in parallel. These can be calls to the same tool with different arguments, or calls to different tools, in any number.

In the example below, the user asks for documentation about tool use and structured outputs. This requires calling the search_docs tool twice, once per topic. This is reflected in the model’s response, where two parallel tool calls are generated.

PYTHON
messages = [
    {
        "role": "user",
        "content": "Find docs about tool use and structured outputs.",
    }
]

response = co.chat(
    model="command-a-03-2025", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(response.message)
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)

Example response:

I will search the docs for tool use and structured outputs.

[
    ToolCallV2(
        id="search_docs_9b0nr4kg58a8",
        type="function",
        function=ToolCallV2Function(
            name="search_docs", arguments='{"query":"tool use","top_k":3}'
        ),
    ),
    ToolCallV2(
        id="search_docs_0qq0mz9gwnqr",
        type="function",
        function=ToolCallV2Function(
            name="search_docs", arguments='{"query":"structured outputs","top_k":3}'
        ),
    ),
]

State management

When tools are called in parallel, we append to the messages list a single assistant message containing all the tool calls, followed by one tool message for each tool call.

PYTHON
import json

if response.message.tool_calls:
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        for data in tool_result:
            # Optional: the "document" object can take an "id" field for use in citations, otherwise auto-generated
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )

The sequence of messages is represented in the diagram below.

Directly answering

A key attribute of tool use systems is the model’s ability to choose the right tools for a task. This includes the model’s ability to decide to not use any tool, and instead, respond to a user message directly.

In the example below, the user asks a simple arithmetic question. The model determines that it does not need to use any of the available tools (only one, search_docs, in this case), and instead, directly answers the user.

PYTHON
messages = [{"role": "user", "content": "What's 2+2?"}]

response = co.chat(
    model="command-a-03-2025", messages=messages, tools=tools
)

if response.message.tool_calls:
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)
else:
    print(response.message.content[0].text)

Example response:

The answer to 2+2 is 4.

State management

When the model opts to respond directly to the user, there will be no items 2 and 3 above (the tool calling and tool response messages). Instead, the final assistant message will contain the model’s direct response to the user.
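The bookkeeping in this case reduces to appending the assistant's text to the conversation history. A minimal sketch, where the hard-coded string stands in for `response.message.content[0].text`:

```python
# Sketch: state management when the model answers directly.
# The string below is a stand-in for response.message.content[0].text.
messages = [{"role": "user", "content": "What's 2+2?"}]

direct_answer = "The answer to 2+2 is 4."

# No tool call or tool result messages are appended; the conversation
# history holds just the user message and the assistant's reply.
messages.append({"role": "assistant", "content": direct_answer})
```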

Note: you can force the model to answer directly every time using the tool_choice parameter, described in the Forcing tool usage section below.

Multi-step tool use

The Chat endpoint supports multi-step tool use, which enables the model to perform sequential reasoning. This is especially useful in agentic workflows that require multiple steps to complete a task.

As an example, suppose a tool use application has access to a web search tool. Given the question “What was the revenue of the most valuable company in the US in 2023?”, it will need to perform a series of steps in a specific order:

  • First, identify the most valuable company in the US in 2023
  • Then, once the company has been identified, retrieve its revenue figure

To illustrate this, we’ll use our search_docs tool and ask a question that usually requires multiple searches (first for tool use basics, then for tool_choice).

Here’s the function definition for the tool:

PYTHON
def search_docs(query, top_k=3):
    # Implement any retrieval logic here (vector DB, keyword search, etc.)
    return [
        {
            "title": "Tool use (function calling) overview",
            "url": "https://docs.cohere.com/v2/docs/tool-use-overview",
            "text": "Tool use connects models to external tools like search engines and APIs.",
        },
        {
            "title": "Usage patterns for tool use",
            "url": "https://docs.cohere.com/v2/docs/tool-use-usage-patterns",
            "text": "Common patterns include parallel tool calling, multi-step tool use, and more.",
        },
        {
            "title": "Structured outputs",
            "url": "https://docs.cohere.com/docs/structured-outputs",
            "text": "Use JSON schema to define structured inputs/outputs for tools and responses.",
        },
    ][:top_k]


functions_map = {"search_docs": search_docs}

And here is the corresponding tool schema:

PYTHON
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_docs",
            "description": "Search documentation and return relevant snippets as documents.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query to look up in the docs.",
                    },
                    "top_k": {
                        "type": "integer",
                        "description": "How many documents to return.",
                    },
                },
                "required": ["query"],
            },
        },
    },
]

Next, we implement the four-step tool use workflow as described in the previous page.

The key difference here is that the second (tool calling) and third (tool execution) steps are placed in a while loop, so this pair of steps can repeat any number of times. The loop stops when the model decides, in the tool calling step, that no more tool calls are needed, which then triggers the fourth step (response generation).

In this example, the user asks for an explanation of tool use and how to force tool usage, with citations.

PYTHON
import json

# Step 1: Get the user message
messages = [
    {
        "role": "user",
        "content": "Explain how tool use works and how to force tool usage. Please cite your sources.",
    }
]

# Step 2: Generate tool calls (if any)
model = "command-a-03-2025"
response = co.chat(
    model=model, messages=messages, tools=tools, temperature=0.3
)

while response.message.tool_calls:
    print("TOOL PLAN:")
    print(response.message.tool_plan, "\n")
    print("TOOL CALLS:")
    for tc in response.message.tool_calls:
        print(
            f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}"
        )
    print("=" * 50)

    messages.append(response.message)

    # Step 3: Get tool results
    print("TOOL RESULT:")
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        print(tool_result)
        for data in tool_result:
            # Optional: the "document" object can take an "id" field for use in citations, otherwise auto-generated
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )

    # Step 4: Generate response and citations
    response = co.chat(
        model=model,
        messages=messages,
        tools=tools,
        temperature=0.1,
    )

messages.append(
    {
        "role": "assistant",
        "content": response.message.content[0].text,
    }
)

# Print final response
print("RESPONSE:")
print(response.message.content[0].text)
print("=" * 50)

# Print citations (if any)
verbose_source = True  # Set to False to hide the contents of each source
if response.message.citations:
    print("CITATIONS:\n")
    for citation in response.message.citations:
        print(
            f"Start: {citation.start}| End:{citation.end}| Text:'{citation.text}' "
        )
        print("Sources:")
        for idx, source in enumerate(citation.sources):
            print(f"{idx+1}. {source.id}")
            if verbose_source:
                print(f"{source.tool_output}")
        print("\n")

The model may decide it needs to look up documentation about tool use first, then do a second search for specifics about forcing tool usage via tool_choice.

This is reflected in the model’s response, where multiple tool calling-result pairs can be generated in a sequence.

Example response:

TOOL PLAN:
First, I will search the docs for how tool use works. Then, I will search for how to force tool usage (tool_choice).

TOOL CALLS:
Tool name: search_docs | Parameters: {"query":"tool use","top_k":3}
==================================================
TOOL RESULT:
[{'title': 'Tool use (function calling) overview', 'url': 'https://docs.cohere.com/v2/docs/tool-use-overview', 'text': 'Tool use connects models to external tools like search engines and APIs.'}]
TOOL PLAN:
Now I'll search for how to force tool usage via the tool_choice parameter.

TOOL CALLS:
Tool name: search_docs | Parameters: {"query":"tool_choice REQUIRED NONE","top_k":3}
==================================================
TOOL RESULT:
[{'title': 'Usage patterns for tool use', 'url': 'https://docs.cohere.com/v2/docs/tool-use-usage-patterns', 'text': 'Common patterns include parallel tool calling, multi-step tool use, and more.'}]
RESPONSE:
Tool use lets models call external tools (like doc search) and then answer using tool results with citations. You can force tool usage with tool_choice="REQUIRED" or force a direct response with tool_choice="NONE".
==================================================
CITATIONS:

Start: 126| End:135| Text:'tool_choice'
Sources:
1. search_docs_p0dage9q1nv4:0
{'title': 'Usage patterns for tool use', 'url': 'https://docs.cohere.com/v2/docs/tool-use-usage-patterns', 'text': 'Common patterns include parallel tool calling, multi-step tool use, and more.'}

State management

In a multi-step tool use scenario, instead of a single occurrence of the assistant-tool message pair, there will be a sequence of assistant-tool message pairs, reflecting the multiple steps of tool calling involved.
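As a rough sketch, the conversation state after a two-step run has the following shape (all IDs and contents here are hypothetical placeholders, with most fields omitted):

```python
# Shape of the messages list after two tool calling-result steps.
# IDs and contents are hypothetical; real messages carry more fields.
messages = [
    {"role": "user", "content": "Explain how tool use works and how to force tool usage."},
    # First tool calling-result pair
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": []},
    # Second tool calling-result pair
    {"role": "assistant", "tool_calls": [{"id": "call_2"}]},
    {"role": "tool", "tool_call_id": "call_2", "content": []},
    # Final response, generated once no more tool calls are needed
    {"role": "assistant", "content": "Tool use works by ..."},
]

roles = [m["role"] for m in messages]
# roles → ['user', 'assistant', 'tool', 'assistant', 'tool', 'assistant']
```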

Forcing tool usage

This feature is only compatible with the Command R7B and newer models.

As shown in the previous examples, during the tool calling step, the model may decide to either:

  • make tool call(s)
  • or, respond to a user message directly.

You can, however, force the model to choose one of these options. This is done via the tool_choice parameter.

  • You can force the model to make tool call(s), i.e. to not respond directly, by setting the tool_choice parameter to REQUIRED.
  • Alternatively, you can force the model to respond directly, i.e. to not make tool call(s), by setting the tool_choice parameter to NONE.

By default, if you don’t specify the tool_choice parameter, then it is up to the model to decide whether to make tool calls or respond directly.

PYTHON
response = co.chat(
    model="command-a-03-2025",
    messages=messages,
    tools=tools,
    tool_choice="REQUIRED",  # optional, to force tool calls
    # tool_choice="NONE"  # optional, to force a direct response
)

State management

Here’s the sequence of messages when tool_choice is set to REQUIRED.

Here’s the sequence of messages when tool_choice is set to NONE.
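The two sequences can be summarized at the role level as follows (a sketch; message contents omitted):

```python
# Role-level sketch of the message sequences under each tool_choice setting.

# tool_choice="REQUIRED": the assistant message always carries tool calls,
# followed by tool results and a final assistant response.
required_sequence = ["user", "assistant (tool_calls)", "tool", "assistant"]

# tool_choice="NONE": the model responds directly, so no tool call or
# tool result messages appear.
none_sequence = ["user", "assistant"]
```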

Chatbots (multi-turn)

Building chatbots requires maintaining the memory or state of a conversation over multiple turns. To do this, we can keep appending each turn of a conversation to the messages list.

As an example, here’s the messages list from the first turn of a conversation.

PYTHON
from cohere import ToolCallV2, ToolCallV2Function

messages = [
    {
        "role": "user",
        "content": "How does tool use work in Cohere? Please cite your sources.",
    },
    {
        "role": "assistant",
        "tool_plan": "I will search the docs for how tool use works in Cohere.",
        "tool_calls": [
            ToolCallV2(
                id="search_docs_1byjy32y4hvq",
                type="function",
                function=ToolCallV2Function(
                    name="search_docs",
                    arguments='{"query":"tool use Cohere","top_k":3}',
                ),
            )
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "search_docs_1byjy32y4hvq",
        "content": [
            {
                "type": "document",
                "document": {
                    "data": '{"title":"Tool use (function calling) overview","url":"https://docs.cohere.com/v2/docs/tool-use-overview","text":"Tool use connects models to external tools like search engines and APIs."}'
                },
            }
        ],
    },
    {
        "role": "assistant",
        "content": "Tool use lets models call external tools (like doc search) and then answer using tool results with citations.",
    },
]

Then, in the second turn, when provided with a rather vague follow-up user message, the model correctly infers that the context is still about tool use, and searches for information about tool_choice.

PYTHON
messages.append(
    {"role": "user", "content": "How do I force tool usage?"}
)

response = co.chat(
    model="command-a-03-2025", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(response.message)
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)

Example response:

I will search the docs for how to force tool usage using tool_choice.

[ToolCallV2(id='search_docs_8hwpm7d4wr14', type='function', function=ToolCallV2Function(name='search_docs', arguments='{"query":"tool_choice REQUIRED NONE","top_k":3}'))]

State management

The sequence of messages is represented in the diagram below.