LLM Tools#

An agent in the context of Large Language Models (LLMs) is a loop where a model:

  1. Reads user input + context

  2. Decides whether to call tools or not

  3. Uses tool results to reason

  4. Replies or takes another action

OpenAI’s platform supports this with:

  • Responses API + function tools (a.k.a. function calling) to let the model call your code,

  • Embeddings for retrieval/memory, and

  • An Agents SDK that wraps common agentic patterns (tools, state, tracing) to speed development.
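The four-step loop described above can be sketched without any API calls at all, using a stand-in model function. Everything here is hypothetical scaffolding (not the OpenAI SDK) just to show the control flow:

```python
def run_agent(model, tools, user_input, max_turns=5):
    """Minimal agent loop: ask the model, run any requested tool, repeat until it answers."""
    context = [{"role": "user", "content": user_input}]    # 1. read input + context
    for _ in range(max_turns):
        reply = model(context)                             # 2. model decides: tool call or answer
        if reply.get("tool") is None:
            return reply["text"]                           # 4. final reply
        result = tools[reply["tool"]](**reply["args"])     # 3. run the tool, feed result back
        context.append({"role": "tool", "content": result})
    return "(max turns reached)"
```

Real agent frameworks add schemas, tracing, and error handling on top, but the skeleton is the same loop.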

Function Calling#

Tools in the context of LLMs are custom interventions in the default chat loop. For example, you might want to get real-time information from the web, access a database, or perform custom calculations in response to user queries.

In other words, any time you want the model to deviate from its default behavior of generating text based on its training data, you can use tools.



Let’s say the user wanted to know the pre-requisite for CSC-122. Now, the general-purpose model may not know this information, or it may be out of date. Instead, you can define a function that retrieves this information from some source (e.g., a database, an API, or a hardcoded dictionary).

# Create a running input list we will add to over time
input_list = [
    {
        "role":    "user", 
        "content": "What is the pre-requisite for CSC-122?"
    }
]

Then you can define a tool that intervenes in the default chat loop by calling a function.

  1. Define a list of callable tools for the model

    # Function logic for get_prerequisite
    def get_prerequisite(args):
        course = args['course']
        prereqs = {"CSC-121": "CSC-105",
                   "CSC-122": "CSC-121",
                   "CSC-223": "CSC-122"}
        return f"The pre-req for {course} is {prereqs[course]}"
    
    # 1. Define a list of callable tools for the model
    tools = [
    {
        "type": "function",
        "name": "get_prerequisite",
        "description": "Get pre-requisite of Computer Science courses at Furman.",
        "parameters": {
            "type": "object",
            "properties": {
                "course": {
                    "type": "string",
                    "description": "Course code in format CSC-XXX",
                },
            },
            "required": ["course"],
        },
    }]
    
  2. Prompt the model with tools defined

    response = client.responses.create(
        model="gpt-5",
        tools=tools,
        input=input_list,
    )
    
  3. Check if the model wants to call a tool. Call it if so, and add the result to the input list.

    import json  # needed below to parse the tool call's JSON arguments

    # Save function call outputs for subsequent requests
    input_list += response.output
    
    for item in response.output:
        if item.type == "function_call":
            if item.name == "get_prerequisite":
                # 3. Execute the function logic for get_prerequisite
                prereq = get_prerequisite(json.loads(item.arguments))
                
                # 4. Provide function call results to the model
                input_list.append({
                    "type": "function_call_output",
                    "call_id": item.call_id,
                    "output": json.dumps({
                        "prereq": prereq
                    })
                })
    
  4. Get a response from the model, now with the function call result in context

    # print("Final input:")
    # print(input_list)
    
    response = client.responses.create(
      model="gpt-5",
      # instructions="Respond only with pre-reqs generated by a tool.",
      tools=tools,
      input=input_list,
    )
    
    # 5. The model should be able to give a response!
    print("Final output:")
    # print(response.model_dump_json(indent=2))
    print("\n" + response.output_text)
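
With more than one tool, the if/elif dispatch in step 3 is usually replaced by a table mapping tool names to Python functions. A self-contained sketch (the handler repeats the get_prerequisite logic above so it runs on its own):

```python
import json

# Same logic as the get_prerequisite example above, repeated so this sketch is self-contained
def get_prerequisite(args):
    prereqs = {"CSC-121": "CSC-105", "CSC-122": "CSC-121", "CSC-223": "CSC-122"}
    return f"The pre-req for {args['course']} is {prereqs[args['course']]}"

# Name -> handler table; add an entry per tool instead of another elif branch
TOOL_HANDLERS = {"get_prerequisite": get_prerequisite}

def handle_tool_calls(response_output, input_list):
    """Run every function_call item and append its output for the next model request."""
    for item in response_output:
        if item.type == "function_call" and item.name in TOOL_HANDLERS:
            result = TOOL_HANDLERS[item.name](json.loads(item.arguments))
            input_list.append({
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps({"result": result}),
            })
    return input_list
```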
    

File Search Tool#

File search allows models to search your files for relevant information before generating a response.

File search is a tool available in the Responses API. It enables models to retrieve information in a knowledge base of previously uploaded files through semantic and keyword search. By creating vector stores and uploading files to them, you can augment the models’ inherent knowledge by giving them access to these knowledge bases or vector_stores.

To learn more about how vector stores and semantic search work, refer to our retrieval guide.

This is a hosted tool managed by OpenAI, meaning you don’t have to implement code on your end to handle its execution. When the model decides to use it, it will automatically call the tool, retrieve information from your files, and return an output.

Prior to using file search with the Responses API, you need to have set up a knowledge base in a vector store and uploaded files to it.

  1. Upload a file with the Files API.

import requests
from io import BytesIO
from openai import OpenAI

client = OpenAI()

def create_file(client, file_path):
    if file_path.startswith("http://") or file_path.startswith("https://"):
        # Download the file content from the URL
        response = requests.get(file_path)
        file_content = BytesIO(response.content)
        file_name = file_path.split("/")[-1]
        file_tuple = (file_name, file_content)
        result = client.files.create(
            file=file_tuple,
            purpose="assistants"
        )
    else:
        # Handle local file path
        with open(file_path, "rb") as file_content:
            result = client.files.create(
                file=file_content,
                purpose="assistants"
            )
    print(result.id)
    return result.id

# Replace with your own file path or URL
file_id = create_file(client, "https://cdn.openai.com/API/docs/deep_research_blog.pdf")
  2. Create a vector store.

vector_store = client.vector_stores.create(
    name="knowledge_base"
)
print(vector_store.id)
  3. Add files to the vector store.

result = client.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=file_id
)
print(result)
  4. Check the status of the file upload.

result = client.vector_stores.files.list(
    vector_store_id=vector_store.id
)
print(result)
  5. Once your knowledge base is set up, you can include the file_search tool in the list of tools available to the model, along with the list of vector stores in which to search.

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["<vector_store_id>"]
    }]
)
print(response)

When this tool is called by the model, you will receive a response with multiple outputs:

  1. A file_search_call output item, which contains the id of the file search call.

  2. A message output item, which contains the response from the model, along with the file citations.

{
  "output": [
    {
      "type": "file_search_call",
      "id": "fs_67c09ccea8c48191ade9367e3ba71515",
      "status": "completed",
      "queries": ["What is deep research?"],
      "search_results": null
    },
    {
      "id": "msg_67c09cd3091c819185af2be5d13d87de",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Deep research is a sophisticated capability that allows for extensive inquiry and synthesis of information across various domains. It is designed to conduct multi-step research tasks, gather data from multiple online sources, and provide comprehensive reports similar to what a research analyst would produce. This functionality is particularly useful in fields requiring detailed and accurate information...",
          "annotations": [
            {
              "type": "file_citation",
              "index": 992,
              "file_id": "file-2dtbBZdjtDKS8eqWxqbgDi",
              "filename": "deep_research_blog.pdf"
            },
            {
              "type": "file_citation",
              "index": 992,
              "file_id": "file-2dtbBZdjtDKS8eqWxqbgDi",
              "filename": "deep_research_blog.pdf"
            },
            {
              "type": "file_citation",
              "index": 1176,
              "file_id": "file-2dtbBZdjtDKS8eqWxqbgDi",
              "filename": "deep_research_blog.pdf"
            },
            {
              "type": "file_citation",
              "index": 1176,
              "file_id": "file-2dtbBZdjtDKS8eqWxqbgDi",
              "filename": "deep_research_blog.pdf"
            }
          ]
        }
      ]
    }
  ]
}
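
Given a response shaped like the JSON above, the file citations can be collected with a small helper. This is a sketch over plain dicts; the Python SDK returns typed objects carrying the same field names:

```python
def extract_citations(output):
    """Collect (filename, index) pairs from file_citation annotations in message items."""
    citations = []
    for item in output:
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            for ann in part.get("annotations", []):
                if ann.get("type") == "file_citation":
                    citations.append((ann["filename"], ann["index"]))
    return citations
```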

Retrieval customization#

Limiting the number of results#

Using the file search tool with the Responses API, you can customize the number of results you want to retrieve from the vector stores. This can help reduce both token usage and latency, but may come at the cost of reduced answer quality.

response = client.responses.create(
    model="gpt-4.1",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["<vector_store_id>"],
        "max_num_results": 2
    }]
)
print(response)

Include search results in the response#

While you can see annotations (references to files) in the output text, the file search call will not return search results by default.

To include search results in the response, you can use the include parameter when creating the response.

response = client.responses.create(
    model="gpt-4.1",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["<vector_store_id>"]
    }],
    include=["file_search_call.results"]
)
print(response)

Metadata filtering#

You can filter the search results based on the metadata of the files. For more details, refer to our retrieval guide, which covers:

  • How to set attributes on vector store files

  • How to define filters

response = client.responses.create(
    model="gpt-4.1",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["<vector_store_id>"],
        "filters": {
            "type": "eq",
            "key": "type",
            "value": "blog"
        }
    }]
)
print(response)

Supported files#

For text/ MIME types, the encoding must be one of utf-8, utf-16, or ascii.

| File format | MIME type |
|---|---|
| .c | text/x-c |
| .cpp | text/x-c++ |
| .cs | text/x-csharp |
| .css | text/css |
| .doc | application/msword |
| .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| .go | text/x-golang |
| .html | text/html |
| .java | text/x-java |
| .js | text/javascript |
| .json | application/json |
| .md | text/markdown |
| .pdf | application/pdf |
| .php | text/x-php |
| .pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation |
| .py | text/x-python |
| .rb | text/x-ruby |
| .sh | application/x-sh |
| .tex | text/x-tex |
| .ts | application/typescript |
| .txt | text/plain |

Retrieval-Augmented Generation (RAG)#

Search your data using semantic similarity.

The Retrieval API allows you to perform semantic search over your data, which is a technique that surfaces semantically similar results — even when they match few or no keywords. Retrieval is useful on its own, but is especially powerful when combined with our models to synthesize responses.

Retrieval depiction

The Retrieval API is powered by vector stores, which serve as indices for your data. This guide will cover how to perform semantic search, and go into the details of vector stores.

1. Create a Vector Store#

  • Create vector store and upload files.

    Vector stores are the containers that power semantic search for the Retrieval API and the file search tool. When you add a file to a vector store it will be automatically chunked, embedded, and indexed.

    Vector stores contain vector_store_file objects, which are backed by a file object.

    | Object type | Description |
    |---|---|
    | file | Represents content uploaded through the Files API. Often used with vector stores, but also for fine-tuning and other use cases. |
    | vector_store | Container for searchable files. |
    | vector_store.file | Wrapper type representing a file that has been chunked, embedded, and associated with a vector_store. Contains an attributes map used for filtering. |

Create vector store with files

from openai import OpenAI
client = OpenAI()

vector_store = client.vector_stores.create(        # Create vector store
    name="Support FAQ",
)

client.vector_stores.files.upload_and_poll(        # Upload file
    vector_store_id=vector_store.id,
    file=open("customer_policies.txt", "rb")
)

2. Semantic Search Query#

Semantic search is a technique that leverages vector embeddings to surface semantically relevant results. Importantly, this includes results with few or no shared keywords, which classical search techniques might miss.

For example, let’s look at potential results for "When did we go to the moon?":

| Text | Keyword Similarity | Semantic Similarity |
|---|---|---|
| The first lunar landing occurred in July of 1969. | 0% | 65% |
| The first man on the moon was Neil Armstrong. | 27% | 43% |
| When I ate the moon cake, it was delicious. | 40% | 28% |

(Jaccard used for keyword, cosine with text-embedding-3-small used for semantic.)

Notice how the most relevant result contains none of the words in the search query. This flexibility makes semantic search a very powerful technique for querying knowledge bases of any size.
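A keyword-overlap score like the one in the table can be computed with Jaccard similarity over token sets. This is a sketch using naive lowercase word tokenization, so its scores won't exactly match the table above:

```python
import re

def tokens(s: str) -> set:
    """Naive tokenizer: lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def jaccard(a: str, b: str) -> float:
    """Keyword overlap: |intersection| / |union| of the two token sets."""
    sa, sb = tokens(a), tokens(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

query = "When did we go to the moon?"
print(jaccard(query, "The first lunar landing occurred in July of 1969."))
print(jaccard(query, "When I ate the moon cake, it was delicious."))
```

The second sentence scores higher on keyword overlap despite being irrelevant, which is exactly the failure mode semantic search avoids.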

Semantic search is powered by vector stores.

Send search query to get relevant results.

user_query = "What is the return policy?"

results = client.vector_stores.search(
    vector_store_id=vector_store.id,
    query=user_query,
)

To learn how to use the results with OpenAI models, check out the synthesizing responses section.

Performing semantic search

You can query a vector store using the search function and specifying a query in natural language. This will return a list of results, each with the relevant chunks, similarity scores, and file of origin.

Search query

results = client.vector_stores.search(
    vector_store_id=vector_store.id,
    query="How many woodchucks are allowed per passenger?",
)

Results

{
  "object": "vector_store.search_results.page",
  "search_query": "How many woodchucks are allowed per passenger?",
  "data": [
    {
      "file_id": "file-12345",
      "filename": "woodchuck_policy.txt",
      "score": 0.85,
      "attributes": {
        "region": "North America",
        "author": "Wildlife Department"
      },
      "content": [
        {
          "type": "text",
          "text": "According to the latest regulations, each passenger is allowed to carry up to two woodchucks."
        },
        {
          "type": "text",
          "text": "Ensure that the woodchucks are properly contained during transport."
        }
      ]
    },
    {
      "file_id": "file-67890",
      "filename": "transport_guidelines.txt",
      "score": 0.75,
      "attributes": {
        "region": "North America",
        "author": "Transport Authority"
      },
      "content": [
        {
          "type": "text",
          "text": "Passengers must adhere to the guidelines set forth by the Transport Authority regarding the transport of woodchucks."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}

A response contains at most 10 results by default, but you can request up to 50 using the max_num_results parameter.
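Results beyond one page can be walked using the has_more and next_page fields shown in the JSON above. A sketch over plain dicts, with fetch_page standing in for whatever call retrieves a single page (a hypothetical helper, not an SDK method):

```python
def collect_all_results(fetch_page):
    """Accumulate result entries across pages, following has_more/next_page cursors."""
    results, cursor = [], None
    while True:
        page = fetch_page(cursor)        # returns a dict shaped like the search-results JSON
        results.extend(page["data"])
        if not page["has_more"]:
            return results
        cursor = page["next_page"]
```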

Synthesizing responses#

After performing a query you may want to synthesize a response based on the results. You can leverage our models to do so, by supplying the results and original query, to get back a grounded response.

Perform search query to get results

from openai import OpenAI

client = OpenAI()

user_query = "What is the return policy?"

results = client.vector_stores.search(
    vector_store_id=vector_store.id,
    query=user_query,
)

Synthesize a response based on results


formatted_results = format_results(results)

completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {
            "role": "developer",
            "content": "Produce a concise answer to the query based on the provided sources."
        },
        {
            "role": "user",
            "content": f"Sources: {formatted_results}\n\nQuery: '{user_query}'"
        }
    ],
)

print(completion.choices[0].message.content)
# "Our return policy allows returns within 30 days of purchase."

This uses a sample format_results function, which could be implemented like so:

Sample result formatting function

def format_results(results):
    formatted_results = ''
    for result in results.data:
        formatted_result = f"<result file_id='{result.file_id}' file_name='{result.filename}'>"
        for part in result.content:
            formatted_result += f"<content>{part.text}</content>"
        formatted_results += formatted_result + "</result>"
    return f"<sources>{formatted_results}</sources>"

Key ideas:

• You declare a tool schema; the model emits a function call with arguments.

• You run the function and send a tool response back for the model to finish the thought.

Details: function tools / arguments / streaming formats are in the API reference.

End-to-end example#


# pip install openai numpy
import os, json, math, numpy as np
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# --- Tools ---
def calc(expression: str) -> str:
    # NOTE: eval is only lightly sandboxed here; don't run untrusted input in production
    allowed = {"pi": math.pi, "e": math.e, "sin": math.sin, "cos": math.cos, "sqrt": math.sqrt}
    return str(eval(expression, {"__builtins__": {}}, allowed))

CORPUS = [
  "Furman University is in Greenville, SC.",
  "CSC-475 is an applied data science capstone with BMW collaboration.",
  "Embeddings enable semantic search and RAG."
]
E_MODEL = "text-embedding-3-small"
# One batched embeddings request for the whole corpus
EMB = np.vstack([d.embedding for d in client.embeddings.create(model=E_MODEL, input=CORPUS).data])

def rag_search(query: str, k=3) -> str:
    qv = np.array(client.embeddings.create(model=E_MODEL, input=query).data[0].embedding)
    sims = EMB @ qv / (np.linalg.norm(EMB, axis=1) * np.linalg.norm(qv) + 1e-9)  # cosine similarity
    idx = np.argsort(-sims)[:k]
    return "\n".join(f"- {CORPUS[i]}" for i in idx)

# Responses API function tools use a flat schema: type/name/description/parameters
TOOLS = [
  {"type": "function", "name": "calc",
   "description": "Evaluate a math expression.",
   "parameters": {"type": "object",
                  "properties": {"expression": {"type": "string"}},
                  "required": ["expression"]}},
  {"type": "function", "name": "rag_search",
   "description": "Retrieve helpful facts for a user question.",
   "parameters": {"type": "object",
                  "properties": {"query": {"type": "string"}},
                  "required": ["query"]}}
]

def step(user_text: str):
    input_list = [{"role": "user", "content": user_text}]
    r = client.responses.create(model="gpt-4.1", tools=TOOLS, input=input_list)
    while True:
        calls = [item for item in r.output if item.type == "function_call"]
        if not calls:
            print(r.output_text)
            break
        input_list += r.output                      # keep the function_call items in context
        for item in calls:
            args = json.loads(item.arguments)
            result = calc(**args) if item.name == "calc" else rag_search(**args)
            input_list.append({
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": result,
            })
        r = client.responses.create(model="gpt-4.1", tools=TOOLS, input=input_list)

if __name__ == "__main__":
    step("Using RAG, tell me what CSC-475 is.")
    step("Compute (sin(pi/2)**2 + cos(0)**2). Use the calculator.")

Remote MCP#

Use connectors and remote MCP (Model Context Protocol) servers to give models new capabilities.



In addition to tools you make available to the model with function calling, you can give models new capabilities using connectors and remote MCP servers. These tools give the model the ability to connect to and control external services when needed to respond to a user’s prompt. These tool calls can either be allowed automatically, or restricted with explicit approval required by you as the developer.

  • Connectors are OpenAI-maintained MCP wrappers for popular services such as Google Workspace or Dropbox, similar to the connectors available in ChatGPT.

  • Remote MCP servers are any servers on the public Internet that implement the Model Context Protocol (MCP).

This guide will show how to use both remote MCP servers and connectors to give the model access to new capabilities.

Check out the examples below to see how remote MCP servers and connectors work through the Responses API. Both connectors and remote MCP servers can be used with the mcp built-in tool type.

Remote MCP servers require a server_url. Depending on the server, you may also need an OAuth authorization parameter containing an access token.

from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
    model="gpt-5",
    tools=[
        {
            "type": "mcp",
            "server_label": "dmcp",
            "server_description": "A Dungeons and Dragons MCP server to assist with dice rolling.",
            "server_url": "https://dmcp-server.deno.dev/sse",
            "require_approval": "never",
        },
    ],
    input="Roll 2d4+1",
)

print(resp.output_text)

Connectors require a connector_id parameter, and an OAuth access token provided by your application in the authorization parameter.

from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
    model="gpt-5",
    tools=[
        {
            "type": "mcp",
            "server_label": "Dropbox",
            "connector_id": "connector_dropbox",
            "authorization": "<oauth access token>",
            "require_approval": "never",
        },
    ],
    input="Summarize the Q2 earnings report.",
)

print(resp.output_text)

The API will return new items in the output array of the model response. If the model decides to use a connector or MCP server, it will first make a request to list the available tools from the server, which creates an mcp_list_tools output item. For the simple remote MCP server example above, it contains only one tool definition:

{
    "id": "mcpl_68a6102a4968819c8177b05584dd627b0679e572a900e618",
    "type": "mcp_list_tools",
    "server_label": "dmcp",
    "tools": [
        {
            "annotations": null,
            "description": "Given a string of text describing a dice roll...",
            "input_schema": {
                "$schema": "https://json-schema.org/draft/2020-12/schema",
                "type": "object",
                "properties": {
                    "diceRollExpression": {
                        "type": "string"
                    }
                },
                "required": ["diceRollExpression"],
                "additionalProperties": false
            },
            "name": "roll"
        }
    ]
}

Listing available tools#

When you specify a remote MCP server in the tools parameter, the API will attempt to get a list of tools from the server. The Responses API works with remote MCP servers that support either the Streamable HTTP or the HTTP/SSE transport protocols.

If successful in retrieving the list of tools, a new mcp_list_tools output item will appear in the model response output. The tools property of this object will show the tools that were successfully imported.
