Assistants API - way too many "input" tokens used

If you enable certain tools for the Assistants API (function calling, retrieval, and code interpreter), a big wall of text gets added to the system prompt so that the assistant knows the tools are available. It takes up a lot of tokens, especially if you’ve got them all enabled. I experimented with it myself, and it seems like the whole system prompt is sent along with every message that you add to the thread. I assume that’s because the instructions and tool set can be changed while a thread is active, but I’m not sure.

Funnily enough, with some prompting, you can actually get the assistant to spit out the whole system prompt, which is whatever instructions you wrote, followed by this:

# Tools

## python

When you send a message containing Python code to python, it will be executed in a
stateful Jupyter notebook environment. python will respond with the output of the execution or time out after 60.0
seconds. The drive at '/mnt/data' can be used to save and persist user files. Internet access for this session is disabled. Do not make external web requests or API calls as they will fail.

## myfiles_browser

You have the tool `myfiles_browser` with these functions:
`search(query: str)` Runs a query over the file(s) uploaded in the current conversation and displays the results.
`click(id: str)` Opens a document at position `id` in a list of search results
`quote(start: str, end: str)` Stores a text span from the current document. Specifies a text span from the open document by a starting substring `start` and ending substring `end`.
`back()` Returns to the previous page and displays it. Use it to navigate back to search results after clicking into a result.
`scroll(amt: int)` Scrolls up or down in the open page by the given amount.
`open_url(url: str)` Opens the document with the ID `url` and displays it. URL must be a file ID (typically a UUID), not a path.
please render in this format: `【{message idx}†{link text}】`

Tool for browsing the files uploaded by the user.

Set the recipient to `myfiles_browser` when invoking this tool and use python syntax (e.g. search('query')). "Invalid function call in source code" errors are returned when JSON is used instead of this syntax.

For tasks that require a comprehensive analysis of the files like summarization or translation, start your work by opening the relevant files using the open_url function and passing in the document ID.
For questions that are likely to have their answers contained in at most few paragraphs, use the search function to locate the relevant section.

Think carefully about how the information you find relates to the user's request. Respond as soon as you find information that clearly answers the request. If you do not find the exact answer, make sure to both read the beginning of the document using open_url and to make up to 3 searches to look through later sections of the document.


## functions

namespace functions {

// ((if you have any functions, they'll be displayed here))

} // namespace functions

## multi_tool_use

// This tool serves as a wrapper for utilizing multiple tools. Each tool that can be used must be specified in the tool sections. Only tools in the functions namespace are permitted.
// Ensure that the parameters provided to each tool are valid according to that tool's specification.
namespace multi_tool_use {

// Use this function to run multiple tools simultaneously, but only if they can operate in parallel. Do this even if the prompt suggests using the tools sequentially.
type parallel = (_: {
// The tools to be executed in parallel. NOTE: only functions tools are permitted
tool_uses: {
// The name of the tool to use. The format should either be just the name of the tool, or in the format namespace.function_name for plugin and function tools.
recipient_name: string,
// The parameters to pass to the tool. Ensure these are valid according to the tool's own specifications.
parameters: object,
}[],
}) => any;

} // namespace multi_tool_use

You can see why so many tokens get added for every message you send…
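
To put a rough number on it, here is a back-of-the-envelope sketch. It assumes about 4 characters per token (a common rule of thumb for English text, not an exact count) and that the tool text really is resent on every run, which is only what I observed. `estimate_tokens`, `preamble_overhead`, and the 4000-character placeholder are made up for illustration; paste the real dump in to get your own figure.

```python
# Rough sketch: how a fixed tool preamble adds up over a thread.
# Assumption: ~4 characters per token, a rule of thumb for English text.
# Use a real tokenizer if you need exact counts.

CHARS_PER_TOKEN = 4  # heuristic, not an exact figure


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def preamble_overhead(preamble: str, runs: int) -> int:
    """Input tokens spent on the preamble alone if it is resent on every run."""
    return estimate_tokens(preamble) * runs


# Hypothetical stand-in for the injected tool text; paste the real dump here.
tool_preamble = "x" * 4000

for runs in (1, 10, 50):
    print(runs, preamble_overhead(tool_preamble, runs))
```

This only counts the preamble. Your own instructions and the growing message history come on top of it.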

I’d recommend just disabling whatever tools you don’t need, or switching to the Chat Completions API for tasks where you don’t need them.
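
One way to keep yourself honest about this is to build the tools list from explicit flags, so nothing is enabled by accident. This is just a sketch: `build_tools` is a made-up helper, not part of any SDK, and the `type` strings are the ones I believe the API accepted when I wrote this, so check the current docs before relying on them.

```python
# Sketch: build the smallest tools list a given assistant needs.
# Assumption: the "type" strings match what the Assistants API accepted at the
# time of this post. build_tools is a hypothetical helper, not an SDK function.

def build_tools(code_interpreter=False, retrieval=False, functions=None):
    """Return a tools list containing only what was explicitly asked for."""
    tools = []
    if code_interpreter:
        tools.append({"type": "code_interpreter"})
    if retrieval:
        tools.append({"type": "retrieval"})
    for spec in functions or []:
        # Each spec is a function definition dict (name, description, parameters).
        tools.append({"type": "function", "function": spec})
    return tools


# Only code interpreter enabled.
print(build_tools(code_interpreter=True))
# Nothing enabled: an empty list.
print(build_tools())
```

You would then pass the result as the `tools` argument when creating or updating the assistant. With an empty list, presumably none of the tool text above gets injected.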
