Using API with a CSV huge table of contents

Hey guys,

I’ve searched a little and found this:

As I’m working on the same concept, I’m trying to feed the OpenAI API a huge table (CSV) that is about 1,322,168 tokens. There’s no point in sending that many tokens with every request (I think the TPM limit is 60K).

How can I save or store this huge table in the model’s memory? It’s like giving it pre-instructions, or having it available as context for any future response.

Thanks!

There is no “memory” to an API language model AI. You provide the entire context from which it must answer into its limited context length.

For a case like that, you will likely need to provide a database query tool along with some education or even fine-tuning of how to make queries to obtain the desired filtered data.


Thanks @_j I’d like to ask for some clarifications if I may.

For a case like that, you will likely need to provide a database query tool along with some education or even fine-tuning of how to make queries to obtain the desired filtered data.

database query tool = a framework? Like SQL, PostgreSQL, etc.?
education = do you mean a set of pre-instructions on how to use the SQL?
fine-tuning = likewise, a set of pre-instructions, or literally training the model with some samples of obtaining the data through queries?

Thank you so much.

Since you can’t load over a million tokens into an AI model, and it wouldn’t be able to “compute” over them to obtain any realistic results, you instead need some way to let the AI obtain just part of that data as knowledge, where a partial return can answer the desired question.
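As a minimal sketch of that idea: filter the table down to just the rows that answer the question, and place only that slice in the prompt. The CSV data and field names here are made up for illustration; the real table would be read from disk.

```python
import csv
import io

# Hypothetical CSV data standing in for the huge table (assumption:
# the real file would be read from disk instead of a string).
CSV_DATA = """model,year,safety_rating
Civic,2021,5
Accord,2020,5
Model 3,2022,5
F-150,2019,4
"""

def rows_matching(csv_text, field, value):
    """Return only the rows whose `field` equals `value`, so a small
    slice of the table can be placed in the prompt instead of all of it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[field] == value]

# Only this matching slice would be sent to the model as context.
matches = rows_matching(CSV_DATA, "year", "2020")
```
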

That is done through function-calling: offering the AI model a function specification, such as the fields that a database contains and the queries it can issue on those fields.

That can be a straightforward, tailored function meant to return particular data (car year and model, with a table of safety ratings, for example). It can also be more open-ended, like a SQL query, but the more you rely on the AI to do something novel, the more it may make errors that don’t return the data needed to answer from.
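A rough sketch of the tailored-function approach, using the Chat Completions “tools” specification format with a parameterized SQLite query behind it. The function name, field names, and sample rows are all hypothetical:

```python
import sqlite3

# Hypothetical tool specification in the Chat Completions "tools" format;
# the function name and fields are illustrative, not from any real schema.
TOOL_SPEC = {
    "type": "function",
    "function": {
        "name": "get_safety_ratings",
        "description": "Look up the safety rating for a car model and year.",
        "parameters": {
            "type": "object",
            "properties": {
                "model": {"type": "string"},
                "year": {"type": "integer"},
            },
            "required": ["model", "year"],
        },
    },
}

def get_safety_ratings(conn, model, year):
    """Run a parameterized query so the AI never writes raw SQL itself."""
    cur = conn.execute(
        "SELECT rating FROM safety WHERE model = ? AND year = ?",
        (model, year),
    )
    row = cur.fetchone()
    return {"model": model, "year": year, "rating": row[0] if row else None}

# In-memory database with one sample row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE safety (model TEXT, year INTEGER, rating INTEGER)")
conn.execute("INSERT INTO safety VALUES ('Civic', 2021, 5)")
result = get_safety_ratings(conn, "Civic", 2021)
```

Keeping the query parameterized (the tailored case) trades flexibility for reliability: the model only supplies arguments, so it can’t produce a malformed query.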

https://platform.openai.com/docs/guides/function-calling


Thanks @_j Great answer.
Just to better understand the logic: if that’s how it works under the hood, then how is the Code Interpreter feature of OpenAI different?
Does it also use functions to pass data to the proper programming language or framework?

Code Interpreter allows the AI to write Python code, which is then executed within a Jupyter notebook.

This can allow it to process data with the code it writes. For example “sample and return the first 10 lines of openai_pricing.csv with python code to see the data format, then produce more python that will process that into a markdown table saved as pricing.md” would give multi-step turns for the AI to perform.
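The second step of that example, turning CSV data into a markdown table, is the kind of code the model might write inside the notebook. A minimal sketch, with made-up stand-in data for openai_pricing.csv:

```python
import csv
import io

# Hypothetical stand-in for openai_pricing.csv; the column names and
# values are made up for illustration.
CSV_TEXT = """model,input_per_1k,output_per_1k
gpt-a,0.01,0.03
gpt-b,0.001,0.002
"""

def csv_to_markdown(csv_text):
    """Convert CSV text into a markdown table, the kind of step the
    model might perform before saving the result as pricing.md."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)

markdown = csv_to_markdown(CSV_TEXT)
```
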

However, what can be returned to the AI from Python execution by OpenAI is more limited, currently 32k characters. That also limits understanding.

Function-calling is just having the AI emit an output similar to an API call to your own code. You must handle the tool_call, perform the task that the function specification promised to the AI, and return the data or status of the function execution.
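The handling side can be sketched as a small dispatcher: take the tool_call the model emitted, run the matching local function, and build the "tool" role message that goes back in the conversation. The function name and stubbed data here are hypothetical; the tool_call dict mimics the shape of a Chat Completions tool call.

```python
import json

# Stubbed local implementation; in real use this would query your data.
def get_safety_ratings(model, year):
    return {"model": model, "year": year, "rating": 5}

# Registry mapping function names (as promised in the tool spec) to code.
FUNCTIONS = {"get_safety_ratings": get_safety_ratings}

def handle_tool_call(tool_call):
    """Dispatch one tool_call to local code and build the 'tool' role
    message to append to the conversation before the next model turn."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = FUNCTIONS[name](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }

# Simulated tool call, shaped like one from an assistant response.
fake_call = {
    "id": "call_1",
    "function": {
        "name": "get_safety_ratings",
        "arguments": '{"model": "Civic", "year": 2021}',
    },
}
tool_message = handle_tool_call(fake_call)
```

After appending this tool message to the message list, you call the model again so it can answer from the returned data.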