File Object without uploading to OpenAI

TomMurray · February 23, 2024, 7:20pm

Is it possible to convert a dataset loaded in-memory to an OpenAI file object, and allow an assistant to access the file in messages without uploading this file to OpenAI? The idea is to keep data on an internal database and never upload to OpenAI for increased security.

merefield · February 23, 2024, 8:45pm

No, I don’t believe so.

Why not use a function callback from the LLM and implement your search in local code?
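
A minimal sketch of that idea, assuming a hypothetical `search_records` function and an in-memory `RECORDS` list standing in for the internal database. The tool definition follows the usual function-tool JSON schema shape; every name and field here is made up for illustration.

```python
import json

# Hypothetical in-memory data standing in for an internal database.
RECORDS = [
    {"id": 1, "region": "north", "sales": 120},
    {"id": 2, "region": "south", "sales": 95},
    {"id": 3, "region": "north", "sales": 80},
]

# Function tool definition to register on the assistant (illustrative names).
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_records",
        "description": "Return rows from the internal dataset for a region.",
        "parameters": {
            "type": "object",
            "properties": {
                "region": {"type": "string", "description": "Region to filter on."}
            },
            "required": ["region"],
        },
    },
}


def search_records(region: str) -> list[dict]:
    """Run the search in local code and return only the matching rows."""
    return [row for row in RECORDS if row["region"] == region]


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a tool call requested by the model and return a JSON string."""
    if name != "search_records":
        raise ValueError(f"Unknown tool: {name}")
    args = json.loads(arguments_json)
    return json.dumps(search_records(**args))
```

The string returned by `handle_tool_call` is what you would hand back as the tool output for the run. Only the rows the function returns are sent, not the whole dataset.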

TomMurray · February 23, 2024, 9:08pm

Just to make sure I understand, you are suggesting to use Function Calling to access the data I need when prompted, and pass that output to the assistant run?

Will code interpreter be able to use this output and generate / execute code against it? The goal is to use the assistant to analyze data (e.g. basic operations in Pandas).

merefield · February 23, 2024, 9:28pm

That’s correct.

You can experiment with different formats of the results set and send this to the assistant together with a prompt for how you’d like the assistant to process the data.
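
As a rough illustration of trying different formats, here is a sketch that renders the same result set as JSON records and as CSV text, then wraps it in a prompt. The sample rows, the helper names, and the prompt wording are all assumptions for the example.

```python
import csv
import io
import json

# Hypothetical result set returned by a local query.
rows = [
    {"product": "A", "units": 3, "price": 9.5},
    {"product": "B", "units": 1, "price": 20.0},
]


def to_json_records(rows: list[dict]) -> str:
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(rows)


def to_csv_text(rows: list[dict]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def build_message(data_text: str, fmt: str) -> str:
    """Combine the data with instructions on how to process it."""
    return (
        f"The data below is in {fmt} format. "
        "Compute total revenue (units * price) per product.\n\n"
        f"{data_text}"
    )
```

CSV is usually more compact for tabular data, while JSON keeps types and nesting explicit; which one the model handles better is something to test on your own data.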

TomMurray · February 23, 2024, 9:35pm

That is great to know. If I needed to load a particular dataset depending on the situation, is there a way to guarantee the function loads the correct one? Or at least raise an error if it does not load the correct one?

To make this concrete, let’s say I have User A and User B, and I want to enable User A to ask questions of dataset A, and User B to ask questions of dataset B. In my function I’ll have to have some logic that says, “This is User A, so let’s load in dataset A”.

I want to be able to ensure the correct loading of the dataset, given I know the relationship between User and dataset in advance.

merefield · February 23, 2024, 9:47pm

In your function you might want to take a parameter that tells you which user it is being called for.

You can then upload your specific results dataset from within your code to a Thread using this endpoint:

https://platform.openai.com/docs/api-reference/files/create

Get the file ID and then refer to it in your next message.

OR

You could respond with a large message that includes the data result in, e.g., JSON format and see how well the language model processes it.
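
A minimal sketch of the user parameter combined with the JSON response, assuming the user-to-dataset relationship is known in advance. `USER_DATASETS`, `DATASETS`, and both function names are hypothetical, and the in-memory dictionaries stand in for the internal database.

```python
import json

# Hypothetical mapping known in advance: which dataset each user may query.
USER_DATASETS = {"user_a": "dataset_a", "user_b": "dataset_b"}

# Hypothetical in-memory datasets standing in for the internal database.
DATASETS = {
    "dataset_a": [{"month": "Jan", "revenue": 100}],
    "dataset_b": [{"month": "Jan", "revenue": 250}],
}


def load_dataset_for_user(user_id: str) -> list[dict]:
    """Return the dataset mapped to this user, or raise if there is none."""
    try:
        dataset_name = USER_DATASETS[user_id]
    except KeyError:
        raise PermissionError(f"No dataset is mapped to user {user_id!r}") from None
    return DATASETS[dataset_name]


def tool_output_for_user(user_id: str) -> str:
    """Build the JSON string to send back as the function result."""
    return json.dumps({"user": user_id, "rows": load_dataset_for_user(user_id)})
```

In this sketch `user_id` is supplied by the calling code. It may be safer to take it from your own application session than from an argument the model fills in, so the lookup cannot be steered to another user's dataset.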

TomMurray · February 23, 2024, 10:17pm

Ah yeah, so I am trying to avoid uploading a file to OpenAI due to privacy concerns. I will try returning JSON from my function and seeing how well the assistant can process it.

Will the assistant automatically be able to use code interpreter after being given the data?

merefield · February 23, 2024, 10:23pm

Either way you are still sharing data with OpenAI, even though the second option (data in a message rather than an uploaded file) is less persistent.

Yes, I believe you could prompt it to do that, see:

https://platform.openai.com/docs/assistants/tools/enabling-code-interpreter

TomMurray · February 23, 2024, 10:33pm

Thanks for all your help! I was hoping the data being less persistent would be better security-wise, although I suppose OpenAI will be storing it for some time either way.

merefield · February 23, 2024, 11:04pm

This might be worth a read:

