Can Assistant Threads be stored on our server instead of at OpenAI

Can an Assistant access message threads that are not stored at OpenAI but stored in our MongoDB? We ask due to sensitive data.
Or is there no way to use the Assistants functionality without storing the data at OpenAI?


Technically, yes, but not officially or simply. The way to do it is to create your own assistant using OpenGPTs from LangChain: GitHub - langchain-ai/opengpts.

This is incorrect and will mislead people.

The answer is no.

Assistants and chat/completions are very different endpoints with very different behavior.

Please don’t tell people “yes,” when the answer is an unequivocal “no.”


Hmm, I think you haven’t quite understood, nor have you read what I posted correctly. What I said was that officially it’s not possible, but it is possible to create a custom personal assistant workflow with LangChain, obviously with a higher degree of complexity in exchange for more control and freedom. If you want, check their documentation and verify it for yourself.

It is not possible.

Stop posting misinformation.

The question is whether Assistant threads can be stored on your own server instead of at OpenAI.

The answer is, “no.”

Full stop.

Yes, it is possible. I think you are being narrow-minded, and it can be done in several ways. One of them is by creating a custom assistant workflow with OpenGPTs. Another, and this one can be done with the OpenAI assistant, is to create a function that retrieves those threads from MongoDB using some RAG technique.

This does not work with the assistants API.
You can build this on top of the Chat Completions API if you have a zero-retention agreement.
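As a rough sketch of that pattern, the following keeps each thread in your own storage and sends only the current request payload to the provider. A plain dict stands in for MongoDB and a stub function stands in for the Chat Completions call, so the flow stays self-contained; in practice you would swap in pymongo and the official `openai` SDK. All names here are illustrative assumptions, not part of any OpenAI API.

```python
# Sketch: "store threads yourself, call Chat Completions per turn."
# A dict stands in for a MongoDB collection and call_model() stands in for
# client.chat.completions.create(...); both are illustrative assumptions.

THREADS: dict[str, list[dict]] = {}  # thread_id -> message list (your storage)

def call_model(messages: list[dict]) -> str:
    # Placeholder for the Chat Completions call. Only this request payload
    # would ever reach the provider; with a zero-retention agreement it is
    # not stored on their side after processing.
    return f"(reply to: {messages[-1]['content']})"

def ask(thread_id: str, user_message: str) -> str:
    # Load (or create) the thread from our own storage, never from OpenAI.
    history = THREADS.setdefault(thread_id, [])
    history.append({"role": "user", "content": user_message})
    reply = call_model(history)
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("t1", "hello"))      # (reply to: hello)
print(len(THREADS["t1"]))      # 2: user + assistant messages persisted locally
```

The point of the shape is that thread persistence lives entirely on your side; OpenAI only ever sees the transient request.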

@jmarlonbasayo if the data is processed by OpenAI, it will be stored by OpenAI unless you have a zero-retention agreement, which is not available for the Assistants API.

You can explain what you think will work, but I don’t see how the data on OpenAI’s side would suddenly disappear.

This allows you to use Chat Completions and at the same time have a workflow similar to that of the OpenAI Assistants, but obviously it is more complicated and requires more coding. GitHub - langchain-ai/opengpts.

@vikram While we are discussing somewhat related matters: if you need a zero-retention agreement you’ll need to contact sales.

Look, friend, you seem to be misunderstanding.

No one is saying you cannot replicate the Assistants endpoint; we’re saying that’s not the point and it’s off-topic.

The OP asked about using the assistants endpoint with local storage for data privacy. That’s not possible to do.

As @vb has stated, it’s not even possible to do by reinventing assistants through the chat/completions endpoint without a great deal of effort and a zero-retention agreement.

At which point, why not simply use Assistants with a zero-retention agreement?

It’s fine to offer alternatives, but when you write, “Yes, it is possible,” you are doing a disservice to your fellow community members, because it’s simply not true.

You cannot do this officially or unofficially; it’s simply not possible. You can do something else entirely and get a comparable result, but you still cannot, under any circumstances, let an Assistant access message threads that are stored outside of OpenAI.


Thank you for pointing this out. I had read that data coming in through the API is not used for training, and assumed that it’s not retained either. I will reach out to OpenAI regarding supporting our zero-retention use case.


Thank you for the clear and direct response.

Thank you for suggesting GitHub - langchain-ai/opengpts as an alternative solution for our use case. I will look further into it.

In the past, before Assistants came out, I had used a LangChain/Pinecone solution, but I switched to the Assistants API for its simplicity when it was released. More importantly, I found that Assistants gave better results with fewer tokens used compared to our custom approach.

Does anyone know what algorithm Assistants and ChatGPT use for RAG and for allocating tokens between the prompt (X tokens), the current user message (Y tokens), the message history (N tokens), and the relevant retrieval context (M tokens)? Or why OpenAI’s ChatGPT/Assistants perform better than home-grown RAG plus the OpenAI API?

In my old approach, X and Y were of course variable, depending on the prompt size and the user’s latest input. N and M were fixed in length, assuming a large message history and a large number of retrieval files. So every back-and-forth cost (X + Y + N + M) tokens, which got very expensive very quickly, as our use case required a very large number of relatively small back-and-forth messages.

In my opinion, developing my own RAG + chat workflow has an invaluable advantage: I don’t depend on the provider for embeddings, text completion, etc. I can use OpenAI today and Claude tomorrow, or whatever on-premise AI solution fits.

These days, with so many frequent changes in the AI field/market, it’s not a good idea to be married to a single provider.
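One minimal way to express that provider independence in code is to have application logic depend on a small interface rather than any vendor SDK, so swapping OpenAI for Claude or an on-premise model only means writing a new adapter. This is just an illustrative sketch; the `ChatProvider` protocol and all names are assumptions, not any library’s API:

```python
# Sketch of a provider-agnostic chat interface. Application code depends on
# the Protocol below, not on a vendor SDK; each provider is a thin adapter.
from typing import Protocol


class ChatProvider(Protocol):
    def complete(self, messages: list[dict]) -> str: ...


class EchoProvider:
    # Stand-in adapter for the example; a real one would wrap the openai
    # or anthropic SDK behind the same complete() signature.
    def complete(self, messages: list[dict]) -> str:
        return "echo: " + messages[-1]["content"]


def run_turn(provider: ChatProvider, history: list[dict], text: str) -> str:
    # All application logic is written against ChatProvider only.
    history.append({"role": "user", "content": text})
    reply = provider.complete(history)
    history.append({"role": "assistant", "content": reply})
    return reply


history: list[dict] = []
print(run_turn(EchoProvider(), history, "hello"))  # echo: hello
```

Switching providers then means adding one adapter class, with no changes to the chat loop itself.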

I suppose that when Sam Altman stepped out of the company for a “few hours,” most of you were worried about the future of the AI tech stack you already had deployed. Be careful with this!