Question about Files, Assistants & pre-trained data

Hello All

I have waded through the documentations and perhaps I missed it and would like to know before i go and do some tests. So here goes…

On a particular topic, let us say that OpenAi already has data in terms of statistics, etc. Let us say that, on the same topic, I upload some of my own data that may have some kind of an overlap with the existing data and will have some new data as well

Now, when I use this file with the assistants API, will OpenAI use its own data where my file missed it?
And, how will it handle overlaps? Which data takes precedence?

Thanks a ton, in advance

1 Like

In my experience when working with libraries that had breaking changes in newer versions the model would almost always refer to the training data unless specifically instructed to treat provided input as correct.

This can be done via few-shot prompting or specifically mentioning the relevant documents that should supersede the training data.

In the end you will have to make you own tests but in general the model will use what it has learned first unless instructed differently.

2 Likes