Presently, I am using the gpt-3.5-turbo and gpt-4-turbo models. According to the documentation, these models are trained on data up to December 2023. However, my application requires the model to stay current with the latest data, i.e. from 2024. Is there a way to update these models with the latest data (as of the current date, or a couple of days prior) and use them?
The short answer is no. Training these models requires an immense effort and takes a significant amount of time.
The longer answer is still no, but depending on what you need it to know, you could use a RAG solution like the Assistants API to give it one or two documents. It would be difficult to keep that constantly up to date, though.
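To make the RAG idea concrete, here is a minimal sketch of supplying a fresh document as context in a Chat Completions request. The document text and question are hypothetical placeholders, and the sketch only builds the messages payload rather than calling the API:

```python
def build_grounded_messages(document_text: str, question: str) -> list[dict]:
    """Assemble a chat payload that passes an up-to-date document as context."""
    system = (
        "Answer using the provided context when it is relevant; "
        "fall back on your general knowledge otherwise."
    )
    user = f"Context:\n{document_text}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# Hypothetical document and question, for illustration only
messages = build_grounded_messages(
    "ACME Corp released product X on 2024-03-01.",
    "When was product X released?",
)
```

You would then pass `messages` to the chat endpoint as usual; the point is that the fresh facts travel inside the prompt rather than inside the model's weights.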
I don’t think there’s a way to do this. I had a similar requirement, so I used Python with BeautifulSoup, combined with a Google library, to fetch the latest information about a specific search term, and then used GPT to summarize it.
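The scrape-then-summarize step above can be sketched as follows. This uses the standard-library `html.parser` in place of BeautifulSoup so it has no dependencies; the HTML string is a made-up placeholder standing in for a fetched page:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from <p> tags, ignoring scripts and other markup."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p and data.strip():
            self.chunks.append(data.strip())

# Placeholder HTML; in practice this would come from an HTTP fetch
html = "<html><body><p>Latest release: version 2.4</p><script>x()</script></body></html>"
parser = TextExtractor()
parser.feed(html)
text = " ".join(parser.chunks)
```

The extracted `text` is what you would then hand to GPT with a "summarize this" prompt.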
In applications like this, you need to prepare your data in advance and use the RAG (Retrieval-Augmented Generation) method to bring in new information.
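The retrieval half of RAG can be as simple as ranking your pre-prepared chunks against the query; production systems use embeddings, but a toy keyword-overlap version illustrates the idea. The example chunks below are invented for illustration:

```python
def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank pre-prepared text chunks by word overlap with the query
    and return the top k; a stand-in for embedding-based retrieval."""
    query_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

# Hypothetical pre-prepared chunks
chunks = [
    "The 2024 budget increased R&D spending by 12%.",
    "Office hours are 9 to 5 on weekdays.",
]
best = retrieve("What changed in the 2024 budget?", chunks)
```

Whatever `retrieve` returns is then pasted into the prompt as context, which is how the model "acquires" data newer than its training cutoff.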
Fine-tuning is not a suitable method for teaching general knowledge to the model.
The following page might be somewhat helpful for more details on RAG, but it’s not as easy as it sounds.
@trenton.dambrowitz
Thanks for the suggestion. I have one doubt: after being given custom knowledge (one or two documents), will the LLM look only at that provided knowledge, or will it use the provided knowledge as an additional source of information alongside its existing knowledge when generating a response?
In case it only looks at the provided information, please suggest a solution in which the GPT model considers both the provided knowledge and its existing knowledge when answering a prompt, as this is what my application requires.