Training custom knowledge

Is it possible to add your own knowledge base to a custom model via the API?

What I want is to provide a set of news articles as text so that the model knows about the events they cover, and I can then query it about those events.

It seems fine-tuning or embeddings aren't what I'm looking for.

Hi @law

Welcome to the community.

Is the knowledge base you are trying to add to the model dynamic, meaning does it keep changing? If so, how often?

Hi!

I'm looking to add perhaps 100k of text daily.

Basically, adding news stories from the web so it can keep up to date with what's going on.

It doesn't change; new text is just appended to the existing knowledge base.

Very interesting.

It sounds like you are trying to build something similar to WebGPT and the new Bing.

Fine-tuning wouldn’t work since the data is real-time, and generating embeddings every day would get expensive. Even then, the model wouldn’t be able to carry knowledge of breaking-news-level events.

Not to mention curating that kind of data isn’t going to be easy.

However, I may have a rough idea of how you can hack a prototype. Here’s the outline:

  1. You’ll need access to a search API.
  2. Every user message will first have to trigger a search.
  3. The results semantically closest to the user query will have to be “opened” and the contents retrieved.
  4. Then embeddings can be obtained for the content, and the completions endpoint can be used to generate a response.
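
To make the outline a bit more concrete, here is a rough Python sketch of steps 1 to 4. Everything in it is a stand-in: `search` is a hypothetical stub that returns canned articles instead of calling a real search API, `embed` is a toy word-count vector rather than a real embedding model, and the final call to a completions endpoint is left out, so the sketch only builds the prompt that would be sent. Treat it as an illustration of the flow, not a working product.

```python
import math
import re
from collections import Counter


def search(query):
    """Hypothetical search API stub (assumption): returns canned article texts."""
    return [
        "The city marathon was won by a local runner on Sunday.",
        "The central bank raised interest rates by a quarter point.",
        "A new bridge opened to traffic after two years of construction.",
    ]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, passages):
    """Assemble the text that would be sent to a completions endpoint."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


question = "Who won the city marathon?"
results = search(question)               # steps 1-2: every message triggers a search
passages = top_k(question, results, 1)   # step 3: keep the closest results
prompt = build_prompt(question, passages)  # step 4: prompt for the completions call
print(prompt)
```

In a real prototype the two stubs would be swapped for an actual search API and an embedding model, and `prompt` would be passed to the completions endpoint.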

This is still very hypothetical, but quite intriguing.
