Purpose of embedding models

I keep seeing people talking about making apps that handle their documents. Isn’t it basically true that the AI only “knows” what it was trained on, plus some small context that is limited to a few hundred words or 4k tokens?

Is sending an embedding of your document actually doing anything useful?


You do not send the embedding.

You use a message from the user to perform a semantic search of your embedded document to locate chunks which have a high likelihood of containing information relevant to answering the question.

You then append those relevant chunks to the context sent to the model in the hopes the model can use them to produce a more correct and appropriate response to the user’s input than it otherwise would be able to.
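To make those two steps concrete, here is a minimal sketch in Python. It is only an illustration under some loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the chunks are invented sample data, and `retrieve` and `build_prompt` are hypothetical helper names, not part of any library. A real app would call an embedding model and usually a vector store instead.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model (assumption): a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, index, k=2):
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, chunks):
    """Append the retrieved chunks to the context that is sent to the model."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Invented sample chunks; in practice these come from splitting your documents.
chunks = [
    "The city council approved the new transit budget on Tuesday.",
    "Local bakery wins regional award for sourdough bread.",
    "The transit budget includes funding for twelve electric buses.",
]

# Embed every chunk once, ahead of time, and keep the vectors next to the text.
index = [(c, embed(c)) for c in chunks]

question = "What does the transit budget fund?"
top = retrieve(question, index, k=2)
prompt = build_prompt(question, top)
print(prompt)  # this text, not the embedding, is what goes to the chat model
```

Note that the embeddings are only used to pick which chunks to include. The model itself only ever sees plain text: the retrieved chunks plus the user's question.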


I was working on a team that was making embedded documents about current events. ChatGPT couldn’t answer any questions about the content of the documents.

In what context would an embedded document produce more relevant answers?

It seems you have a fundamental misunderstanding of how retrieval-augmented generation works. This might be a good place to start,

Think of it this way: your brain knows everything you learned back in your uni days. But if you need to know something new, you have to look it up (say, in a book at the library). That lookup step is what RAG (Retrieval-Augmented Generation) does, and embeddings help with it: they catalogue the information so the relevant parts are easy to find.

As @elmstedt said, the above link is a great place to start to effectively add more information to your LLM.


Thanks, I’ll check it out.

So are retrieval augmentation and semantic search able to make GPT answer questions about the document or not?