Embeddings on partial text

sniperj · December 20, 2023, 11:12am

I want to improve the indexing of some documents I have. My theory is that one issue is trying to fit a whole document into one vector, so I want to play with the granularity. So my question is:
Can I get the embedding of a partial text while retaining the context of the whole document?

thanks!

_j · December 20, 2023, 1:13pm

Really, you can send any text you want to the embeddings endpoint and get some sort of semantically-based vector back.

There are different techniques, but some of them can bias one document over another, or one length of text over another.

You can certainly consider other ways of “databasing” your document, and many do:

make smaller chunks and obtain an embedding for each, with each chunk referring back to the whole document,
make smaller chunks and average their vectors (optionally weighted) to make a single reference for the document,
add summary information about the whole document, followed by more detail from a section,
(insert where you have more imagination than me…)
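
A rough sketch of how the first three options could look in Python. Everything named here is hypothetical: `chunk_text`, `mean_vector`, and `index_document` are illustrative helpers, and `toy_embed` is only a deterministic stand-in so the example runs offline. In practice you would pass a function that calls whatever embeddings endpoint you use. The chunk size, overlap, and word-count weighting are assumptions, not recommendations.

```python
import math
import zlib
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = List[float]


def normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def toy_embed(text: str, dim: int = 16) -> Vector:
    """Stand-in embedder (hashed bag of words). NOT a real embedding model."""
    v = [0.0] * dim
    for word in text.lower().split():
        v[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return normalize(v)


def chunk_text(text: str, chunk_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of words."""
    if overlap >= chunk_words:
        raise ValueError("overlap must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def mean_vector(vectors: Sequence[Vector],
                weights: Optional[Sequence[float]] = None) -> Vector:
    """Weighted average of same-length vectors, re-normalized to unit length."""
    if not vectors:
        raise ValueError("need at least one vector")
    if weights is None:
        weights = [1.0] * len(vectors)
    total = float(sum(weights))
    dim = len(vectors[0])
    avg = [sum(w * v[i] for v, w in zip(vectors, weights)) / total
           for i in range(dim)]
    return normalize(avg)


def index_document(doc_id: str, text: str,
                   embed: Callable[[str], Vector],
                   summary: str = "") -> Tuple[List[Dict], Vector]:
    """Embed each chunk (optionally prefixed with a document summary) and
    also build one averaged vector for the whole document."""
    records = []
    for i, chunk in enumerate(chunk_text(text)):
        # Option 3: put whole-document summary text in front of the chunk.
        embed_input = f"{summary}\n\n{chunk}" if summary else chunk
        records.append({
            "doc_id": doc_id,        # option 1: each chunk points at its document
            "chunk_index": i,
            "text": chunk,
            "vector": embed(embed_input),
        })
    # Option 2: one vector per document, weighting chunks by word count.
    doc_vector = mean_vector(
        [r["vector"] for r in records],
        weights=[len(r["text"].split()) for r in records],
    )
    return records, doc_vector
```

Calling `index_document("doc-1", text, toy_embed, summary="...")` returns the per-chunk records plus the averaged document vector. Which of the two you search against (or both) is exactly the granularity trade-off being asked about, and the summary prefix is one cheap way to carry whole-document context into each chunk's vector.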
