Is it possible to set weights for different parts of the text sent to the Embedding API?

dogao · December 21, 2022, 9:44pm

Is it possible to set weights for different parts of the text sent to the Embedding API?

I want to compare multiple vectors returned from the Embedding API, but I want the comparison to emphasize certain parts of the text I send, because I believe those parts are more defining for each entity. I believe this is not possible, right?

Moreover, I would like your opinions on what I am trying to achieve. Does it make sense? Should I leave it to GPT to automatically emphasize what is most defining in each text? Should this be approached by testing and comparing the results?

nelson · December 22, 2022, 4:27am

To my knowledge, you cannot change the emphasis in the Embedding API, because all it does is return a vector representation of the text from a pre-trained model.

So to change the emphasis, you would have to train your own model. See "Training Overview" in the Sentence-Transformers documentation.

But in most cases, this is not necessary. What is your use case?

dogao · December 22, 2022, 1:42pm

Thanks for the info. I get it.

Another option I thought of would be to take the specific term I want to emphasize and get a separate embedding vector for it. Then I would join the two vectors (the original plus the one for the emphasized text), either by summing or by appending. Any thoughts on this idea?

My case is simply to find the other vectors closest to my original vector. However, I want to emphasize certain elements of the full body text, so that the specific text is weighted more heavily when finding the closest vector.
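
A minimal sketch of the two ways of joining the vectors described above (summing and appending), written in plain Python. The function names and the `weight` parameter are hypothetical, and the vectors in the usage note are toy values rather than real Embedding API output. Whether either variant actually improves the matches is something to verify by testing on your own data.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def weighted_sum(full_vec, key_vec, weight=0.3):
    """Summing: blend the full-text vector with the emphasized-text vector.

    `weight` (an assumed tuning knob, 0..1) is the share given to the
    emphasized text. The result has the same dimension as the inputs.
    """
    full_vec, key_vec = normalize(full_vec), normalize(key_vec)
    mixed = [(1 - weight) * f + weight * k for f, k in zip(full_vec, key_vec)]
    return normalize(mixed)


def weighted_concat(full_vec, key_vec, weight=0.3):
    """Appending: concatenate the two vectors after scaling each half.

    With square-root scaling, the cosine similarity of two concatenated
    vectors equals (1 - weight) * cos(full parts) + weight * cos(key parts).
    The result has twice the dimension of the inputs.
    """
    full_vec, key_vec = normalize(full_vec), normalize(key_vec)
    a, b = math.sqrt(1 - weight), math.sqrt(weight)
    return [a * f for f in full_vec] + [b * k for k in key_vec]
```

For example, with toy vectors where two entities have unrelated full texts (`[1, 0, 0]` and `[0, 0, 1]`) but the same emphasized term (`[0, 1, 0]`), `cosine(weighted_concat(...), weighted_concat(...))` is driven entirely by the shared term, and raising `weight` raises the similarity. Every stored vector and the query vector must be built with the same function and the same `weight`, otherwise the distances are not comparable.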

nelson · December 22, 2022, 2:16pm

Ya, I see what you mean.
I was thinking this might be useful for you, but you might have to make minor changes.
Let me know how it goes.

dogao · December 23, 2022, 1:58pm

@nelson you are great! ; )

I could not quite understand the relation between the Embeddings API and completion in the spreadsheet. To me they seem like two different and independent tasks.

nelson · December 23, 2022, 5:24pm

@dogao That is correct, they are two different tasks; you can use them independently or together.
The embeddings feature is used to search text, so if you have a lot of content you can use the gsheet/data tab as a database. The completion feature only calls the API, and by using gsheet you can chain API requests and responses together to create your own workflow.
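
As a rough illustration of the search use described above, here is a sketch that treats a list of (text, embedding) rows as the "database" and ranks them against a query vector by cosine similarity. The `search` function and the row layout are assumptions made for illustration; they are not taken from the spreadsheet itself, and the vectors would normally come from the Embedding API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def search(query_vec, rows, top_k=3):
    """Return the top_k (score, text) pairs most similar to query_vec.

    rows: list of (text, embedding) pairs, e.g. one pair per sheet row.
    """
    scored = [(cosine(query_vec, emb), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

The best-matching texts returned by `search` could then be placed into a completion prompt, which is one way the two features can be chained together.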
