Is it possible to fine-tune the embedding model?

At a high level I understand what you are saying: you need high scores on semantic meaning, not just word overlap. Got it. Then you say you can achieve this with a neural network (two-tower). Got it. Then you say the fine-tuned embedding is the output of your NN. Got it. All of this is fine and good, and it doesn't require directly fine-tuning the original embedding engine, since you are creating the new embeddings at the output of your own NN. I think you answered your own question: yes, you can create a fine-tuned embedding, produced by the output of your own neural net. Totally feasible and makes sense. But you can't upload some training file to the OpenAI API for text-embedding-ada-002 and get the same thing, which is what I thought your original post was about.
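Just to make that concrete, here is a minimal sketch of the two-tower idea on top of frozen ada-002 vectors. Everything in it (PyTorch, the layer sizes, the in-batch-negatives loss, the fake batch) is a hypothetical placeholder I'm assuming for illustration, not anything the API gives you.

```python
# Minimal two-tower sketch (assumptions: PyTorch, 1536-d frozen ada-002
# embeddings already computed for (query, relevant-doc) pairs).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """Small MLP that maps a frozen base embedding to a new 'fine-tuned' one."""
    def __init__(self, dim_in=1536, dim_out=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 512),
            nn.ReLU(),
            nn.Linear(512, dim_out),
        )

    def forward(self, x):
        # L2-normalize so dot products behave like cosine similarities
        return F.normalize(self.net(x), dim=-1)

query_tower, doc_tower = Tower(), Tower()
optimizer = torch.optim.Adam(
    list(query_tower.parameters()) + list(doc_tower.parameters()), lr=1e-4
)

# Hypothetical batch: 32 precomputed ada-002 vectors for queries and their matching docs
q_base = torch.randn(32, 1536)
d_base = torch.randn(32, 1536)

# In-batch negatives: each query's positive is its own row, every other row is a negative
q = query_tower(q_base)          # (32, 256)
d = doc_tower(d_base)            # (32, 256)
logits = q @ d.T / 0.05          # temperature-scaled similarity matrix
labels = torch.arange(len(q))    # diagonal entries are the matching pairs
loss = F.cross_entropy(logits, labels)

loss.backward()
optimizer.step()

# After training, query_tower(q_base) is your "fine-tuned" embedding.
```

The point is that the base model never changes; your own network is what gets trained, and its output is what you index and search.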

And FYI, you can improve the geometry of the embeddings too; I did this in this thread: Some questions about text-embedding-ada-002’s embedding - #42 by curt.kennedy

The approach there removes the mean embedding vector and uses PCA to reduce the dimensions and increase the spread, without altering the meaning too much.
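Something along these lines, as a rough sketch (assuming numpy + scikit-learn, embeddings stacked as rows of a matrix; the component count is just a placeholder):

```python
import numpy as np
from sklearn.decomposition import PCA

def postprocess(embeddings, n_components=256):
    """Center the embeddings, then PCA down to fewer, better-spread dimensions."""
    X = np.asarray(embeddings, dtype=np.float64)
    X = X - X.mean(axis=0)                                 # remove the common mean vector
    X = PCA(n_components=n_components).fit_transform(X)    # reduce dims and spread the points out
    # re-normalize so cosine similarity / dot product behaves as before
    return X / np.linalg.norm(X, axis=1, keepdims=True)
```

In practice you would fit the mean and the PCA once on your corpus and then apply that same transform to incoming query embeddings, otherwise the query and document vectors end up in different spaces.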

So yeah, post-processing of the embeddings is certainly advised and encouraged in certain situations.
