How can I use embeddings with GPT-3.5 Turbo?

Hello everyone,

Using the Embeddings API with Davinci was straightforward: you simply added the embeddings results to the prompt parameter along with the chat history, the user's question, and so on. However, with the new format of the Chat Completions API, I'm having trouble figuring out how to do it. I tried adding a new object like this: {"role": "context", "content": <embeddings results>} in the 'messages' parameter, but this causes the request to fail with a 400 error.

Does anyone know how to properly use embeddings to expand the model’s knowledge with GPT-3.5 Turbo?

Thank you.


I am also struggling with this. If I find a solution, I will post it here.


I guess one way could be to add it as a user request of the sort:
{"role": "user", "content": "Context: <embedding results>\n\n\n\nQuestion: <user's last question>"}

But this feels a bit hacky, and I'm wondering if there is a better way of doing it.


This is how I did it:

{"role": "system", "content": "You are a virtual tourism advisor, these are your notes: <here comes the text that helps the chatbot to create answers>"},

Yea, I thought about adding it there as well. However, the API documentation states that "In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message."

Everything comes down to the definition of important instructions I guess. In any case, if there is no formal way of doing it I might use a variation of the above. Thanks


The best resource I have found so far is the OpenAI Cookbook. I am checking whether the method described in this post will work with some tweaking: Question answering using embeddings-based search | OpenAI Cookbook

The idea is not to send the embeddings to GPT-3.5 Turbo. Instead, use the embeddings to find the closest text from the document and then send that as part of the prompt.
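That retrieval step can be sketched roughly as follows (the toy 3-d vectors and the chunk texts are made up; real code would obtain the vectors from the Embeddings API):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_chunk(query_vec: list, chunks: list) -> str:
    """chunks: list of (text, embedding_vector) pairs, precomputed once
    for the document. Returns the text most similar to the query vector."""
    return max(chunks, key=lambda c: cosine_similarity(query_vec, c[1]))[0]

# Toy example with made-up 3-d vectors standing in for real embeddings:
chunks = [("Shipping costs $5.", [0.9, 0.1, 0.0]),
          ("Returns accepted within 30 days.", [0.1, 0.9, 0.0])]
best = closest_chunk([0.8, 0.2, 0.1], chunks)  # → "Shipping costs $5."
```

The selected text (not its vector) is then what goes into the prompt.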


Yea yea, my question is where exactly you put the context (i.e., the closest text found via the embeddings search) in the messages parameter of the Chat Completions API. There are two options so far:

  1. as a user role
  2. as a system role (alongside the instructions)

I am not sure whether either of these options is the official way of doing it. Hell, I am not even sure there is an official way of doing it.

I tried writing it with the "system" role and it worked. On the other hand, I think that if you write it with the "user" role, ChatGPT may respond with "You are right…", and if you write it with the "assistant" role it may reply "As I already said…".
But these are just my thoughts; sooner or later I will find the right solution.


I think I’ll follow your way until a better solution comes up. Thanks

Check this post: ChatGPT API 101 — A Beginner’s Guide | by Skanda Vivek | Mar, 2023 | Towards AI

That's interesting. They pass everything (instructions, context, history) under the "user" role. I wonder if this performs better than also using the "system" and "assistant" roles.

I haven't tried it, but I assume we can use the API the same way we use ChatGPT. It would then look something like this:

{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Read this content and confirm with \"...\" that you understand. Answer all my questions based on this text. TEXT: \"The new Chat API calls gpt-3.5-turbo, the same model used in the ChatGPT product. It's also our best model for many non-chat use cases; we've seen early testers migrate from text-davinci-003 to gpt-3.5-turbo with only a small amount of adjustment needed to their prompts. Learn more about the Chat API in our documentation. Pricing: It's priced at $0.002 per 1K tokens, which is 10x cheaper than the existing GPT-3.5 models.\""},
{"role": "assistant", "content": "I read"},
{"role": "user", "content": "how much does it cost to use the chatgpt api?"}

It works in the Playground, so I assume it can also be used via the API.

Have you found a different way to use embeddings than the role method? What is the best way you have found? I'm having the same problem and am looking for a solution.


Currently I am using the "system" way. I am passing both the instructions and the context through the system message, like so:

{"role": "system", "content": "<Instructions>\n\n\n Context: <context>"}

Although I have not tried all the methods mentioned above, this approach has been consistently effective for my needs.
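For illustration, a minimal sketch of this system-message approach (the instruction text and context string are invented examples; the resulting list is what you would pass as the `messages` parameter):

```python
def make_system_prompt(instructions: str, context: str) -> str:
    """Combine instructions and retrieved context into one system
    message, matching the format shown above."""
    return f"{instructions}\n\n\n Context: {context}"

messages = [
    {"role": "system", "content": make_system_prompt(
        "You are a virtual tourism advisor. Answer only from the context.",
        "The Eiffel Tower is open daily from 9:00 to 23:45.")},
    {"role": "user", "content": "When does the Eiffel Tower open?"},
]
# `messages` is then sent to the Chat Completions API as usual.
```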

It would be nice to receive an official response from someone at OpenAI regarding this question, rather than having to resort to what seem to be non-standard methods.


"In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message." Since the documentation linked above says so, it made more sense for me to use the user role, and it seems I will need to use it exactly like this:

{"role": "user", "content": "Context: <embedding result>\n\n\n Question: <user's last question>"}

If anyone finds a newer method, I would be very glad if they posted it here.


Thanks for the ideas discussed here. The way I understood it is that:

  1. Use embeddings to break knowledge into context-chunks
  2. Find the most relevant context-chunk that corresponds to a query
  3. Pass the context-chunk to gpt-3.5-turbo to generate a human-sounding answer

However, don’t we lose a key feature of 3.5 when we go down this path, i.e. remembrance of the conversation-context? Example: if the first question was “How much is this Adidas shoe?”, the gpt-3.5 will answer: “This shoe costs $35”. But if the follow-on question is “Is it available in blue?”, won’t step #2 fail (i.e. when we want to “find the most relevant context-chunk corresponding to the query”)?

Natively, gpt-3.5 will know what “is it available” means. But once we take on the task of “providing context” to gpt-3.5, doesn’t it become our problem?


It only fails if the software developer fails to properly manage the array of messages: tracking and storing the prior messages in an array, updating that array with new messages, and pruning the entire array based on some pruning strategy.
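One naive pruning strategy (of many possible; real code would usually prune by token count rather than message count) can be sketched like this:

```python
def prune_messages(messages: list, max_turns: int = 6) -> list:
    """Keep the system message(s) plus the most recent `max_turns`
    non-system messages, dropping older conversation turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]

# Build a toy history: one system message plus ten user turns.
history = [{"role": "system", "content": "You are a shoe-store assistant."}]
history += [{"role": "user", "content": f"q{i}"} for i in range(10)]

pruned = prune_messages(history, max_turns=3)
# pruned keeps the system message and the last three user turns (q7, q8, q9)
```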



The issue is not in passing the right array. As part of the system/user message content that we pass to gpt-3.5, we also need to pass the embedding-search output.

For a question like "Is it available in blue?", the embedding-search output may be of poor quality, because the question (which contains only a pronoun) may not match anything relevant in the document (or may match several passages). Passing that to gpt-3.5-turbo might result in GIGO (garbage in, garbage out).


This I do not understand. Please enlighten me.

Embeddings are created for the purpose of performing linear algebra (like the dot product) with another embedding vector.

These language models cannot perform linear algebra, so why would you send embedding vectors to a language model?
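To make the point concrete: the linear algebra happens entirely on your side, and only the retrieved text goes to the model. A toy dot product between embedding-like vectors (4-d vectors standing in for real 1536-dimensional text-embedding-ada-002 vectors):

```python
def dot(a: list, b: list) -> float:
    """Dot product of two embedding vectors; for unit-normalized
    vectors this equals the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

query = [0.5, 0.5, 0.5, 0.5]
doc_a = [0.5, 0.5, 0.5, 0.5]    # same direction as the query
doc_b = [0.5, -0.5, 0.5, -0.5]  # orthogonal to the query

print(dot(query, doc_a))  # 1.0 — very similar
print(dot(query, doc_b))  # 0.0 — unrelated
```

The vectors themselves never appear in the prompt; they only decide which text does.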

