Embeddings API and the concept - is it working at all?

I have a large volume of documents that I need to make searchable through the OpenAI API, and from everything I have read I understood that the way to do it is to use the OpenAI Embeddings API.

So, I started with a single multi-page document that is all about the specific arrangements for my daughter's college commencement a few years back.

It was a Word document, so I used the Word Object Model to segment the document into paragraphs and fed each paragraph's text to the OpenAI Embeddings API, storing the paragraph text itself as metadata of the produced vector.
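Roughly, the per-paragraph embedding step looks like this (a minimal C# sketch against the raw REST endpoint, not my actual library code; the model name and variable names are just illustrative):

```csharp
// Minimal sketch: embed one paragraph with the OpenAI Embeddings REST endpoint.
// Illustrative only; my real code goes through a C# client library.
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
var paragraphText = "The Commencement ceremony for the Class of 2013 will be held ...";

using var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

var body = JsonSerializer.Serialize(new { model = "text-embedding-ada-002", input = paragraphText });
var response = await http.PostAsync(
    "https://api.openai.com/v1/embeddings",
    new StringContent(body, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
// data[0].embedding is the embedding vector for the paragraph.
float[] vector = doc.RootElement.GetProperty("data")[0].GetProperty("embedding")
                    .EnumerateArray().Select(e => e.GetSingle()).ToArray();
Console.WriteLine($"Got a vector with {vector.Length} dimensions.");   // 1536 for ada-002
```

The returned vector is what goes into Pinecone together with the paragraph text.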

Then, I fed the document's 35 vectors into a Pinecone index (free plan).
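The upsert itself is one REST call per batch of vectors; something like this sketch, assuming the paragraph text goes under a metadata.text field (the index host URL is a placeholder from the Pinecone console):

```csharp
// Minimal sketch: upsert one paragraph vector, with its text as metadata, into Pinecone.
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

var pineconeKey = Environment.GetEnvironmentVariable("PINECONE_API_KEY");
var indexHost = "https://my-index-abc123.svc.us-east-1-aws.pinecone.io";   // placeholder

float[] vector = new float[1536];   // placeholder: the embedding returned by the Embeddings API
var paragraphText = "The Commencement ceremony for the Class of 2013 ...";

using var http = new HttpClient();
http.DefaultRequestHeaders.Add("Api-Key", pineconeKey);

var body = JsonSerializer.Serialize(new
{
    vectors = new[]
    {
        new
        {
            id = "commencement-doc-paragraph-01",
            values = vector,
            metadata = new { text = paragraphText }   // keep the original text alongside the vector
        }
    }
});

var response = await http.PostAsync($"{indexHost}/vectors/upsert",
    new StringContent(body, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();
```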

Next, I took my prompt (the question) and put it through the OpenAI Embeddings API, getting back another vector.

Then, I fed that vector to Pinecone, which returned the 10 best-scoring results.

Then I took the top-scoring vector and retrieved the associated text from its metadata.
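Put together, the retrieval side looks roughly like this (again a hedged C# sketch against the raw REST endpoints; the model name, index host, and metadata field names are assumptions that mirror the steps above):

```csharp
// Sketch: embed the question, query Pinecone for the 10 best matches,
// and pull the paragraph text out of the top match's metadata.
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var openAiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
var pineconeKey = Environment.GetEnvironmentVariable("PINECONE_API_KEY");
var indexHost = "https://my-index-abc123.svc.us-east-1-aws.pinecone.io";   // placeholder
var question = "When the Commencement of Cornish College of Arts was planned?";

// 1. Embed the question.
using var openAiHttp = new HttpClient();
openAiHttp.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", openAiKey);
var embedBody = JsonSerializer.Serialize(new { model = "text-embedding-ada-002", input = question });
var embedResp = await openAiHttp.PostAsync("https://api.openai.com/v1/embeddings",
    new StringContent(embedBody, Encoding.UTF8, "application/json"));
embedResp.EnsureSuccessStatusCode();
using var embedDoc = JsonDocument.Parse(await embedResp.Content.ReadAsStringAsync());
float[] queryVector = embedDoc.RootElement.GetProperty("data")[0].GetProperty("embedding")
                        .EnumerateArray().Select(e => e.GetSingle()).ToArray();

// 2. Query Pinecone for the 10 best-scoring vectors, including their metadata.
using var pineconeHttp = new HttpClient();
pineconeHttp.DefaultRequestHeaders.Add("Api-Key", pineconeKey);
var queryBody = JsonSerializer.Serialize(new { vector = queryVector, topK = 10, includeMetadata = true });
var queryResp = await pineconeHttp.PostAsync($"{indexHost}/query",
    new StringContent(queryBody, Encoding.UTF8, "application/json"));
queryResp.EnsureSuccessStatusCode();

// 3. Take the text stored with the top-scoring match.
using var queryDoc = JsonDocument.Parse(await queryResp.Content.ReadAsStringAsync());
var topMatch = queryDoc.RootElement.GetProperty("matches")[0];
var contextText = topMatch.GetProperty("metadata").GetProperty("text").GetString();
Console.WriteLine(contextText);
```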

Finally, I fed that text into the OpenAI Completion API, but the result was not related to my daughter's college commencement at all; it was about a totally unrelated college…

So, what am I doing wrong, strategically and/or tactically?

EDIT: Or maybe I just need to take the top-scoring vector that Pinecone returns and use its associated text as the answer, without calling the Completion API at all; after all, what does the GPT model know about the commencement in question? That seems like the most logical step… I'm not sure, since it's not what I've read…

Embedding retrieval returns a subset of the data that was embedded originally. When you embed your query you are creating a vector; Pinecone then looks through its index and shows you the top K entries whose stored vectors best match your input vector.

The text used for retrieval is usually generated by asking the AI to produce good vector-database retrieval query text that will get a response answering the question (your question here).

Once you have that text you can then run the embedding retrieval on it.

Now, you could stop right there and show the top 3 highest-scoring entries from your retrieval; that might already satisfy your query. Or you can build a prompt from that data and run a second LLM call to add real-world context to the retrieved text (see the sketch below).
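For instance, stitching the top few matches into a single context block is nothing more than a small helper like this (a rough C# sketch; it assumes each Pinecone match carries its paragraph under metadata.text):

```csharp
using System.Linq;
using System.Text.Json;

static class ContextBuilder
{
    // Stitch the top N Pinecone matches into one context block for a follow-up prompt.
    // Assumes each match stores its paragraph text under metadata.text.
    public static string Build(JsonElement matches, int take = 3) =>
        string.Join("\n\n",
            matches.EnumerateArray()
                   .Take(take)                               // matches come back sorted by score
                   .Select(m => m.GetProperty("metadata").GetProperty("text").GetString()));
}
```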

It can get quite involved, so experimentation and spending time getting a feel for how the things interact is required.

2 Likes

So how is what you describe different from what I already did?

Although, when I re-read your response, I do not understand it, starting with:
The text used for retrieval is usually generated by asking the AI to produce good vector-database retrieval query text that will get a response answering the question (your question here).

What you described is text that does not sound like it came from your original corpus, which means it must have been generated by the AI, or the corpus contains information about colleges that is not relevant and your query text was insufficiently specific.

I am suggesting that you either show only the retrieval result and skip the further Chat Completions API call, or that you take advantage of the AI prior to making the retrieval call, so that you can create a more specific query to use.

  1. The document does not contain info about other colleges…
  2. The document text starts with:
    COMMENCEMENT 2013

The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.

Guest tickets or invitations are not required for admittance.

  3. It is basically 3 paragraphs, that is, 3 vectors. Problem #1: sometimes “COMMENCEMENT 2013” comes back as the top-scoring vector, instead of:
    The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.

and my prompt is “When the Commencement of Cornish College of Arts was planned?”
and the last AI completion call for this question gives me an answer about a college in Maryland in 2013.

So, I tend to think that, like you said, I should just take the text associated with the top-scoring vector and stop there, but what confuses me is all the preaching that it should only be an input to the AI model (BTW, I use GPT-3: text-davinci-003).

EDIT:
Here is the interesting part. When I ask:
What was planned in Benaroya Hall in 2013?

I get the answer from Completion API:
Benaroya Hall is in the heart of downtown Seattle between 2nd and 3rd Avenues, and Union and University Streets. Driving directions can be found at the Benaroya Hall website. The address is 200 University Street, Seattle, WA 98101.

where part of that answer is another paragraph from the document, but the rest of it is not!

Anyway, it just does not seem that embeddings are working at all in this approach…

Given your prompt of

“When the Commencement of Cornish College of Arts was planned?”

I can see ambiguity and word-order issues that may confuse the AI; structurally, the prompt should read something more like “When was the commencement ceremony of the Cornish College of Arts?”

So my question is: have you tried asking the AI to generate the embedding query text from your question text, giving it the prompt “You should generate a semantic search string that would enable accurate retrieval of an embedding vector given the following question: (your question here)”?

See how that works for you.

When I work with situations like this, it is very important to be as specific as possible in the prompt that is executed with the best-scoring text, something like this:

role “system”: Given the context information and not prior knowledge,

answer the user question:

[BEST RANKED TEXT HERE]

role “user”: [user prompt]

1 Like

No, I did not try that. To be honest, I do not understand exactly what you are suggesting here…
Are you asking me to use this as the prompt in a Completion call:

“You should generate a semantic search string that would enable accurate retrieval of an embedding vector given the following question : (your question here)”

where (your question here) is
“When was the commencement ceremony of the Cornish College of Arts?”

So the response to this prompt would then be used as the input for generating the embedding vector, and then I would call Pinecone and take the text associated with the best-scoring vector?

You can see my original question and the exact steps I took, so what exactly do I need to change, and where?

Ok, so you have embedded your data and you want to query that embedded information with a question. I am saying: take that question and pass it to the GPT-3.5 or GPT-4 API and prompt it with “Given the following request, what would be an ideal retrieval prompt for a vector database: {user_prompt}. Please only output the retrieval prompt and nothing else.” The AI will give you a response; take that response and use it as the text you query the embeddings with.
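In code, that extra step is just one Chat Completions call before the embedding call, roughly like this (a C# sketch against the REST API; the prompt wording, model choice, and variable names are illustrative):

```csharp
// Sketch: ask the chat model to rewrite the user's question into a retrieval
// prompt, then use that rewritten text as the input to the Embeddings API.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
var userPrompt = "When the Commencement of Cornish College of Arts was planned?";

using var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

// 1. Query-rewriting call.
var rewriteBody = JsonSerializer.Serialize(new
{
    model = "gpt-3.5-turbo",
    messages = new[]
    {
        new { role = "user", content =
            "Given the following request, what would be an ideal retrieval prompt for a " +
            $"vector database: {userPrompt} Please only output the retrieval prompt and nothing else." }
    }
});
var rewriteResp = await http.PostAsync("https://api.openai.com/v1/chat/completions",
    new StringContent(rewriteBody, Encoding.UTF8, "application/json"));
rewriteResp.EnsureSuccessStatusCode();
using var rewriteDoc = JsonDocument.Parse(await rewriteResp.Content.ReadAsStringAsync());
var retrievalPrompt = rewriteDoc.RootElement.GetProperty("choices")[0]
                          .GetProperty("message").GetProperty("content").GetString();

// 2. Embed the rewritten prompt (instead of the raw question), then query Pinecone as before.
var embedBody = JsonSerializer.Serialize(new { model = "text-embedding-ada-002", input = retrievalPrompt });
var embedResp = await http.PostAsync("https://api.openai.com/v1/embeddings",
    new StringContent(embedBody, Encoding.UTF8, "application/json"));
embedResp.EnsureSuccessStatusCode();
```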

Got it. Basically, you suggest enhancing the prompt with GPT-3/3.5 (I have no access to GPT-4) by putting it through the Completion API first, and only then using the response to query the Pinecone DB. The text associated with the best-scoring vector is then my answer. Well, I am not sure it will make a difference… but I will try it.

I can try that too, although [BEST RANKED TEXT HERE] is the context and not the user question, which could be misunderstood from the “system” content.
Anyway, I will try that too…

1 Like

I tried the following without changing anything in the code:
I rephrased the question:
Please answer concisely and precisely the following question, based on the provided context only and not on the general knowledge. What event was planned in Benaroya Hall in May 2013 in Seattle?

and got the answer:
The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.

*The ceremony will include the traditional procession of graduates, speeches by faculty and student representatives, and the conferring of degrees. All graduates are expected to wear academic regalia (cap and gown). Tickets are not required for admission.*

*Guests should plan to arrive at least 30 minutes prior to the start of the ceremony in order to find seating. Doors open at 3:30 pm. Guests should also be aware that Benaroya Hall has a strict no-food or drink policy, so please plan accordingly.*

We look forward to celebrating this special occasion with you!

It is not exactly what I wanted, which would simply be: The Commencement Ceremony.
It is unfortunately not very concise… but better than before. What I learn from this is that you need a prompt that gives some distinctive hints…

2 Likes

Try putting all of this in a single prompt; from what I've tested, it is the strictest way to get it to reply using only the info provided:

Here is The Information:
“retrieved paragraphs…”

Here is the user’s question:
“What event was planned…”

Answer the user’s question by only rewriting The Information:

1 Like

Thank you, I already figured it out. The “Information” should be the text associated with the best-scoring vector. Then it works well.

1 Like

If you have something you're happy with, and you have a moment and are willing, a quick post of what you did (and maybe a bit of code) for others to follow if they reach the same point as you would be awesome :smiley:

So, I follow the steps that I outlined above, except for the last step where I fed the text to the OpenAI Completion API. Instead:

  1. I changed the model to gpt-3.5-turbo
  2. I took the text that is associated with the top-scoring vector (the Context) and did this:
    [system]: Provide the answer to the question using the Context information only, not the general knowledge.
    [user]:
    What event was planned in Seattle in May 2013?
    Context:
    The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.

The Chat Completions API returned the short answer:
The Commencement ceremony for the Class of 2013
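In raw REST terms that final call is roughly the following (an illustrative C# sketch; my real code goes through a client library, as described below):

```csharp
// Sketch of the final Chat Completions call: system instruction plus question-with-Context.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
var context = "The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at " +
              "Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last " +
              "until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.";
var question = "What event was planned in Seattle in May 2013?";

using var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

var body = JsonSerializer.Serialize(new
{
    model = "gpt-3.5-turbo",
    messages = new object[]
    {
        new { role = "system", content = "Provide the answer to the question using the Context information only, not the general knowledge." },
        new { role = "user", content = $"{question}\nContext:\n{context}" }
    }
});

var response = await http.PostAsync("https://api.openai.com/v1/chat/completions",
    new StringContent(body, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(doc.RootElement.GetProperty("choices")[0]
    .GetProperty("message").GetProperty("content").GetString());
```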

All of my code is in C#, and I use a complex open-source library for the OpenAI API. I adopted it a while ago when I was working on the two products I launched previously:

In retrospect, it probably would have been better if I had used my own code, since the library was not fully baked in Dec 2022/Jan 2023 and I had to change and simplify it, so it is not really recognizable or readable at the moment… and I'd be ashamed to share it… my project of chatting with private documents is experimentation only at this point…

What makes me mad is that I am trying the same thing with the Azure OpenAI API and I am persistently getting “Unauthorized”, even though my API key and URL are correct.

To be explicit: your original query didn’t work because your prompt did not relate in any way to the context from the database.

I will demonstrate with 3.5:

Your context was:

The Commencement ceremony for the Class of 2013 will be held on Sunday, May 12, 2013 at Benaroya Hall in Seattle, Washington. The ceremony begins promptly at 4 pm and will last until approximately 6 pm. A reception in the main auditorium will immediately follow the ceremony.

Guest tickets or invitations are not required for admittance.

and your query was:

When the Commencement of Cornish College of Arts was planned?

So the completion from 3.5, expectedly, is:

Based on the information provided, the Commencement ceremony for the Class of 2013 at Cornish College of the Arts was not mentioned.

The reason your updated query worked:

Please answer concisely and precisely the following question, based on the provided context only and not on the general knowledge. What event was planned in Benaroya Hall in May 2013 in Seattle?

Because it does relate to the context.

No mystery or magic necessary here. Just examine what you are putting into the prompt.

No, it is because I did not include the Context at all in the last completion call…
Anyway, I resolved it a few days ago.

1 Like