Chat Completion Architechture

SomebodySysop · April 4, 2023, 8:03am

I watched this video GPT-4 & LangChain Tutorial: How to Chat With A 56-Page PDF Document (w/Pinecone) - YouTube

Am curious about this chat completion architecture:

I still don’t understand why, in their architecture, they:

“chat history + new question → submit to gpt to generate new standalone question → vector search with new standalone question”

instead of “chat history + new question → vector search”.

This creates three OpenAI api calls per query (creation of standalone question, vectorization of question and standalone question + context documents).

And, I’m also not clear on just what goes into the chat history? The original questions, context documents submitted, standalone questions created and responses OR just standalone questions and responses?

Could someone help me out here?

AgusPG · April 4, 2023, 10:56am

I explained this idea of the stand-alone question here. Glad to see that it’s being used in other places!

You need to “de-contextualize” the question in a conversational context to do a proper semantic search. Imagine the conversation:

User: What is the capital of Spain?
Assistant: It is Madrid.
User: How many people do they live in there?

If you embed the question “How many people do they live in there?” and conduct the semantic search, you won’t retrieve documents that talk specifically about the population of Madrid. This is because it is a “contextual” question: it only makes sense in the context of the on-going conversation. You can solve this by using a module that de-contextualizes the contextual question into a “stand-alone” one. Something like “What is the population of Madrid?”

You can easily achieve this with an additional call to OpenAI. In my case, the “chat history” is only composed by the previous QAs: no documents. You don’t need the supporting documents to reformulate the contextual question into a stand-alone one: previous utterances are enough. In fact, I only send three QAs pairs and that’s usually more than enough to produce the stand-alone question.

Hope it helps

SomebodySysop · April 4, 2023, 3:47pm

Yes! Thank you! This makes perfect sense.

So, my chat history would be:

1st question
1st response

2nd standalone question
2nd response

etc…

Many thanks for helping me to understand this, finally!

AgusPG · April 4, 2023, 4:01pm

Of course. Very happy to help!

anon10827405 · April 4, 2023, 4:09pm

Great explanation.

Sometimes a conversation of 8 message pairs is required to fully contextualize and answer the question.
Sometimes a conversation of 12 pairs only needs the last message.

AgusPG · April 4, 2023, 4:19pm

Good insights. Obviously, in the answering stage you try to send a huge portion of the on-going conversation (in combination with supporting docs and the contextual question itself). As much info as you can.

But in order of de-contextualizing only, in my experience usually a small number of utterances is more than enough. This is because each one of these interactions already have a lot of context about the current conversational topic. I hardly ever (or never) run into situations where, in order to fully characterize the stand-alone question, I need to go back to more than 5 previous utterances.

Also: you need to explicitly instruct the module to produce the same question if the “contextual” question is already a “stand-alone” one, that can be understood without the previous context.

Anyways, as this “de-contextualizer” only has previous QA as input, you have a lot of tokens to play with . You can perfectly pass a lot of previous utterances if your conversational context needs so.

SomebodySysop · April 26, 2023, 10:07am

I have not done this yet. But, I have completed the coding of my chat completion program, and it is working like a charm. I am chatting with my documents, and loving it! It’s really remarkable how little information, as you state, is necessary to maintain the context in chat history.

As I sit here in utter amazement at this accomplishment, I just wanted to thank you again for your assistance in helping me understand this process.

AgusPG · April 26, 2023, 4:42pm

Thank you for such positive feedback! @SomebodySysop. I’m really happy to hear that this is being useful to somebody else!

SomebodySysop · May 20, 2023, 12:30am

This “standalone question” is really working super-well for me. I’m working on a project to create a semantic search module for the Drupal CMS, and did this little query demonstration which is coded in large part based on what I learned in this issue thread.

mirceacristea · May 16, 2024, 2:27pm

I understand that the standalone question is being used for similarity search but is not clear to me if that is the question used to actually answer or is it the original question.

I had mixt results using each. How should this be done?

SomebodySysop · May 16, 2024, 5:02pm

I create a “concept” question from the standalone question for similarity search (context retrieval). I then use standalone question + context documents to submit to LLM for response.

The way it should be done is the way that works best for you.

Now that we are getting much larger input context windows, I may start submitting the full chat history along with original question.

Topic		Replies	Views
OpenAI chatcompletion prompt for inferring user intent from chat history API gpt-4 , gpt-4o-mini	2	488	September 20, 2024
Context generation for chat based Q&A bot Prompting	41	22179	December 13, 2023
How to construct the prompt for a standalone question? API	2	7086	May 20, 2023
How to pass conversation history back to the API API chat-completion	14	42476	April 1, 2024
Context aware context for follow-up question Prompting embeddings , gpt-4 , api	13	9147	October 16, 2024

Chat Completion Architechture

Related topics