Should I use the completion or the chat endpoint when building a chatbot with custom data and a vector database?

I am building a chat-with-PDF app. I use Pinecone to store the PDF's text with its embeddings and to run a semantic search that retrieves some passages, so that the GPT model can generate an answer based on those passages plus the user's input.

So should I use the chat endpoint or the completion endpoint?
If I use the chat endpoint, how do I feed the GPT model the text retrieved from Pinecone?
If I use the completion endpoint, how should I structure the prompt? I have seen suggestions that start with a fixed text telling the model that what follows is the contextual information and the previous conversation, but I am wondering whether this fixed text might get cut off when the tokens exceed the maximum.

The endpoint depends upon which model you are using.

The completion endpoint is used with older models, which are going to be deprecated in the near future. The suggestion you mentioned is correct, though, and is valid for the chat endpoint as well.

While the 3.5 models and above use the chat.create endpoint, their behaviour is essentially similar: given the contextual information and the previous conversation as input, they continue the chat based on the information you provided in the system message.

With the chat.create endpoint there is a parameter for the conversation, but there is no parameter for the contextual text retrieved from the vector database.

So how do I deal with this contextual text?

You include your additional context in the system or user prompt. This is LangChain's prompt:

System: Use the following pieces of context to answer the users question. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
<context>

Human: <users question>
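
As a minimal sketch of how that template could be filled in from code: the names `SYSTEM_TEMPLATE` and `build_system_message` are made up for illustration, and `chunks` is assumed to be a plain list of strings already pulled out of your Pinecone query results.

```python
# Template text copied from the prompt above; {context} is the only placeholder.
SYSTEM_TEMPLATE = (
    "Use the following pieces of context to answer the users question. \n"
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n"
    "----------------\n"
    "{context}"
)


def build_system_message(chunks):
    """Join the retrieved text chunks and place them in the system prompt.

    `chunks` is assumed to be a list of strings (hypothetical input shape).
    """
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return {"role": "system", "content": SYSTEM_TEMPLATE.format(context=context)}
```

The returned dict would go first in the `messages` list, followed by the conversation turns.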

This means that the array of messages I send with every call will include one message with “role”: “system”, and the other messages will have “role”: “user” or “role”: “assistant”. So it will be something like this:

"messages": [
        {
            "role": "system",
            "content": "Use the following pieces of context to answer the users question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n<context>"
        },
        {
            "role": "user",
            "content": "Hello!"
        },
        {
            "role": "assistant",
            "content": "How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the AI?"
        },
        {
            "role": "assistant",
            "content": "The AI is the artificial intelligence"
        },
        {
            "role": "user",
            "content": "What is it used for?"
        }
    ]

Right?

Yes, potentially. You can also experiment with including the relevant context in the user message along with their question (older models didn’t handle system messages as well). You likely want to include the context only for the latest question to minimize tokens and confusion. Keep in mind you can change the “history” of the conversation however you want, and that the request you make to OpenAI doesn’t need to mirror your chat UI one-for-one. It takes some experimentation with your use case, and running through realistic examples, to see what produces the most relevant results with your data.
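
A rough sketch of that idea, with assumptions stated up front: `build_request_messages` and `max_history` are hypothetical names, the "Context:/Question:" wording is just one possible layout, and the trimming counts messages rather than tokens (a real token budget would need a tokenizer). Because the system message is always placed first and only older turns are dropped, the fixed instruction text is never the part that gets cut.

```python
def build_request_messages(system_prompt, history, question, chunks, max_history=6):
    """Build the list sent to the chat endpoint for one turn.

    history: prior {"role", "content"} dicts, stored WITHOUT retrieved context.
    chunks:  retrieved passages for the latest question only.
    """
    context = "\n\n".join(chunks)
    latest = {
        "role": "user",
        "content": f"Context:\n{context}\n\nQuestion: {question}",
    }
    # Keep only the most recent turns; older ones are dropped first.
    recent = history[-max_history:] if max_history > 0 else []
    return [{"role": "system", "content": system_prompt}, *recent, latest]
```

The stored history is left untouched, so what the user sees in the chat UI and what is sent to the model can differ.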


You can also just insert a user or assistant message containing the retrieved text before the latest question, without recording it in the chat history (because of its size).

system: You are a sports bot. You have a knowledge base of sports statistics.

user: Hi, do you know about hockey?

assistant: Sure, ask away.

assistant: Retrieved from knowledge base for user:
Wayne Gretzky - Known as “The Great One,” Wayne Gretzky is widely considered one of the greatest hockey players of all time. He holds the record for the most power play goals in NHL history.
Dave Andreychuk - Dave Andreychuk had a remarkable career as a power forward and is known for his prolific scoring on the power play. He ranks second in all-time power play goals.
Brett Hull - Brett Hull, the son of hockey legend Bobby Hull, was a prolific goal scorer and had an incredible knack for scoring on the power play. He is third in all-time power play goals.

user: who are the top three all-time leaders in power play goals?
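
In code, this pattern might look something like the sketch below. The function names are hypothetical, and `retrieved` is assumed to be a list of strings from the knowledge-base lookup. The retrieved text goes into the request only, while the stored history records just the question and the answer.

```python
def with_retrieved_context(history, question, retrieved):
    """Return the messages for one request; `history` itself is not modified."""
    injected = {
        "role": "assistant",
        "content": "Retrieved from knowledge base for user:\n" + "\n".join(retrieved),
    }
    # Order: earlier turns, then the injected context, then the latest question.
    return [*history, injected, {"role": "user", "content": question}]


def record_turn(history, question, answer):
    """Store only the question and the answer, never the injected context."""
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": answer})
```

On the next turn, a fresh lookup would be injected the same way, so the large retrieved text never accumulates in the history.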


Hi Salemmo, I was wondering whether you were able to do it?

As mentioned in the answers above, use the chat completion endpoint. You can then insert the custom contextual text into the content of a “system”, “assistant”, or “user” message. I put it in the “system” message.

Alternatively, you may now consider building an assistant that can be given the file directly, with no need for embeddings.