RAG needs more detail source information

gyutae.q · November 7, 2023, 5:59am

As I read in document, now assistant chatbot have retrival system. I’ve tried to use them, but they didn’t find which file and which part they referenced. I tested this gpt-4-turbo and new gpt-3.5 both of them didn’t know which file they used. What I expected was, file name and file page. I hope there is setup option how to split the corpus, and metadata.

peacedude · November 7, 2023, 7:09am

Great question. To include metadata like file names and page numbers, you can format system messages or custom prompts with specific instructions for the model. This might involve defining placeholders within the prompt that tell the model to look for and include this metadata in its response. For example, a system message could be:

“Retrieve the following details: [file name], [page number].”

In the context of function calls, these are a feature of the OpenAI API that allows you to execute a predefined function as part of the model’s response. You can create a function that, when called, accesses the document’s metadata and returns the file name and page number alongside the response. The documentation provides information on how to structure these calls and integrate them into your application.

It’s important to note that function calls would need to be tailored to your specific data structure and retrieval needs. They would also require the metadata to be structured and accessible in a way that the function can reliably retrieve it.

Here is a generic example of how a system message with placeholders for metadata might look:

{
“system”: “fetch_document_details”,
“data”: {
“file_name_placeholder”: “{file_name}”,
“page_number_placeholder”: “{page_number}”
}
}

For a function call, you might define a function in your application that accesses and retrieves file metadata. When the chatbot needs to provide this information, it would issue a function call within the prompt. Here’s a simplified pseudo-code example:

def get_file_metadata(document_id):
# Placeholder function to get file metadata based on the document ID
metadata = database.get_document_metadata(document_id)
return {
“file_name”: metadata.file_name,
“page_number”: metadata.page_number
}

Example of calling the function in the chatbot prompt

metadata = get_file_metadata(‘doc123’)
response = chatbot.prompt(f"File name: {metadata[‘file_name’]}, Page: {metadata[‘page_number’]}")

This is a high-level example and would need to be adjusted for the specific programming language and data access methods that you are using.

I hope this helps. All the best

gyutae.q · November 7, 2023, 8:43am

peacedude:

Great question. To include metadata like file names and page numbers, you can format system messages or custom prompts with specific instructions for the model. This might involve defining placeholders within the prompt that tell the model to look for and include this metadata in its response. For example, a system message could be:

“Retrieve the following details: [file name], [page number].”

In the context of function calls, these are a feature of the OpenAI API that allows you to execute a predefined function as part of the model’s response. You can create a function that, when called, accesses the document’s metadata and returns the file name and page number alongside the response. The documentation provides information on how to structure these calls and integrate them into your application.

It’s important to note that function calls would need to be tailored to your specific data structure and retrieval needs. They would also require the metadata to be structured and accessible in a way that the function can reliably retrieve it.

Here is a generic example of how a system message with placeholders for metadata might look:

{
“system”: “fetch_document_details”,
“data”: {
“file_name_placeholder”: “{file_name}”,
“page_number_placeholder”: “{page_number}”
}
}

For a function call, you might define a function in your application that accesses and retrieves file metadata. When the chatbot needs to provide this information, it would issue a function call within the prompt. Here’s a simplified pseudo-code example:

def get_file_metadata(document_id):

Placeholder function to get file metadata based on the document ID

metadata = database.get_document_metadata(document_id)
return {
“file_name”: metadata.file_name,
“page_number”: metadata.page_number
}

Example of calling the function in the chatbot prompt

metadata = get_file_metadata(‘doc123’)
response = chatbot.prompt(f"File name: {metadata[‘file_name’]}, Page: {metadata[‘page_number’]}")

This is a high-level example and would need to be adjusted for the specific programming language and data access methods that you are using.

I hope this helps. All the best

Oh cool, Where can I find more about RAG setup information?

peacedude · November 7, 2023, 10:30am

Happy it helped. The OpenAI documentation does not go into RAG configuration parameters for developers…yet (cross my fingers) so I believe that for now this is all handled by OpenAI. The OpenAI documentation is really good and will get better as more features, projects and use cases roll out OpenAI Platform

gyutae.q · November 8, 2023, 11:04am

Thanks for your kind explain, however it doesn’t work when I put like this. It doesn’t give me source information that I wanted.
And anyone have information how openai price when they retrieve, I couldn’t find any information how they use RAG, exactly what prompt was inputted even I checked run step result.

peacedude · November 25, 2023, 2:01pm

I did do some further checking and RAG (Called Retrieval) setup is in fact handled automatically after you select retrieval and upload your files to OpenAI if you are building and assistant in the OpenAI playground type UI.

If you are buiding a GPT in ChatGPT Plus the setup is similar with regard to selecting the Retrieval (RAG) check box. You can also use the Tool Retrieval code if you are hard coding your application. I have attached a screenshot of how to set it up in OpenAi as well as a link to the code. All the best with your application! OpenAi Tools Retrieval|397x203
https://platform.openai.com/docs/assistants/tools/knowledge-retrieval

victory_1128 · January 24, 2024, 8:51am

Click logs in the upper right corner to open the log window and there will be many detailed execution steps.It depends on whether it has been completed, and it seems that the running results must be seen in the return results on the right?

Topic		Replies	Views
Assistant - how to know if retrieval is being used? Prompting gpt-4 , assistants , assistants-api	6	1893	December 31, 2023
Assistant Retrieval method and RAG (are they doing same?) API codex , gpt-4 , gpt-35-turbo , chatgpt , api	7	7426	January 3, 2025
Does Assistant API's file search tool use RAG by default? API assistants-api , file-search	2	435	November 20, 2024
Optimal instructions to get Assistant with Retrieval Tool to Return all the Relevant Results Prompting gpt-4 , rag , assistants	7	8430	February 10, 2024
How to use RAG properly and what types of query it is good at? GPT builders chatgpt	8	17651	June 17, 2024

RAG needs more detail source information

Example of calling the function in the chatbot prompt

Related topics