Hi there, thanks for sharing this. I’m not using LangChain or the OpenAI Agents SDK in my RAG system. Instead, I built a fully custom stack. I describe the approach here:
Building first RAG system - #3 by lucmachine
I chose not to use LangChain or the SDK because:
- I wanted full control over chunking, embedding, and metadata handling
- My documents are complex (GHG protocols, legal texts), so I built regex-aware semantic chunkers
- LangChain and the SDK add layers of abstraction that didn’t match my transparency and citation needs
I think a solid chunking strategy is needed to get a better chatbot. Some PDFs are really bad source documents. I’m also playing around with the idea of adding tags to my chunks by having the LLM propose them.
- Tags = “what is this about?”
- Citation = “where did this come from?”
That said, I don’t have direct experience with the SDK upgrade path you’re on, but I do have strong prompting experience and wanted to offer a concept that might help.
Start of LLM Transmission: →
Suggestion: Bridge LangChain-style Citations into Agents SDK
If citations are central to your RAG chatbot, here’s a potential solution path:
1. Extract and Store Rich Metadata Per Chunk
If not already, include the following metadata in your vector database (e.g., Pinecone, pgvector): `doc_title`, `section_heading`, `chunk_index`, `source_url`, and so on.
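Here is a minimal ingestion sketch of what that could look like in Python, assuming the OpenAI embeddings client; the field names mirror the list above, and the record shape would be adapted to whatever upsert call your vector store uses:

```python
# Minimal sketch: attach rich metadata to each chunk at ingestion time.
# The record shape is an assumption; adapt it to your vector store's upsert API.
from openai import OpenAI

client = OpenAI()

def build_chunk_record(chunk_text, doc_title, section_heading, chunk_index, source_url, page):
    embedding = client.embeddings.create(
        model="text-embedding-3-small",  # any embedding model works; this one is an example
        input=chunk_text,
    ).data[0].embedding
    return {
        "id": f"{doc_title}-{chunk_index}",
        "values": embedding,
        "metadata": {
            "doc_title": doc_title,
            "section_heading": section_heading,
            "chunk_index": chunk_index,
            "source_url": source_url,
            "page": page,
            "text": chunk_text,  # keep the raw text so it can be shown back with its citation
        },
    }
```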
2. Add Metadata Into Retrieved Context
When your retrieval tool (whether from LangChain or SDK Tools API) returns relevant chunks, inject them into the context with metadata visible. For example:
[Source: NAME_OF_SOURCE_DOCUMENT.pdf | Section 3.2 | Page 12]
Carbon credits must be tracked independently to avoid double counting...
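As a rough sketch (assuming each retrieved chunk carries the metadata dict from step 1), the context string could be assembled like this:

```python
# Sketch: build the context string with a visible source header per chunk.
def format_context(chunks: list[dict]) -> str:
    blocks = []
    for chunk in chunks:
        meta = chunk["metadata"]
        header = f"[Source: {meta['doc_title']} | {meta['section_heading']} | Page {meta['page']}]"
        blocks.append(f"{header}\n{meta['text']}")
    return "\n\n".join(blocks)
```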
3. Prompt the Model Explicitly to Preserve Citations
Add prompt instructions like:
“For each factual answer, include the citation of the document and section. Use the format:
[Source: {{doc_title}}, Section: {{section_heading}}, Page: {{page}}]”
This primes the LLM to include citations in the final response.
4. Middleware Post-Validation (Optional)
Optionally, check citations at generation time by matching the quoted passage against the original chunk using fuzzy matching or hashing.
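A minimal fuzzy-matching sketch using only the standard library (rapidfuzz or hashing would also work); the 0.85 threshold is an arbitrary starting point:

```python
# Sketch: verify that a quoted passage in the answer really appears in the retrieved chunk.
from difflib import SequenceMatcher

def citation_is_supported(quoted: str, chunk_text: str, threshold: float = 0.85) -> bool:
    quoted, chunk_text = quoted.lower(), chunk_text.lower()
    if quoted in chunk_text:
        return True
    # Slide a quote-sized window over the chunk and keep the best similarity ratio.
    window = len(quoted)
    step = max(1, window // 4)
    best = 0.0
    for start in range(0, max(1, len(chunk_text) - window + 1), step):
        ratio = SequenceMatcher(None, quoted, chunk_text[start:start + window]).ratio()
        best = max(best, ratio)
    return best >= threshold
```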
5. Use LLM to Create Semantic Tags for Chunks
Tags help categorize chunks by topic, theme, or function (e.g. `["scope 1", "baseline", "reporting year", "offsets"]`) and can be used to improve filtering, clustering, or search refinement.
A. During Ingestion:
For each chunk, before or after creating the embedding:
- Send the chunk to the LLM with a prompt like:
“Given this text, return 3 to 5 concise tags that describe its main concepts or topics.”
Include something like:
```json
{
  "chunk_text": "Emissions from owned vehicles should be reported as Scope 1. Companies must include all mobile combustion sources in the base year inventory...",
  "response": ["scope 1", "mobile combustion", "base year"]
}
```
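The tagging call itself could look like this (a sketch only; the model name, prompt wording, and the assumption that the model returns clean JSON are all tunable):

```python
# Sketch: ask the model for 3-5 tags per chunk and parse them as a JSON array.
import json
from openai import OpenAI

client = OpenAI()

def tag_chunk(chunk_text: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any small, inexpensive model is fine for tagging
        temperature=0,
        messages=[
            {"role": "system", "content": (
                "Given this text, return 3 to 5 concise tags that describe its "
                "main concepts or topics, as a JSON array of lowercase strings."
            )},
            {"role": "user", "content": chunk_text},
        ],
    )
    # Assumes the model complies with the JSON-array instruction; add error handling in practice.
    return json.loads(response.choices[0].message.content)
```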
B. Store Those Tags as Metadata
Tags would go into a `tags` column in your vector DB (e.g., pgvector or Pinecone), as an array or JSON field.
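For pgvector, one possibility (the table and column names here are just placeholders) is a JSONB column updated after tagging, e.g. with psycopg2:

```python
# Sketch: persist tags as a JSONB column alongside the chunk's embedding row.
import json
import psycopg2

conn = psycopg2.connect("dbname=rag user=postgres")  # placeholder connection string

def store_tags(chunk_id: str, tags: list[str]) -> None:
    with conn, conn.cursor() as cur:
        cur.execute(
            "UPDATE chunks SET tags = %s::jsonb WHERE id = %s",  # 'chunks' and 'tags' are assumed names
            (json.dumps(tags), chunk_id),
        )
```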
C. Use Tags for Filtering or Grouping
When doing retrieval or displaying answers, you could (see the query sketch after this list):
- Filter by tag (e.g. “only show Scope 1 content”)
- Group search results by tag category
- Show tags to the user as context (“This answer came from content tagged: Scope 1, baseline…”)
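A retrieval-time filter could then combine the tag check with the similarity search, for example against the same assumed pgvector table from step B:

```python
# Sketch: restrict a pgvector similarity search to chunks carrying a given tag.
def search_with_tag(cur, query_embedding: list[float], tag: str, k: int = 5):
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    cur.execute(
        """
        SELECT id, text, tags
        FROM chunks
        WHERE tags ? %s                     -- JSONB operator: tag present in the array
        ORDER BY embedding <=> %s::vector   -- pgvector cosine-distance operator
        LIMIT %s
        """,
        (tag, vector_literal, k),
    )
    return cur.fetchall()
```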
Tools to Use
| Tool | Role |
|---|---|
| LangChain | Can wrap the LLM call to generate tags using `LLMChain` or `PromptTemplate` |
| OpenAI SDK | Can embed a tool or step in your ingestion pipeline to call GPT and label chunks |
| Your own ingestion script | Can make a single call to `openai.ChatCompletion.create()` and insert the result into your chunk metadata |
End of LLM transmission
I hope this helps!
Luc