I am in the middle of a PoC to upgrade our v1 RAG chatbot from the Chat Completions API to the Agents SDK with access to tools (vector search, etc.). I am wondering if there are tools to port over the existing LangChain implementation, which also generates citations. Citations are a huge part of our chatbot and I do not see an easy migration path. Further, how do I keep my existing LangChain implementation within the Agents SDK? Our v1 is live in prod, so v2 has to have the same functionality at a minimum. I'd appreciate any pointers from folks who are in the middle of an upgrade to the newer Agents SDK for RAG chatbots.
Hi there, thanks for sharing this. I’m not using LangChain or the OpenAI Agents SDK in my RAG system. Instead, I built a fully custom stack. I describe the approach here:
Building first RAG system - #3 by lucmachine
I chose not to use LangChain or the SDK because:
- I wanted full control over chunking, embedding, and metadata handling
- My documents are complex (GHG protocols, legal texts), so I built regex-aware semantic chunkers
- LangChain and the SDK add layers of abstraction that didn’t match my transparency and citation needs
I think a solid chunking strategy is needed to get a better chatbot. Some PDFs are really bad source documents. I'm also playing around with the idea of adding tags to my chunks by having the LLM propose them.
- Tags = “what is this about?”
- Citation = “where did this come from?”
That said, I don’t have direct experience with the SDK upgrade path you’re on, but I do have strong prompting experience and wanted to offer a concept that might help.
Start of LLM Transmission: →
Suggestion: Bridge LangChain-style Citations into Agents SDK
If citations are central to your RAG chatbot, here’s a potential solution path:
1. Extract and Store Rich Metadata Per Chunk
If not already, include the following metadata in your vector database (e.g., Pinecone, pgvector): doc_title, section_heading, chunk_index, source_url, and so on.
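A rough sketch (Python) of what one chunk record with that metadata could look like before upserting; every field value here is made up for illustration, and the Pinecone call is only indicative:

```python
embedding = [0.0] * 1536  # placeholder - use the real embedding for this chunk

chunk_record = {
    "id": "ghg-protocol-0042",   # stable chunk id
    "values": embedding,          # the vector itself
    "metadata": {
        "doc_title": "GHG_Protocol_Corporate_Standard.pdf",
        "section_heading": "3.2 Setting Operational Boundaries",
        "chunk_index": 42,
        "page": 12,
        "source_url": "https://example.com/ghg-protocol.pdf",
    },
}

# With Pinecone's client this would be roughly index.upsert(vectors=[chunk_record]);
# with pgvector the metadata would typically live in a JSONB column instead.
```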
2. Add Metadata Into Retrieved Context
When your retrieval tool (whether from LangChain or SDK Tools API) returns relevant chunks, inject them into the context with metadata visible. For example:
[Source: NAME_OF_SOURCE_DOCUMENT.pdf | Section 3.2 | Page 12]
Carbon credits must be tracked independently to avoid double counting...
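Here's a minimal helper that builds that block, assuming each retrieved chunk is a dict with "text" and "metadata" keys; adapt the keys to whatever your retriever actually returns:

```python
def format_chunks_for_context(chunks: list[dict]) -> str:
    """Prepend a visible citation header to each retrieved chunk."""
    blocks = []
    for c in chunks:
        m = c["metadata"]
        header = f"[Source: {m['doc_title']} | {m['section_heading']} | Page {m['page']}]"
        blocks.append(f"{header}\n{c['text']}")
    return "\n\n".join(blocks)  # one block per chunk, separated by blank lines
```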
3. Prompt the Model Explicitly to Preserve Citations
Add prompt instructions like:
“For each factual answer, include the citation of the document and section. Use the format:
[Source: {{doc_title}}, Section: {{section_heading}}, Page: {{page}}]”
This primes the LLM to include citations in the final response.
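For example, on a plain Chat Completions call (current openai Python client assumed; the model name is only an example), that instruction and the formatted context from the sketch above could be combined like this:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "Answer only from the provided context. For each factual answer, include the "
    "citation of the document and section, using the format: "
    "[Source: {doc_title}, Section: {section_heading}, Page: {page}]"
)

def answer(question: str, context_block: str, model: str = "gpt-4o-mini") -> str:
    # context_block could be the output of format_chunks_for_context() above
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": f"Context:\n{context_block}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```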
4. Middleware Post-Validation (Optional)
Optionally, check citations at generation time by matching the quoted passage against the original chunk using fuzzy matching or hashing.
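A very rough version of that check, using difflib from the standard library (the threshold is arbitrary and should be tuned on your own data; something like rapidfuzz would be faster and smarter):

```python
from difflib import SequenceMatcher

def quote_is_grounded(quote: str, chunk_text: str, threshold: float = 0.85) -> bool:
    """Does the cited quote appear (nearly) verbatim somewhere in the source chunk?

    Uses the longest common block between quote and chunk as a cheap proxy.
    """
    if not quote:
        return False
    sm = SequenceMatcher(None, quote.lower(), chunk_text.lower())
    match = sm.find_longest_match(0, len(quote), 0, len(chunk_text))
    return match.size / len(quote) >= threshold
```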
5. Use the LLM to Create Semantic Tags for Chunks
Tags help categorize chunks by topic, theme, or function (e.g. ["scope 1", "baseline", "reporting year", "offsets"]) and can be used to improve filtering, clustering, or search refinement.
A. During Ingestion:
For each chunk, before or after creating the embedding:
- Send the chunk to the LLM with a prompt like:
“Given this text, return 3 to 5 concise tags that describe its main concepts or topics.”
Include something like this (a sketch of the ingestion-time call itself follows below):

```json
{
  "chunk_text": "Emissions from owned vehicles should be reported as Scope 1. Companies must include all mobile combustion sources in the base year inventory...",
  "response": ["scope 1", "mobile combustion", "base year"]
}
```
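Here's roughly how that call could look per chunk (current openai Python client assumed; model name and prompt wording are only examples, and the JSON parsing is deliberately defensive):

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

TAG_PROMPT = (
    "Given this text, return 3 to 5 concise tags that describe its main "
    "concepts or topics. Respond with a JSON array of strings only."
)

def tag_chunk(chunk_text: str, model: str = "gpt-4o-mini") -> list[str]:
    resp = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": TAG_PROMPT},
            {"role": "user", "content": chunk_text},
        ],
    )
    try:
        tags = json.loads(resp.choices[0].message.content)
    except (TypeError, json.JSONDecodeError):
        tags = []  # fall back to no tags rather than failing the ingestion run
    return [t.strip().lower() for t in tags if isinstance(t, str)]
```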
B. Store Those Tags as Metadata
Tags would go into a tags column in your vector DB (e.g., pgvector or Pinecone), as an array or JSON field.
C. Use Tags for Filtering or Grouping
When doing retrieval or displaying answers, you could (see the sketch after this list):
- Filter by tag (e.g. “only show Scope 1 content”)
- Group search results by tag category
- Show tags to the user as context (“This answer came from content tagged: Scope 1, baseline…”)
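A pgvector-flavoured sketch of tag-filtered retrieval; the table and column names (chunks, tags text[], embedding vector) are assumptions, not your real schema, and with Pinecone the equivalent would be a metadata filter on the query:

```python
import psycopg  # psycopg 3

SQL = """
SELECT id, text, metadata
FROM chunks
WHERE tags && %(tags)s                          -- chunk carries at least one requested tag
ORDER BY embedding <=> %(query_vec)s::vector    -- pgvector cosine-distance operator
LIMIT 5;
"""

def search_with_tags(conn: psycopg.Connection, query_vec: list[float], tags: list[str]):
    """Return the 5 nearest chunks that share at least one of the given tags."""
    with conn.cursor() as cur:
        cur.execute(SQL, {"tags": tags, "query_vec": str(query_vec)})
        return cur.fetchall()
```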
Tools to Use
| Tool | Role |
|---|---|
| LangChain | Can wrap the LLM call to generate tags using LLMChain or PromptTemplate |
| OpenAI SDK | Can embed a tool or step in your ingestion pipeline to call GPT and label each chunk |
| Your own ingestion script | Can just make a single call to openai.ChatCompletion.create() and insert the result into the chunk metadata |
End of LLM transmission
I hope this helps!
Luc
@lucmachine - Thank you for sharing the details. We have all the functionality in place; today we generate citations using LangChain's runnables rather than prompt engineering - the former is slightly more deterministic, as we feed each doc with the metadata that matters to us. My question is really how I "upgrade" this to the newer Agents SDK framework. Very primitively, I will have to convert everything to a tool (aka function call) and run evals to perform regression testing. Perhaps what I am looking for in the long run is OOB "tools" for generating citations, memory/state management, and streaming. From what I gather, the Responses API does all of this natively, which is a huge change over the existing Chat Completions API. For reference, here are my next steps in this migration PoC:
- Create the right tools - vector search, citation generation, history - by wrapping the existing LangChain functions
- Orchestrate the knowledge bot using the Agents SDK and those tools (rough sketch below)
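Roughly what I have in mind for the wrapping step (untested sketch; retrieve_with_citations stands in for our existing LangChain runnable, and the metadata keys are just examples):

```python
from agents import Agent, Runner, function_tool  # pip install openai-agents

def retrieve_with_citations(query: str) -> list[dict]:
    """Placeholder for the existing LangChain retrieval; returns chunks plus metadata."""
    ...

@function_tool
def vector_search(query: str) -> str:
    """Search the knowledge base and return chunks with citation metadata attached."""
    docs = retrieve_with_citations(query)
    return "\n\n".join(
        f"[Source: {d['doc_title']} | {d['section_heading']} | Page {d['page']}]\n{d['text']}"
        for d in docs
    )

rag_agent = Agent(
    name="Knowledge bot",
    instructions=(
        "Answer using the vector_search tool. For each factual claim, cite the source "
        "in the format [Source: doc_title, Section: section_heading, Page: page]."
    ),
    tools=[vector_search],
)

# result = Runner.run_sync(rag_agent, "How should Scope 1 emissions be reported?")
# print(result.final_output)
```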