Hi there, thanks for sharing this. I’m not using LangChain or the OpenAI Agents SDK in my RAG system. Instead, I built a fully custom stack. I describe the approach here:
Building first RAG system - #3 by lucmachine
I chose not to use LangChain or the SDK because:
- I wanted full control over chunking, embedding, and metadata handling
- My documents are complex (GHG protocols, legal texts), so I built regex-aware semantic chunkers
- LangChain and the SDK add layers of abstraction that didn’t match my transparency and citation needs
I think a solid chunking strategy is needed to get a better chatbot. Some PDFs are really bad source documents. I’m also playing around with the idea of adding tags to my chunks by having the LLM propose them.
- Tags = “what is this about?”
- Citation = “where did this come from?”
That said, I don’t have direct experience with the SDK upgrade path you’re on, but I do have strong prompting experience and wanted to offer a concept that might help.
Start of LLM Transmission: →
Suggestion: Bridge LangChain-style Citations into Agents SDK
If citations are central to your RAG chatbot, here’s a potential solution path:
1. Extract and Store Rich Metadata Per Chunk
If not already, include the following metadata in your vector database (e.g., Pinecone, pgvector): `doc_title`, `section_heading`, `chunk_index`, `source_url`, and so on.
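Here is a minimal ingestion sketch of what that could look like in Python, assuming the OpenAI embeddings client; the field names mirror the list above, and the record shape would be adapted to whatever upsert call your vector store uses:

```python
# Minimal sketch: attach rich metadata to each chunk at ingestion time.
# The record shape is an assumption; adapt it to your vector store's upsert API.
from openai import OpenAI

client = OpenAI()

def build_chunk_record(chunk_text, doc_title, section_heading, chunk_index, source_url, page):
    embedding = client.embeddings.create(
        model="text-embedding-3-small",  # any embedding model works; this one is an example
        input=chunk_text,
    ).data[0].embedding
    return {
        "id": f"{doc_title}-{chunk_index}",
        "values": embedding,
        "metadata": {
            "doc_title": doc_title,
            "section_heading": section_heading,
            "chunk_index": chunk_index,
            "source_url": source_url,
            "page": page,
            "text": chunk_text,  # keep the raw text so it can be shown back with its citation
        },
    }
```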
2. Add Metadata Into Retrieved Context
When your retrieval tool (whether from LangChain or SDK Tools API) returns relevant chunks, inject them into the context with metadata visible. For example:
[Source: NAME_OF_SOURCE_DOCUMENT.pdf | Section 3.2 | Page 12]
Carbon credits must be tracked independently to avoid double counting...
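As a rough sketch (assuming each retrieved chunk carries the metadata dict from step 1), the context string could be assembled like this:

```python
# Sketch: build the context string with a visible source header per chunk.
def format_context(chunks: list[dict]) -> str:
    blocks = []
    for chunk in chunks:
        meta = chunk["metadata"]
        header = f"[Source: {meta['doc_title']} | {meta['section_heading']} | Page {meta['page']}]"
        blocks.append(f"{header}\n{meta['text']}")
    return "\n\n".join(blocks)
```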
3. Prompt the Model Explicitly to Preserve Citations
Add prompt instructions like:
“For each factual answer, include the citation of the document and section. Use the format:
[Source: {{doc_title}}, Section: {{section_heading}}, Page: {{page}}]”
This primes the LLM to include citations in the final response.
4. Middleware Post-Validation (Optional)
Optionally, check citations at generation time by matching the quoted passage against the original chunk using fuzzy matching or hashing.
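A minimal fuzzy-matching sketch using only the standard library (rapidfuzz or hashing would also work); the 0.85 threshold is an arbitrary starting point:

```python
# Sketch: verify that a quoted passage in the answer really appears in the retrieved chunk.
from difflib import SequenceMatcher

def citation_is_supported(quoted: str, chunk_text: str, threshold: float = 0.85) -> bool:
    quoted, chunk_text = quoted.lower(), chunk_text.lower()
    if quoted in chunk_text:
        return True
    # Slide a quote-sized window over the chunk and keep the best similarity ratio.
    window = len(quoted)
    step = max(1, window // 4)
    best = 0.0
    for start in range(0, max(1, len(chunk_text) - window + 1), step):
        ratio = SequenceMatcher(None, quoted, chunk_text[start:start + window]).ratio()
        best = max(best, ratio)
    return best >= threshold
```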
5. Use LLM to Create Semantic Tags for Chunks
Tags help categorize chunks by topic, theme, or function (e.g. `["scope 1", "baseline", "reporting year", "offsets"]`) and can be used to improve filtering, clustering, or search refinement.
A. During Ingestion:
For each chunk, before or after creating the embedding:
- Send the chunk to the LLM with a prompt like:
“Given this text, return 3 to 5 concise tags that describe its main concepts or topics.”
Include something like:
```json
{
  "chunk_text": "Emissions from owned vehicles should be reported as Scope 1. Companies must include all mobile combustion sources in the base year inventory...",
  "response": ["scope 1", "mobile combustion", "base year"]
}
```
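The tagging call itself could look like this (a sketch only; the model name, prompt wording, and the assumption that the model returns clean JSON are all tunable):

```python
# Sketch: ask the model for 3-5 tags per chunk and parse them as a JSON array.
import json
from openai import OpenAI

client = OpenAI()

def tag_chunk(chunk_text: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any small, inexpensive model is fine for tagging
        temperature=0,
        messages=[
            {"role": "system", "content": (
                "Given this text, return 3 to 5 concise tags that describe its "
                "main concepts or topics, as a JSON array of lowercase strings."
            )},
            {"role": "user", "content": chunk_text},
        ],
    )
    # Assumes the model complies with the JSON-array instruction; add error handling in practice.
    return json.loads(response.choices[0].message.content)
```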
B. Store Those Tags as Metadata
Tags would go into a `tags` column in your vector DB (e.g., pgvector or Pinecone), as an array or JSON field.
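For pgvector, one possibility (the table and column names here are just placeholders) is a JSONB column updated after tagging, e.g. with psycopg2:

```python
# Sketch: persist tags as a JSONB column alongside the chunk's embedding row.
import json
import psycopg2

conn = psycopg2.connect("dbname=rag user=postgres")  # placeholder connection string

def store_tags(chunk_id: str, tags: list[str]) -> None:
    with conn, conn.cursor() as cur:
        cur.execute(
            "UPDATE chunks SET tags = %s::jsonb WHERE id = %s",  # 'chunks' and 'tags' are assumed names
            (json.dumps(tags), chunk_id),
        )
```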
C. Use Tags for Filtering or Grouping
When doing retrieval or displaying answers, you could (see the query sketch after this list):
- Filter by tag (e.g. “only show Scope 1 content”)
- Group search results by tag category
- Show tags to the user as context (“This answer came from content tagged: Scope 1, baseline…”)
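A retrieval-time filter could then combine the tag check with the similarity search, for example against the same assumed pgvector table from step B:

```python
# Sketch: restrict a pgvector similarity search to chunks carrying a given tag.
def search_with_tag(cur, query_embedding: list[float], tag: str, k: int = 5):
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    cur.execute(
        """
        SELECT id, text, tags
        FROM chunks
        WHERE tags ? %s                     -- JSONB operator: tag present in the array
        ORDER BY embedding <=> %s::vector   -- pgvector cosine-distance operator
        LIMIT %s
        """,
        (tag, vector_literal, k),
    )
    return cur.fetchall()
```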
Tools to Use
| Tool | Role |
|---|---|
| LangChain | Can wrap the LLM call to generate tags using `LLMChain` or `PromptTemplate` |
| OpenAI SDK | Can embed a tool or step in your ingestion pipeline to call GPT and label chunks |
| Your own ingestion script | Can make a single call to `openai.ChatCompletion.create()` and insert the result into your chunk metadata |
End of LLM transmission
I hope this helps!
Luc