Hi everyone,
I’m working on building a custom AI-powered chat assistant that can answer user questions based on product documentation files that I maintain.
Here’s what I have:
- A large number of product-specific documents stored in a directory. Example:
```
/docs/
├── A.pdf
├── B.mdx
├── C.pdf
├── D.mdx
```
- Each file corresponds to a unique product (e.g., A, B, C, D).
- I want the assistant to answer queries like:
- “B is showing an error, what should I do?”
- “How do I configure product A?”
- “What’s the installation process for D?”
- The assistant should search and understand the content of these documents and provide relevant, accurate answers.
My tech stack:
- Next.js (v15+ App Router)
- OpenAI GPT-4 API
- Documents are in .pdf and .mdx formats
What I’m trying to achieve:
- Ingest the documentation (PDF or MDX) into a searchable format (embeddings, vector DB, etc.); a rough sketch of what I have in mind follows this list.
- Connect OpenAI GPT-4 with a retrieval mechanism so the model can reference the documents.
- Provide a frontend chat interface where users can ask natural language questions.
- Build everything within a Next.js App Router project (not pages directory).
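To make the ingestion point concrete, here's a rough sketch of what I'm currently picturing. It's not working code: it assumes pgvector with a `doc_chunks (product, content, embedding)` table and OpenAI's `text-embedding-3-small`, and `extractText` is just a placeholder for whatever PDF/MDX parsing I end up using (pdf-parse, remark, etc.):

```ts
// scripts/ingest.ts — rough sketch of the ingestion step, not production code
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Placeholder: extract plain text from a .pdf or .mdx file
// (e.g. pdf-parse for PDFs, strip frontmatter/JSX for MDX).
async function extractText(filePath: string): Promise<string> {
  throw new Error("TODO: implement per file type");
}

// Naive fixed-size chunking with overlap — one of the things I'm unsure about.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

export async function ingestFile(filePath: string, product: string) {
  const text = await extractText(filePath);
  const chunks = chunkText(text);

  // Embed all chunks in one call; results come back in input order.
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: chunks,
  });

  // Store each chunk with its product name so queries like
  // "B is showing an error" can be filtered by product later.
  for (let i = 0; i < chunks.length; i++) {
    await pool.query(
      "INSERT INTO doc_chunks (product, content, embedding) VALUES ($1, $2, $3)",
      [product, chunks[i], JSON.stringify(data[i].embedding)]
    );
  }
}
```

In particular, I'm not sure whether naive fixed-size chunking like this is good enough for MDX files with headings and code samples, or whether I should chunk by section instead.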
Where I need help:
- What’s the best approach to process and chunk PDF + MDX content for embedding?
- What’s a good vector database that works well with local files and integrates nicely with Next.js (e.g., pgvector, Qdrant, etc.)?
- How should I structure my app in App Router? (e.g., server actions, API routes, Vercel AI SDK?) I've sketched the route handler and chat page I currently have in mind below this list.
- Any open-source templates or examples that come close to this use case?
- Best practices to optimize response quality and context relevance when using OpenAI + RAG setup?
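For reference, this is the kind of App Router route handler I'm imagining for the retrieval + answer step. Again, just a sketch under the same assumptions as above (pgvector table `doc_chunks`, the same embedding model at query time, plain `openai` and `pg` clients rather than the Vercel AI SDK):

```ts
// app/api/chat/route.ts — rough sketch of the retrieval + answer step
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export async function POST(req: Request) {
  const { question } = await req.json();

  // Embed the user question with the same model used at ingestion time.
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  const queryEmbedding = JSON.stringify(data[0].embedding);

  // Nearest-neighbour search using pgvector's cosine distance operator (<=>).
  const { rows } = await pool.query(
    `SELECT product, content
       FROM doc_chunks
      ORDER BY embedding <=> $1
      LIMIT 5`,
    [queryEmbedding]
  );

  const context = rows
    .map((r) => `[${r.product}] ${r.content}`)
    .join("\n---\n");

  // Ask GPT-4 to answer using only the retrieved excerpts.
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "Answer using only the provided documentation excerpts. " +
          "If the answer is not in them, say so.\n\n" + context,
      },
      { role: "user", content: question },
    ],
  });

  return Response.json({ answer: completion.choices[0].message.content });
}
```

I'd happily switch this to the Vercel AI SDK and streaming if that's the more idiomatic setup in App Router; that's part of what I'm asking.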
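And on the frontend side, the chat interface from my goals above could be as simple as a client component that POSTs to that route (minimal sketch, no streaming or error handling):

```tsx
// app/chat/page.tsx — minimal chat UI sketch (client component)
"use client";

import { useState } from "react";

type Message = { role: "user" | "assistant"; content: string };

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState("");

  async function send() {
    const question = input.trim();
    if (!question) return;
    setMessages((m) => [...m, { role: "user", content: question }]);
    setInput("");

    // Call the route handler sketched above.
    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    });
    const { answer } = await res.json();
    setMessages((m) => [...m, { role: "assistant", content: answer }]);
  }

  return (
    <main>
      {messages.map((m, i) => (
        <p key={i}>
          <b>{m.role}:</b> {m.content}
        </p>
      ))}
      <input value={input} onChange={(e) => setInput(e.target.value)} />
      <button onClick={send}>Ask</button>
    </main>
  );
}
```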
If anyone has done something similar or can point me to solid tutorials/examples, I’d greatly appreciate it!
Thanks in advance