Help Needed: Build Chat Assistant Using OpenAI + Next.js App Router + Local Docs in PDF/MDX Format

Hi everyone,

I’m working on building a custom AI-powered chat assistant that can answer user questions based on product documentation files that I maintain.

Here’s what I have:

  • A large number of product-specific documents stored in a directory. Example:

    /docs/
      ├── A.pdf
      ├── B.mdx
      ├── C.pdf
      └── D.mdx
  • Each file corresponds to a unique product (e.g., A, B, C, D).
  • I want the assistant to answer queries like:
    • “B is showing an error, what should I do?”
    • “How do I configure product A?”
    • “What’s the installation process for D?”
  • The assistant should search and understand the content of these documents and provide relevant, accurate answers.

My tech stack:

  • Next.js (v15+ App Router)
  • OpenAI GPT-4 API
  • Documents are in .pdf and .mdx formats

What I’m trying to achieve:

  1. Ingest the documentation (PDF or MDX) into a searchable format (embeddings, vector DB, etc.).
  2. Connect OpenAI GPT-4 with a retrieval mechanism so the model can reference the documents.
  3. Provide a frontend chat interface where users can ask natural language questions.
  4. Build everything within a Next.js App Router project (not pages directory).
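For step 1, the part that matters most is how you split documents before embedding. Here's a minimal character-based chunker in plain TypeScript as a sketch — the `chunkDocument` name, the 1000/200 size/overlap values, and the `Chunk` shape are my own assumptions, and you'd feed it text already extracted from a PDF (e.g. via a parser library) or the raw MDX source:

```typescript
// Split extracted document text into overlapping chunks for embedding.
// Sizes are in characters here; token-based splitting is more precise
// but requires a tokenizer.
interface Chunk {
  product: string; // e.g. "A", derived from the filename
  text: string;
}

function chunkDocument(
  product: string,
  text: string,
  chunkSize = 1000,
  overlap = 200,
): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({ product, text: text.slice(start, end) });
    if (end === text.length) break;
    start = end - overlap; // overlap preserves context across chunk boundaries
  }
  return chunks;
}
```

Tagging each chunk with its product name lets you filter retrieval to the product the user asked about, which usually helps relevance a lot.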

Where I need help:

  • What’s the best approach to process and chunk PDF + MDX content for embedding?
  • What’s a good vector database that works well with local files and integrates nicely with Next.js (e.g., pgvector, Qdrant, etc.)?
  • How should I structure my app in App Router? (e.g., server actions, API routes, Vercel AI SDK?)
  • Any open-source templates or examples that come close to this use case?
  • Best practices to optimize response quality and context relevance in an OpenAI + RAG setup?
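On the vector-DB question: whichever store you pick (pgvector, Qdrant, etc.), retrieval boils down to a nearest-neighbour search over embeddings. Before committing to a database, you can prototype with an in-memory version like this sketch — the `Embedded` shape and function names are mine, not from any library, and the vectors would come from OpenAI's embeddings endpoint:

```typescript
interface Embedded {
  text: string;
  vector: number[]; // embedding for this chunk
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], docs: Embedded[], k = 3): Embedded[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k);
}
```

Once this works end to end on a handful of files, swapping the in-memory array for pgvector or Qdrant is a contained change.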

If anyone has done something similar or can point me to solid tutorials/examples, I’d greatly appreciate it!

Thanks in advance 🙏

If you’re working with Next.js App Router and want full server control, you might want to look into using Server Actions for ingestion tasks and API routes for the retrieval/chat endpoint. The Vercel AI SDK is also super handy; it abstracts a lot of the streaming and message formatting.
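For the chat endpoint, the shape is roughly this. Treat it as a sketch to adapt rather than copy: the exact `ai` / `@ai-sdk/openai` APIs shift between versions, the model name is illustrative, and `retrieveChunks` is a hypothetical stand-in for your own vector-store lookup:

```typescript
// app/api/chat/route.ts — wiring sketch, not a drop-in file.
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

// Hypothetical: your own retrieval helper, which embeds the question and
// queries the vector store, returning concatenated chunk text.
declare function retrieveChunks(question: string): Promise<string>;

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Ground the model in docs retrieved for the latest user message.
  const context = await retrieveChunks(messages[messages.length - 1].content);

  const result = streamText({
    model: openai('gpt-4o'),
    system: `Answer using only this documentation:\n${context}`,
    messages,
  });
  return result.toDataStreamResponse();
}
```

On the client, the SDK's `useChat` hook can point at this route and handle the streaming UI for you.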

Yes, exactly! I’ve been trying to go that route—using Server Actions for ingestion and API routes for retrieval, along with the Vercel AI SDK for the chat UI.

However, the documentation and example projects I’ve come across so far are pretty incomplete. When I try to set everything up locally, I run into tons of errors—missing configurations, unclear dependencies, or incomplete setup steps. It feels like a lot of important steps are either skipped or assumed.

It would be super helpful if there were a step-by-step guide or a working example repo that covers everything: installing dependencies, configuring vector storage (pgvector or similar), parsing PDFs/MDX, and wiring it all up with the Vercel AI SDK and the OpenAI API.
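One step that example repos often gloss over is how the retrieved chunks actually reach the model. A simple approach, sketched here with my own function name and prompt wording, is to fold them into the system prompt and explicitly tell the model to stay within them:

```typescript
interface RetrievedChunk {
  product: string;
  text: string;
}

// Build a system prompt that grounds the model in the retrieved excerpts
// and tells it to admit when the answer isn't there (reduces hallucination).
function buildSystemPrompt(chunks: RetrievedChunk[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (product ${c.product}) ${c.text}`)
    .join('\n\n');
  return (
    'You are a product documentation assistant. ' +
    'Answer only from the excerpts below; if they do not contain the ' +
    'answer, say so.\n\n' +
    context
  );
}
```

Numbering the excerpts also lets the model cite which chunk an answer came from, which makes debugging retrieval quality much easier.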

If you (or anyone else here) know of a solid full example—or if you’ve built something similar—please share it! I think a lot of people would benefit from a complete walkthrough.

Thanks again!




LMK if you want a cleaned-up version of a setup that solves your problem.