What is the best type of format to use for uploaded documents for GPTs?

Does anyone have any advice on the best formats to use when uploading files to GPTs? Does the GPT prefer files in markdown, PDF, JSON, etc.? Which is best for retrieval when answering questions?

I want to create my own custom GPT to answer questions I have about various software topics so it can help me build my app. Currently, I’m feeding it PDFs generated from markdown files, but I’m wondering if that conversion is unnecessary.


Hi and welcome to the Developer Forum!

Whatever is most common in the training data. Given that the data comes from the internet, one can assume JSON, markdown, HTML and PDFs are all pretty well covered. A PDF has to be converted to text before processing, though, so I’d avoid any format that isn’t natively plain text.
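
If the source files are already markdown, one way to skip the PDF step is to upload the text directly. Here is a minimal Python sketch, assuming the files all live under one folder. The function name `combine_markdown` and the `# Source:` heading are made up for illustration and are not something GPTs require:

```python
from pathlib import Path


def combine_markdown(src_dir: str, out_path: str) -> int:
    """Concatenate every .md file under src_dir into one UTF-8 text file.

    Returns the number of files written into the combined document.
    """
    src = Path(src_dir)
    out = Path(out_path).resolve()
    # Skip the output file itself in case it lives inside src_dir.
    files = sorted(p for p in src.rglob("*.md") if p.resolve() != out)
    parts = []
    for path in files:
        text = path.read_text(encoding="utf-8").strip()
        # A heading with the relative path keeps each source file identifiable.
        parts.append(f"# Source: {path.relative_to(src).as_posix()}\n\n{text}\n")
    out.write_text("\n".join(parts), encoding="utf-8")
    return len(files)
```

Calling something like `combine_markdown("docs", "combined.txt")` would give you a single plain-text file to upload. Whether one combined file works better than several separate ones is something you would have to test for your own documents.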

I have the same question. I’ve been converting my PDFs to plain text, as that seems to be a bit more efficient when it comes to indexing and processing uploaded documentation.
