Assistants API can't read my PDF. How come?

Hey,

For some reason the legacy Assistants API cannot read my PDF, which is very strange.

I keep getting the error shown below.

These are my steps:

My PDF is just a normal PDF file. I don't know why I get this error.

This is the response:

"value": "The file contains binary data which couldn’t be decoded as text, indicating it might be a non-text-based file, such as an Excel or PDF document. I’ll try to read it as an Excel file next to see if this works

indicating it might be a non-text-based file, such as an Excel or PDF document

The answer is right there. For whatever reason it’s not able to read your PDF. It looks like you’re using Code Interpreter to try and open it?

PDFs are usually NOT text-based.

How you see the PDF is not how a computer sees it.
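To make that concrete, here is a minimal Python sketch of what a naive text reader runs into. The byte string is a made-up fragment shaped like the start of a PDF (a `%PDF-` header followed by a few high-bit bytes), not a real file, and `describe_bytes` is a hypothetical helper name used only for illustration.

```python
def describe_bytes(data: bytes) -> str:
    """Report whether raw bytes look like a PDF and whether they decode as text."""
    # PDF files start with the marker "%PDF-"; anything else is labeled non-PDF.
    kind = "PDF" if data.startswith(b"%PDF-") else "non-PDF"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return f"{kind}: binary data, cannot be decoded as text"
    return f"{kind}: decodes as UTF-8 text"


# Made-up sample: a PDF-style header line followed by bytes that are not valid UTF-8.
sample_pdf = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
sample_txt = "Quarterly audit notes".encode("utf-8")

print(describe_bytes(sample_pdf))
print(describe_bytes(sample_txt))
```

Opening a file like this as plain text is roughly what produces the "binary data which couldn't be decoded as text" message quoted above.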

If you want the model to read a PDF, you can use the Vision API (the pages may first need to be converted to images).

https://platform.openai.com/docs/guides/vision

Or you can upload it and use it with Retrieval.

https://platform.openai.com/docs/actions/data-retrieval

You can also convert it to Markdown so the bot can read it better (best option).
(This is the first website I saw for PDF → Markdown)


Yes, I am using Code Interpreter to open it. Maybe that's the problem.

{
  "instructions": "You are an AI Business Auditor",
  "name": "AI Auditor",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "model": "gpt-4-turbo"
}

Maybe I should test it with the other tool options: code_interpreter, retrieval, or function.

By the way, the link regarding “data retrieval” seems complex.

I just want 2-3 simple PDFs, each at most 5 pages long, to be added as context to my GPT assistant.

If you’re using Code Interpreter you’re not actually using GPT/Assistant services to read the PDF. You are passing the responsibility to a Python library to read it for you.

PDFs are not text, and even when they are, they aren’t easy to read. For simplicity, it’s easier to think of them as images that sometimes have the text available on a row-by-row basis. Ever had difficulties copying and pasting text in a PDF? Highlighting gets all weird? Yeah. Shit is messy.

You need to use retrieval if your objective is to discuss the content.

Otherwise you can convert it to text and then paste the content into the chat. This is what retrieval does in the background (with some additional work to hand-pick the (hopefully) most relevant entries).
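As a rough illustration of that "hand-pick the most relevant entries" step, here is a toy Python sketch. It assumes the PDF has already been converted to plain text, and it ranks chunks by simple word overlap with the question. That scoring is a stand-in chosen for brevity, not how the actual retrieval tool works internally, and all function names here are hypothetical.

```python
import re


def split_into_chunks(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most tokens with the question."""
    question_tokens = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(question_tokens & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, document_text: str, k: int = 2, max_words: int = 40) -> str:
    """Assemble a prompt containing only the highest-ranked chunks plus the question."""
    chunks = split_into_chunks(document_text, max_words)
    context = "\n---\n".join(top_chunks(question, chunks, k))
    return f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {question}"


# Made-up document text standing in for the converted PDF.
document_text = (
    "Revenue grew by ten percent in 2023. "
    "The office moved to a new building. "
    "Staff turnover stayed flat."
)
print(build_prompt("How much did revenue grow in 2023?", document_text, k=1, max_words=7))
```

For 2-3 PDFs of five pages each, pasting the full converted text is probably simpler than any ranking; the selection step only starts to matter once the documents no longer fit comfortably in the context.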

The linked documentation is meant for GPTs; there are definitely better guides around. Don’t give up on it!
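For the retrieval route, the assistant configuration posted earlier would only need a different tool type plus the uploaded files attached. The Python sketch below just builds the request body and prints it. It assumes the legacy (v1) Assistants API, where the tool type is `retrieval` and files are attached through a top-level `file_ids` list; newer API versions may name these differently. The file IDs are placeholders, and `build_assistant_payload` is a hypothetical helper.

```python
import json


def build_assistant_payload(file_ids: list[str]) -> dict:
    """Build a create-assistant request body that uses retrieval instead of Code Interpreter."""
    return {
        "instructions": "You are an AI Business Auditor",
        "name": "AI Auditor",
        # Assumption: legacy v1 tool name; this replaces {"type": "code_interpreter"}.
        "tools": [{"type": "retrieval"}],
        "model": "gpt-4-turbo",
        # Assumption: IDs returned by uploading each PDF beforehand; placeholders here.
        "file_ids": file_ids,
    }


payload = build_assistant_payload(["file-abc123", "file-def456"])
print(json.dumps(payload, indent=2))
```

Nothing here calls the API, so it is safe to run as-is and compare against the original configuration.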
