Assistant API cant read my PDF.. How come?

lora · June 8, 2024, 7:23pm

Hey,

Somehow the Legacy Assistant API cannot read my PDF, which is very very strange.

I keep on getting the error.

These are my steps:

Create Assistant: POST https://api.openai.com/v1/assistants
Attach a PDF File to the Assistant: POST https://api.openai.com/v1/assistants/{assistant_id}/files
Creating a thread with message and running it. https://api.openai.com/v1/threads/runs

My PDF is just a normal PDF file. Dont know why I get it.

This is the response:

“value”:"The file contains binary data which couldn’t be decoded as text, indicating it might be a non-text-based file, such as an Excel or PDF document. I’ll try to read it as an Excel file next to see if this works

RonaldGRuckus · June 8, 2024, 8:49pm

indicating it might be a non-text-based file, such as an Excel or PDF document

The answer is right there. For whatever reason it’s not able to read your PDF. It looks like you’re using Code Interpreter to try and open it?

PDFs are usually NOT text-based.

How you see the PDF is not how a computer sees it

If you want it to read a PDF, you can use the Vision API.

https://platform.openai.com/docs/guides/vision

Or you can deposit it and utilize in Retrieval

https://platform.openai.com/docs/actions/data-retrieval

You can also convert it to Markdown so the bot can read it better (Best option)
(This is the first website I saw for PDF → Markdown)

lora · June 9, 2024, 10:48am

Yes, I am using Code Interpreter to open it. Maybe thats the problem.

{
  "instructions": "You are a AI Business Auditor",
  "name": "AI Auditor",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "model": "gpt-4-turbo"
}

maybe I should test it out with other options like:
code_interpreter , retrieval , or function

lora · June 9, 2024, 10:52am

Btw the link regarding “data retrieval” seems complex.

I just want simple 2-3 PDFs which are max 5 pages long to be added as Context to my GPT assistant.

chrisder54 · July 20, 2024, 5:02pm

you give it a PDF you mean when you build a custom GPT with the database?

Topic		Replies	Views
Adding PDF in the assistant API input API gpt-4 , assistants , assistants-api	2	8709	November 24, 2023
Retriever Assistant can't read scanned pdfs? API gpt-4 , api	7	2790	July 22, 2024
Assistant api retriever sometimes cannot read pdf API gpt-4 , api	5	1918	November 29, 2023
Assistant API system files should not be exposed to the user + PDF file parsing is intermittently buggy Feedback api	6	533	March 25, 2024
Problems with recognising and reading file formats Prompting gpt-4 , api , assistants-api	7	649	April 3, 2024

Assistant API cant read my PDF.. How come?

Related topics