Adding PDF in the assistant API input

pathikghugare · November 14, 2023, 8:39pm

I am creating an assistant to extract information from PDFs.
I tried the Assistants playground on the OpenAI platform and it worked pretty well.
Now I am trying to do the same with the OpenAI Python SDK v1.2.

Here’s my code:


import time

from openai import OpenAI

# Reads OPENAI_API_KEY from the environment
client = OpenAI()

with open('instruction.txt') as f:
    instructions = f.read()

assistant = client.beta.assistants.create(
    name="assistant",
    instructions=instructions,
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview"
)

file = client.files.create(
    file=open("file1.pdf", "rb"),
    purpose='assistants'
)
page_num = 1
thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=f"extract for page {page_num}, (print all page text)",
    file_ids=[file.id]
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="If there's no page, return a 'END' in the json response."
)

start_time = time.time()
while run.status != "completed":
    time.sleep(1)  # pause between polls instead of busy-waiting
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )
    if run.status in ["failed", "cancelled", "expired", "requires_action"]:
        print(f"run stopped with status {run.status}: {run.last_error}")
        break

end_time = time.time()

messages = client.beta.threads.messages.list(
    thread_id=thread.id
)

print(messages)

But in the output, I receive the following:

[ThreadMessage(id='msg_dHNi9zzDWoSDfBbFIC6wfHlp', assistant_id='asst_86coHdMlTjRdeYYjABp61x1s', content=[MessageContentText(text=Text(annotations=[], value="To assist you further, could you please provide more details about the uploaded file? Specifically, it would be helpful to know the type of file you've uploaded (e.g., PDF, Word document, text file, etc.) and what content you're expecting to extract from page 1."), type='text')], created_at=1699994179, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_5z7ScQh4f8xBZjylwqLP7CcR', thread_id='thread_hXk19MpSMNZwvpGY8GH9LQZE'), ThreadMessage(id='msg_tkdWWQvIDt8KMgUk1YJdMhCv', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='extract for page 1, (print all page text)'), type='text')], created_at=1699994178, file_ids=['file-xCZr5vMxS1jc6vm4trWIbR5y'], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_hXk19MpSMNZwvpGY8GH9LQZE')]

So the response I got from the model was:

To assist you further, could you please provide more details about the uploaded file?
Specifically, it would be helpful to know the type of file you've uploaded (e.g., PDF, Word document, text file, etc.) and what content you're expecting to extract from page 1.
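
The reply text can also be pulled out of the message list instead of being read from the raw printout. A minimal sketch, assuming the message objects have the shape shown in the output above (`role`, `content`, `type`, `text.value`); the helper name `assistant_replies` is made up for illustration:

```python
def assistant_replies(messages):
    """Return the text of every assistant message, in the order given."""
    replies = []
    for message in messages:
        # Skip the user's own messages
        if message.role != "assistant":
            continue
        for part in message.content:
            # Only text parts carry a .text.value; other part types are ignored
            if part.type == "text":
                replies.append(part.text.value)
    return replies
```

With the code above this would be called as `assistant_replies(messages.data)`, where `messages` is the result of `client.beta.threads.messages.list(...)`.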

When I tried the same in the playground, I did not get any message like this.

What changes do I need to make to the above code so that it reads the file and does the OCR, as it usually does when using code_interpreter in the playground?

pathikghugare · November 15, 2023, 7:08am

Specifying that it's a PDF in the message content worked.
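
A minimal sketch of that fix, assuming the same SDK version and the same `file_ids` parameter as in the code above; the helper names `describe_request` and `send_pdf_request` are hypothetical. The only real change is that the message text names the file type explicitly:

```python
def describe_request(page_num, file_type="PDF"):
    """Build message text that tells the assistant what kind of file is attached."""
    return (
        f"The attached file is a {file_type}. "
        f"Extract page {page_num} (print all page text)."
    )


def send_pdf_request(client, thread_id, file_id, page_num):
    """Post the user message; `client` is the OpenAI client from the question."""
    return client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=describe_request(page_num),
        file_ids=[file_id],  # parameter name as used in the original code
    )
```

In the original script this would replace the `messages.create` call, e.g. `send_pdf_request(client, thread.id, file.id, page_num)`.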

execublar · November 24, 2023, 2:22pm

This is ridiculous. OpenAI strips the file-type suffix when files are uploaded, and then requires developers to specify the file type manually when an assistant reads them.
I hope OpenAI fixes this problem in the next version.
