Upload a PDF file and ask questions about it, purpose = "assistants"

I am using file search to upload a PDF file and ask questions about the document. Below is my code.

from openai import OpenAI

client = OpenAI()

# Create an assistant that can search uploaded files.
assistant = client.beta.assistants.create(
  name="Analyst Assistant",
  instructions="You are an expert analyst in logistics. Use your knowledge base to answer questions about the delivery order.",
  model="gpt-4o",
  tools=[{"type": "file_search"}],
)

# Upload the PDF; the context manager closes the file handle afterwards.
with open("tmp/CBHU8970688--DO.pdf", "rb") as pdf:
  message_file = client.files.create(file=pdf, purpose="assistants")

# Start a thread whose first message attaches the file for file_search.
# prompt_do_from_text is the question string, defined elsewhere.
thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": prompt_do_from_text,
      "attachments": [{ "file_id": message_file.id, "tools": [{"type": "file_search"}] }],
    }
  ]
)

# Run the assistant on the thread, wait for completion, and print the reply.
run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)
messages = list(client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id))
print(messages[0].content[0].text.value)

This code works well. However, I don’t think the purpose should be “assistants”. I know that “answers” has been deprecated. I also tested purpose = “responses”, but it reports errors.

I think there should be a more appropriate way to upload a file and ask questions about it. Any suggestions?

The “assistants” purpose is not something the AI receives. It only marks the file as intended for use with the Assistants endpoint, as opposed to files you might upload for fine-tuning models or for batches of jobs you might run overnight.
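
To illustrate that the purpose is just a storage label, here is a minimal sketch that groups stored files by their purpose field. The helper name group_files_by_purpose and the stand-in records (apart from the PDF name taken from the question) are made up for illustration; with a real client you would pass it the result of client.files.list(), whose items carry purpose and filename attributes.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_files_by_purpose(files):
    """Return {purpose: [filename, ...]} for any iterable of file objects.

    Each object only needs `.purpose` and `.filename` attributes, which the
    file objects returned by the OpenAI Files API have.
    """
    grouped = defaultdict(list)
    for f in files:
        grouped[f.purpose].append(f.filename)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical stand-in records so the sketch runs without an API key.
    # With a real client: group_files_by_purpose(OpenAI().files.list())
    stored = [
        SimpleNamespace(purpose="assistants", filename="CBHU8970688--DO.pdf"),
        SimpleNamespace(purpose="fine-tune", filename="train.jsonl"),
        SimpleNamespace(purpose="batch", filename="jobs.jsonl"),
    ]
    for purpose, names in group_files_by_purpose(stored).items():
        print(purpose, names)
```

The model never sees this label; it only affects which endpoints will accept the file.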

There’s also a new, undocumented user_data purpose for upload-only files. I added it to my utility so I can investigate it sometime, when I’m more interested in the endpoint than in simply tweaking code.
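
For what it's worth, here is a rough sketch of how a user_data upload might be paired with a question outside of Assistants. It assumes the Responses API accepts an input_file content part referencing the uploaded file's ID, which may not have been available when this thread was written. The helper names build_pdf_question and ask_about_pdf are hypothetical, and the client is passed in so nothing touches the network until you call it.

```python
def build_pdf_question(file_id, question):
    """Build a Responses API `input` list pairing an uploaded file with a question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": file_id},
                {"type": "input_text", "text": question},
            ],
        }
    ]


def ask_about_pdf(client, pdf_path, question, model="gpt-4o"):
    """Upload a PDF with purpose="user_data" and ask one question about it.

    `client` is expected to be an `openai.OpenAI()` instance. This assumes
    the Responses API accepts the uploaded file as an `input_file` part.
    """
    with open(pdf_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="user_data")
    response = client.responses.create(
        model=model,
        input=build_pdf_question(uploaded.id, question),
    )
    return response.output_text


# Example call (requires the openai package and an API key):
#   from openai import OpenAI
#   print(ask_about_pdf(OpenAI(), "tmp/CBHU8970688--DO.pdf", prompt_do_from_text))
```

Compared with the Assistants code in the question, this skips the assistant, thread, and run objects, at the cost of not building a searchable vector store for the file.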

━━━━━━━━━━━━━━━━━━━━━━━━━━━◣
OpenAI File Storage Utility ▮▮▶ Select New File 'Purpose'
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬◤
[1] Assistants - docs for search or code interpreter (input)
[2] Assistants_output - files produced by assistant or code (output)
[3] User_data - New unknown purpose (input)
[4] Fine-tune - JSONL training file (input)
[5] Fine-tune-results - learning metrics report (output)
[6] Batch - JSONL list of API calls to run, results (input,output)
[7] Vision - Images for Assistants message attachment (input)
[8] Exit
◄ current purpose ► assistants

Thanks for the explanation!