Pdf file can be interpreted by assistant in the playground but not via code

Hi,

I face the following problem:
I saved a single .pdf file in the Storage. When using the Assistants playground and ask questions about the file, I get decent answers.

using the python API, I get the response that the file cannot be accessed in most of the cases. Sometimes I get a decent answer .

This discrepancy between the results from the playground and the API is big.

I would highly appreciate it if you could tell me whether I have introduced a fundamental logical problem into the code. I did not do anything fancy, but used the most basic assistant example from github, rewrote the message and added tool_resources (openai-python/examples/assistant.py at main · openai/openai-python · GitHub).

Thanks in advance

from openai import OpenAI
from credentials import API_KEY

file_id = 'i like turtles'
client = OpenAI(api_key=API_KEY)

assistant = client.beta.assistants.create(
    name="DocSummarizer",
    description=f"You are a consultant with a strong focus on detail. ",
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {
        "file_ids": [file_id]
    }},
    model="gpt-4o",
)

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=f"please summarize the pdf document with the id: {file_id}.",
)

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="please look very carefully at the document and the message and create an output exactly as specified.",
)

print("Run completed with status: " + run.status)

if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)

    print("messages: ")
    output = []
    for message in messages:
        assert message.content[0].type == "text"
        output.append({"role": message.role, "message": message.content[0].text.value})
        
    client.beta.assistants.delete(assistant.id)

Hi and welcome to the community!

For debugging purposes you can try specifying that the file is a PDF in your Python version of the code as described here.

I also suggest to look through the related topics listed below.

Hope this helps!

Hi,

thanks for the quick replay. I did check the example and the related topics. As you can see in my question above, I already indicated in the message that the file is a pdf. Indeed, I did not specify it in the instructions. But even by doing this

instructions="please look very carefully at the pdf document and the message and create an output exactly as specified.",

I constantly get this answer:
“It seems that the text extraction from the PDF didn’t result in any readable text, possibly due to the nature of the PDF content”

Is it correct that I only specify the tool_resources in the assistant?
I don’t understand why this is working in the playground but not via the API. Are they not sharing the same resources?

Take a look at the write-up below.
I am under the impression that you need to change the way how you are processing the PDF file before passing it to the assistant.

Hello. I have found the issue here, which I quoted.

The problem is that you attached a file to the code interpreter for running Python.

What you need to do is connect a vector store ID to the file_search tool.

A summary task, of reading a whole PDF, can only be done on small files, because a vector store doesn’t return the entire text. It only searches your vector store for the best matching chunk sections, from all documents.

You can read API documentation about the whole procedure for vector stores, which the playground unfortunately makes easy.

3 Likes

Hi,

thanks for the hints. Helped me a lot.

I digged deeper into the topic of storing files and managing them in the vector store. Doing that, I found an interesting behavior. Not sure if this is expected.

A high level description of what I am trying to do is:
I want to give the assistant a dictionary of keys and values and the assistant should check if it finds the combination of key and value in a .pdf document. I have something like 200 documents.

Since the vector store automatically chunks my documents, I at first tried to prevent to digg into the topic and did therefore the following (very high level):

vector_store = VectorStore()
assistant = Assistant()
for file in files:
    clean vecttor_store if not empty
    upload file to vector_store
    response = assistant.ask_question()

when I did this the assistant sometimes gave answers that reflected the data from a file that was earlier before in the vector store. It seemed like the cleaning was not “immediate”.

To get rid of the behavior, I had to do the following (again high level)

assistant = Assistant()
for file in files:
    vector_store = VectorStore()
    clean vecttor_store if not empty
    upload file to vector_store
    response = assistant.ask_question()

basically using a new vector store all the time

1 Like

Is it possible that the previous knowledge was still part of the context send to the assistant and that this context was also reset with the new vector store?

1 Like

let me update my low code from above to indicate where I pass the vector store and where I update the assistant with it. Please note that I only indicate variables passed to assistant and assistant.update() that are necessary for my explanation.

one instance of vector store

vector_store = VectorStore()
assistant = Assistant(
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {
        "vector_store_ids": [vector_store.id]
    }}
)
for file in files:
    clean vecttor_store if not empty
    upload file to vector_store
    response = assistant.ask_question()

multiple instances of vector store

assistant = Assistant()
for file in files:
    vector_store = VectorStore()

    assistant.update(
        tools=[{"type": "file_search"}],
        tool_resources={"file_search": {
            "vector_store_ids": [vector_store.id]
        }}
    )
    upload file to vector_store
    response = assistant.ask_question()
1 Like