Send files to the Chat Completions API

Hello,
I am trying to send files to the Chat Completions API but I'm having a hard time finding a way to do so. I have seen some suggestions to use LangChain, but I would like to do it natively with the OpenAI SDK.
Any tips on how to do that?
Thank you

Trying to do something along these lines

from openai import OpenAI

client = OpenAI()
file = client.files.create(file=open("file.pdf", "rb"), purpose="fine-tune")
completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that can read PDFs."},
        {"role": "user", "content": f"Extract the text from the 3rd page of {file.id}"},
    ],
)
print(completion.choices[0].message)

Someone mentioned the following code, which works, but I want to understand what is happening under the hood.

from langchain.document_loaders import PyPDFLoader
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

# Locate your PDF here.
pdf = "<YOUR_PDF_GOES_HERE>"
# Load the PDF
loader = PyPDFLoader(pdf)
documents = loader.load()

api_key = "sk-?????"
llm = OpenAI(openai_api_key=api_key)
chain = load_qa_chain(llm, verbose=True)
question = input("Enter your question here : ")
response = chain.run(input_documents=documents, question=question)
print(response)

To do it natively, you'd need to use OpenAI's Assistants API instead of the Chat Completions API. See here.

The files can be used by tools such as Code Interpreter or Retrieval.

Code Interpreter can parse data from files. This is useful when you want to provide a large volume of data to the Assistant or allow your users to upload their own files for analysis.

Similar to Code Interpreter, files can be passed at the Assistant-level or individual Message-level.
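Roughly, the flow looks like this. This is only a sketch with the v1 Python SDK; the model name, polling interval, and the ask_about_pdf / run_is_finished helpers are my own choices, not something from the docs:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def run_is_finished(status: str) -> bool:
    """A run stops making progress once it reaches a terminal status."""
    return status in TERMINAL_STATUSES

def ask_about_pdf(path: str, question: str) -> str:
    # SDK only needed when actually calling the API.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # 1) Upload the file with purpose="assistants" (not "fine-tune").
    file = client.files.create(file=open(path, "rb"), purpose="assistants")

    # 2) Create an assistant with the retrieval tool and attach the file.
    assistant = client.beta.assistants.create(
        model="gpt-4-1106-preview",
        instructions="You answer questions about the attached PDF.",
        tools=[{"type": "retrieval"}],
        file_ids=[file.id],
    )

    # 3) Put the question on a thread and start a run.
    thread = client.beta.threads.create(
        messages=[{"role": "user", "content": question}]
    )
    run = client.beta.threads.runs.create(
        thread_id=thread.id, assistant_id=assistant.id
    )

    # 4) Poll until the run reaches a terminal status.
    while not run_is_finished(run.status):
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id, run_id=run.id
        )

    # Newest message first; the assistant's answer is at the top.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value
```

The polling loop is the part people usually miss: a run can end as "expired" or "failed" as well as "completed", so check the status rather than assuming success.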

In the LangChain example you shared, it's doing Retrieval-Augmented Generation, which is similar to what the Retrieval tool in the Assistants API does. In LangChain, the contents of the PDF files are parsed out and that text is added to the prompt. In the Assistants API, this is handled for you.
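If you want to stay on Chat Completions, you can do the same thing by hand: parse the PDF text yourself and put it in the prompt. A sketch, assuming pypdf is installed and the document fits in the context window; build_messages and the truncation limit are my own illustrative choices:

```python
def build_messages(pdf_text: str, question: str, max_chars: int = 12000):
    """Pack the extracted PDF text into a chat prompt, truncating if needed."""
    context = pdf_text[:max_chars]
    return [
        {"role": "system",
         "content": "Answer using only the document below.\n\n" + context},
        {"role": "user", "content": question},
    ]

def ask_pdf_natively(path: str, question: str) -> str:
    # Imports deferred so build_messages stays usable offline.
    from pypdf import PdfReader
    from openai import OpenAI

    # Extract plain text from every page, like PyPDFLoader does internally.
    text = "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)

    client = OpenAI()
    completion = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=build_messages(text, question),
    )
    return completion.choices[0].message.content
```

That is essentially all the LangChain chain is doing for a small document; for large ones you need chunking and retrieval so only the relevant pieces go in the prompt.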


Alexx, did you ever find a way? I want to do, I think, the same thing: ask an OpenAI API to summarize the content of a report that I have in a PDF/DOCX/etc. file. After reading about its ability to take a file, I had started down the Assistants API path. But after getting into it, I see that this leads to Retrieval, which gives the assistant more domain-specific "knowledge" on which to base its answers, which is NOT what I want. I want a summary of the content of my file. I am now looking at OpenAI Completions, but so far it seems to require me to, e.g., parse the file and pass in paragraphs myself.

Frustrated with not finding an OpenAI API by which I could get a summary of a file, I decided to try the Playground - Assistants. I uploaded a file, which required the gpt-3.5-turbo-1106 model. I then said: "I need a summary of the report contained in file ". It ran for quite a while, going through Create a thread, Run the thread, Run queued, Run in_progress, and then Run expired, using 5894 tokens (5832 in, 62 out). I see no explanation for the expiration, but in the "Response" I do see "tool_calls" of "type": "retrieval". And I don't think that "retrieval" does what I need, i.e., summarize my report.

Retrieval is what you want, per the docs:

Once a file is uploaded and passed to the Assistant, OpenAI will automatically chunk your documents, index and store the embeddings, and implement vector search to retrieve relevant content to answer user queries.

Assistants tools - OpenAI API
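In miniature, the chunk/embed/search pipeline the docs describe looks something like this. These are toy helpers of my own, not OpenAI's actual implementation; in the real pipeline the vectors come from the embeddings endpoint rather than being supplied by hand:

```python
import math

def chunk_text(text: str, size: int = 500, overlap: int = 50):
    """Split text into fixed-size chunks with a small overlap between them."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda cv: cosine_sim(query_vec, cv[1]),
        reverse=True,
    )
    return [c for c, _ in scored[:k]]
```

Only the top-scoring chunks get stuffed into the prompt, which is why Retrieval answers questions about specific passages well but is a poor fit for whole-document summarization: a summary needs all the chunks, not the few most similar to the query.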

If it's expiring, have you tried testing with a smaller document first? It could be that the size of your document is making it slow to embed or otherwise handle.