Chat Completions API: attaching a PDF

How can I send a PDF document as an attachment together with a prompt using the OpenAI API? More generally, how do I handle files with the OpenAI API?

Hey, did you find a way around this?

There are a couple of different options:

If you are just looking to use the “basic” API, then you would first need to programmatically extract the text from the file (e.g. in Python with a library such as PyPDF2 or PDFMiner) and then add all or part of that text to your prompt, e.g. as part of the user message.
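A minimal sketch of this first option, assuming the pypdf package (the successor to PyPDF2) is installed. The helper names, the `max_chars` limit, and the `client` argument (taken to be an `openai.OpenAI()` instance) are illustrative choices, not something from this thread:

```python
from typing import Any, Dict, List


def extract_pdf_text(file_path: str) -> str:
    """Return the text of every page, separated by blank lines."""
    # Imported lazily so build_messages() works without pypdf installed.
    from pypdf import PdfReader  # assumption: pip install pypdf

    reader = PdfReader(file_path)
    # extract_text() can come back empty for image-only pages.
    return "\n\n".join(page.extract_text() or "" for page in reader.pages)


def build_messages(instructions: str, question: str, document_text: str,
                   max_chars: int = 20000) -> List[Dict[str, str]]:
    """Build Chat Completions messages that carry the (truncated) document text."""
    excerpt = document_text[:max_chars]  # crude character-based limit
    return [
        {"role": "system", "content": instructions},
        {"role": "user", "content": f"{question}\n\nDocument text:\n{excerpt}"},
    ]


def ask_about_pdf(client: Any, file_path: str, question: str) -> str:
    """Send the extracted text to the model; `client` is assumed to be openai.OpenAI()."""
    messages = build_messages("You are a helpful assistant.", question,
                              extract_pdf_text(file_path))
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content
```

Note that this only carries over the text layer of the PDF. Any images in the file are dropped, so it will not help with content that exists only as pictures.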

Alternatively, you might want to look into OpenAI’s Assistants API, which allows you to submit files directly. You can read more about Assistants here and about the file search capability specifically here.
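For the Assistants route, a rough sketch could look like the following. It assumes the beta Assistants endpoints (v2, with the `file_search` tool) as exposed by the `openai` Python package. The function names and the instruction text are made up for illustration, and `client` is assumed to be an `openai.OpenAI()` instance, so check the calls against the current docs before relying on them:

```python
from typing import Any, Dict


def build_user_message(question: str, file_id: str) -> Dict[str, Any]:
    """Thread message that attaches an uploaded file for the file_search tool."""
    return {
        "role": "user",
        "content": question,
        "attachments": [{"file_id": file_id, "tools": [{"type": "file_search"}]}],
    }


def ask_about_file(client: Any, pdf_path: str, question: str,
                   model: str = "gpt-4o") -> str:
    """Upload a PDF, run an assistant with file search, and return its reply."""
    with open(pdf_path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="assistants")

    assistant = client.beta.assistants.create(
        model=model,
        instructions="Answer questions using the attached document.",
        tools=[{"type": "file_search"}],
    )
    thread = client.beta.threads.create(
        messages=[build_user_message(question, uploaded.id)]
    )
    # Blocks until the run reaches a terminal state.
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status}")

    # Messages are listed newest first, so the first one is the assistant's reply.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value
```

With this approach the file is uploaded once and the text is chunked and searched on OpenAI's side, so you do not have to fit the whole document into the prompt yourself.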

Hi @jr.2509,

I want to upload a PDF file containing images, text and source code, together with a prompt.

I am reading a PDF file from Streamlit and then converting it to base64, but I am not sure whether the images are being read as well.

import base64


def convert_pdf_to_base64(file_path):
    """
    Converts a file to base64 encoding.

    Args:
        file_path (str): The path to the file.

    Returns:
        str: The base64 encoded string of the file.
    """
    with open(file_path, "rb") as pdf_file:
        # b64encode returns bytes; decode so the result is a str as documented
        encoded_string = base64.b64encode(pdf_file.read()).decode("ascii")
    return encoded_string

from pathlib import Path

import streamlit as st

# `client` (the OpenAI client), `prompt` and `submitted` are defined elsewhere in the app
uploaded_file = st.file_uploader("Choose a pdf/doc file", type=["pdf"])

if uploaded_file is not None:
    # save the uploaded pdf to disk so it can be read back and encoded
    save_path = Path(uploaded_file.name)
    with open(save_path, mode="wb") as w:
        w.write(uploaded_file.getvalue())

    if submitted:
        with st.spinner("Please wait..."):
            pdf_data = convert_pdf_to_base64(save_path)
            base_prompt = "generate test scripts & automation script for given data"
            chat_completion = client.chat.completions.create(
                messages=[
                    {"role": "system", "content": base_prompt},
                    {
                        "role": "user",
                        # the slice truncates the message to 100000 characters
                        "content": (
                            f"{prompt}\nBase64 representation of pdf: {pdf_data}"[:100000]
                        ),
                    },
                ],
                model="gpt-4o",
            )
        print(chat_completion.choices[0].message.content)

Also, the request takes around 100,000 tokens, so I am truncating the input, but that’s a big disadvantage for me.

Is there a better way to do this?

The model won’t be able to process the PDF’s content if you provide it as base64-encoded data.

It’s generally possible to provide the model with base64-encoded images, but the same does not apply to text.
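To illustrate the image case, here is a small sketch of how a base64-encoded image can be passed as a content part of a user message in Chat Completions. The helper name and the default MIME type are assumptions made for the example:

```python
import base64
from typing import Any, Dict


def build_image_message(question: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> Dict[str, Any]:
    """Build a user message with a text part and a base64 data-URL image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The resulting dict goes into the `messages` list of `client.chat.completions.create(...)` with a vision-capable model such as gpt-4o. For a PDF, each image (or page) would have to be exported as an image file first, which is not shown here.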

Thanks for the info, @jr.2509.
Is there any solution for uploading a PDF that contains both text and images?