Unknown Error Occurred when uploading PDF

If you ever get stuck with PDFs, there’s always the brute-force OCR route to extract data from them. The zip file trick and the direct upload to ChatGPT didn’t work for me either.

Here is brute-force OCR code to get the contents:

import os

import pytesseract
from pdf2image import convert_from_path

# Needs the Poppler and Tesseract binaries installed, e.g. on a Mac:
#   brew install poppler tesseract

# Function to extract text from PDFs using OCR
def extract_text_with_ocr(pdf_folder):
    ocr_texts = {}
    for file_name in os.listdir(pdf_folder):
        # Match the extension case-insensitively so "SCAN.PDF" is picked up too
        if file_name.lower().endswith(".pdf"):
            file_path = os.path.join(pdf_folder, file_name)
            # Convert PDF to images
            images = convert_from_path(file_path)
            text = ""
            for image in images:
                # Perform OCR on each page
                text += pytesseract.image_to_string(image)
            ocr_texts[file_name] = text
    return ocr_texts

# Extract text from the PDFs using OCR
attachments_path = "/path/to/your/pdfs"
ocr_texts = extract_text_with_ocr(attachments_path)

# Display a summary of the extracted texts for review
ocr_texts_summary = {file_name: text[:500] for file_name, text in ocr_texts.items()}
print(ocr_texts_summary)
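
If the goal is to give ChatGPT the text instead of the PDF, one possible follow-up is to write each entry of `ocr_texts` to a plain `.txt` file and upload that. This is only a sketch: `save_texts` and the output folder are names I made up for illustration, not part of the code above.

```python
from pathlib import Path


def save_texts(ocr_texts, out_dir):
    """Write each extracted text to <out_dir>/<pdf name>.txt and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for file_name, text in ocr_texts.items():
        # "report.pdf" becomes "report.txt"
        target = out_path / (Path(file_name).stem + ".txt")
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

Calling `save_texts(ocr_texts, "/path/to/your/txt")` after the OCR step should leave one text file per PDF in that folder.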

On a Mac, I managed to solve the issue by opening the PDF in Preview, and then print it as PDF. In my case a PDF file was generated that ChatGPT accepted without complaint.

Don’t know why doing this actually fixes the issue, but it works!
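
For anyone who wants to script something similar, here is a rough sketch that rewrites a PDF with Ghostscript's `pdfwrite` device. To be clear, this is a swapped-in technique and not the same thing as Preview's print dialog; whether the rewritten file gets past the ChatGPT uploader is untested. It assumes Ghostscript is installed and on the PATH as `gs` (on Windows the binary is usually `gswin64c`), and the function names are hypothetical.

```python
import shutil
import subprocess


def build_rewrite_command(src, dst, gs_binary="gs"):
    """Return the Ghostscript command that rewrites src into a new PDF at dst."""
    return [gs_binary, "-o", str(dst), "-sDEVICE=pdfwrite", str(src)]


def rewrite_pdf(src, dst, gs_binary="gs"):
    """Run Ghostscript to regenerate the PDF; raises if gs is missing or fails."""
    if shutil.which(gs_binary) is None:
        raise FileNotFoundError(f"{gs_binary} not found on PATH")
    subprocess.run(build_rewrite_command(src, dst, gs_binary), check=True)
```

Usage would be something like `rewrite_pdf("scan.pdf", "scan_rewritten.pdf")`, then trying the upload with the new file.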

Same issue, plus PDFs that appear to be successfully uploaded to a Custom GPT are gone at the next session. All PDFs had been through OCR with Adobe before I tried to load them into the GPT. I tried uploading them through all 3 windows, with the same effect. I told it to OCR each new PDF when uploading them. That way I can get the GPT to process all files correctly within the session, and at the end I can see them all correctly. But once I shut down and come back in, only a random selection of them remains in my knowledge base. All are of similar size and produced in the same way, with no clear pattern as to which ones remain and which disappear.

Thank you, this worked on my Mac M1.

I’ve tried taking the content of the PDF, pasting it into Google Docs, and saving it as a .docx. That also fails.

I’m pretty dumbfounded tbh.
I’ve not been able to do some of the ideas suggested here for different reasons.

In case it helps anybody: I had the same issue on my laptop. I couldn’t upload a 120 KB PDF; after about 70% of the file had loaded, it would fail with “unknown error occurred”.

I saved the pdf on my phone, opened the ChatGPT app and uploaded the file from my phone into a chat with my question. Then, I was able to follow up on that conversation from my laptop.

I have had the same issue in the last few days. Worked perfectly with Claude.

I’ve also been having the same issue, and the fixes above did not work for me.

Same problem, and it does not depend on the size of the file. Printing to PDF or converting to .docx did not help.

This worked for me! Thank you!

This worked for me. I had a scanned PDF from a scanner that would show the unknown error.
I printed the PDF with Microsoft Print to PDF, and then it worked.

Thank you very much, this is working :clap:

Thanks, it works for me! My PDF was a scanned document, so it contains images. It’s unfortunate that ChatGPT can’t process non-OCR PDFs.
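
If you want a quick guess at whether a PDF is an image-only scan before uploading, a crude byte-level check like the one below might help. It is purely a heuristic of my own, not anything ChatGPT documents: it looks for image objects and font references in the raw file, and it can misjudge PDFs that keep their font dictionaries inside compressed object streams. `looks_image_only` is a made-up name.

```python
import re
from pathlib import Path


def looks_image_only(pdf_bytes):
    """Heuristic: True if the raw bytes mention images but no fonts at all."""
    has_font = b"/Font" in pdf_bytes
    has_image = re.search(rb"/Subtype\s*/Image", pdf_bytes) is not None
    return has_image and not has_font


def file_looks_image_only(path):
    """Apply the same heuristic to a file on disk."""
    return looks_image_only(Path(path).read_bytes())
```

A `True` result suggests the file has no text layer, so OCR or re-printing it may be worth trying first.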

yep. printscreen works well too.

Still happening today, 10 Feb 2025.

I’m still having this issue. It’s driving me nuts. I’ve tried generating a new PDF and saving it. I’ve run it through a compression program as well. Nothing seems to work.

The same happened to me today, regardless of the model I was using (I tried switching between the models, and all of them produced the error). My PDF was just 3 pages long, weighed about 1 MB, and was also a scanned copy from a physical scanner. I also managed to fix the error by re-“printing” the PDF with the Microsoft Print to PDF feature. ChatGPT accepted the new version without any problems. Thanks to everyone on this thread for the insights!
