Problem uploading scanned PDFs to the OpenAI API

Hello OpenAI team,

I’m encountering an issue when using the file upload API at https://api.openai.com/v1/files. I’m uploading a PDF file that contains a scanned invoice (i.e., the content is a single embedded image), and then I query its content using the completions or chat/completions endpoint.
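
For reference, here is a minimal sketch of the upload step described above, using only the Python standard library. It is illustrative rather than the exact code in use: the helper name `build_upload_request`, the `purpose` value `"user_data"`, and the file name are assumptions. The function only builds the multipart request and does not send it.

```python
import io
import uuid
import urllib.request

FILES_URL = "https://api.openai.com/v1/files"  # endpoint named in this post


def build_upload_request(pdf_bytes, filename, api_key, purpose="user_data"):
    """Build (but do not send) a multipart/form-data POST for the files endpoint.

    `purpose="user_data"` is an assumption for illustration; use whatever
    purpose your workflow requires.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Form field: purpose
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="purpose"\r\n\r\n')
    body.write(purpose.encode() + b"\r\n")

    # Form field: file (the raw PDF bytes, here a scanned invoice)
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/pdf\r\n\r\n")
    body.write(pdf_bytes + b"\r\n")

    # Closing boundary
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        FILES_URL,
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would perform the actual upload. The file ID in the response is what I then reference when querying the model.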

The problem is that the model does not seem to interpret the visual content of the PDF correctly. Instead of extracting meaningful information from the image (such as invoice number, date, or total), it only retrieves metadata or digital signature details from the PDF.

I understand that the API is designed to process embedded text within a PDF, but I would like to confirm whether this behavior is expected when the content is a scanned image. If so, should I run OCR on the file before uploading it? Or is automatic OCR support planned for a future release?

Thank you for your attention and for the excellent work you’re doing.

Best regards,