ChatGPT 4o API for Sending Both PDFs and Images

I am trying to write an app that can send both image and PDF attachments to ChatGPT 4o.

I am aware that the OpenAI Assistants API can read PDFs with ChatGPT 4o when the file is attached by file ID. I am also aware that the regular Chat Completions API can read images passed as image_url.

The annoying part is that the Assistants API doesn’t support images, while sending a PDF as image_url through the Chat Completions API fails with “Invalid MIME type. Only image types are supported.”

Do I really have to check the file extension first to decide whether to use Assistants or Chat Completions?

Welcome to the community!

What makes you say that?

https://platform.openai.com/docs/api-reference/messages/createMessage

Thank you, Diet.

I see that Assistants thread message creation has this image_url feature, but it doesn’t work quite the same way as in the Chat Completions API: Chat Completions can accept a local image through image_url, whereas the thread message only accepts an external image_url.
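For reference, the "local image" case with Chat Completions is usually done by embedding the file as a base64 data URL in the image_url field. Below is a minimal sketch of that step, assuming the file extension is enough to guess the MIME type; the helper names `local_image_to_data_url` and `image_part` are made up for illustration and are not part of any SDK.

```python
import base64
import mimetypes
from pathlib import Path


def local_image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    # Assumption: the extension is a good enough hint for the MIME type.
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not an image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(path: str) -> dict:
    """Build one image_url content part for a Chat Completions user message."""
    return {"type": "image_url", "image_url": {"url": local_image_to_data_url(path)}}
```

The returned dict would go into the `content` list of a user message, next to a `{"type": "text", ...}` part.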

I guess I could first check whether the file is a PDF; if it is not, I would upload the image to an image hosting service and pass the resulting image_url to the thread message.

Is that the best approach for now?

The Assistants API supports file upload of images first to your API storage, and then you can attach an image to a user message placed into a thread by using the file ID received.

That method is clearly depicted in the image of the expanded API reference above.
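A rough sketch of that upload-then-attach flow with the official Python SDK, in case it helps. It assumes `client` is an `openai.OpenAI()` instance and that a thread already exists; the function name `attach_uploaded_image` is just a placeholder for illustration.

```python
def attach_uploaded_image(client, thread_id: str, image_path: str, prompt: str):
    """Upload a local image, then reference it by file ID in a thread message.

    Assumption: `client` is an `openai.OpenAI()` instance (v1 SDK).
    """
    # Step 1: upload the image to API file storage with the "vision" purpose.
    with open(image_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="vision")

    # Step 2: add a user message whose content references the uploaded file.
    return client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=[
            {"type": "text", "text": prompt},
            {"type": "image_file", "image_file": {"file_id": uploaded.id}},
        ],
    )
```

After this, a run still has to be started on the thread to get a reply from the assistant.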

This will not currently work though, because OpenAI broke the API and has not rectified the issue for a week.

I see that their API is giving me the error message “Sorry, something went wrong.” Because of that, I thought Assistants threads did not have image functionality yet.
Thank you for clarifying this!

Images with Assistants still don’t work, in case anyone was wondering.

Has anyone gotten this to work yet? Uploading the image to a hosting service did not work for me, since it needs to be a trusted public image source like Wikimedia Commons, BBC, etc.

Sorry for the delay, I just noticed your question. I am using Chat Completions for images and an assistant for PDFs, and then passing their answers to my main assistant. I connect them using function calls, and sometimes call the PDF assistant or the image Chat Completions request directly when I receive a file rather than a URL. For context, I am doing this for WhatsApp and Telegram chatbots.
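The routing step can be quite small. Here is a minimal sketch, assuming the filename extension is trustworthy enough to pick a backend; `pick_backend` and the two return labels are hypothetical names, not anything from the OpenAI SDK.

```python
import mimetypes


def pick_backend(filename: str) -> str:
    """Decide which API path should handle an incoming attachment."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is not None and mime.startswith("image/"):
        # Images go to Chat Completions as image_url content.
        return "chat_completions"
    if mime == "application/pdf":
        # PDFs go to an assistant that reads uploaded files.
        return "assistant"
    raise ValueError(f"unsupported attachment type: {filename}")
```

For example, `pick_backend("photo.jpg")` returns `"chat_completions"` and `pick_backend("report.pdf")` returns `"assistant"`. For files coming from chat platforms, checking the MIME type reported by the platform may be more reliable than the extension.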

It works!
Use type = image_url or image_file.

https://platform.openai.com/docs/api-reference/messages/createMessage
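To illustrate the two types, here is a small sketch of how the content part of a thread message might be built for each case. It assumes anything starting with http:// or https:// is an external image URL and anything else is a file ID returned by a previous upload; `image_content_part` is a hypothetical helper name.

```python
def image_content_part(source: str) -> dict:
    """Build an image content part for an Assistants thread message.

    Assumption: `source` is either an external image URL or an uploaded file ID.
    """
    if source.startswith(("http://", "https://")):
        # External image: referenced by URL.
        return {"type": "image_url", "image_url": {"url": source}}
    # Uploaded image: referenced by its file ID.
    return {"type": "image_file", "image_file": {"file_id": source}}
```

The resulting dict is placed in the `content` list of the message, alongside a `{"type": "text", "text": "..."}` part carrying the question.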

Hi there,

Can you give a code example of how to make it work? I am still learning and am having difficulty uploading an image.

Thank you