Using the OpenAI Vision API to analyze images doesn't work!

DAYS I’ve spent trying to figure this out. WTF am I doing wrong? I just want to use the API to have OpenAI’s vision capability analyze an image I upload or take and tell me what’s in it, like I can in GPT-4 Turbo chat by just uploading a freaking image.
No endpoint works. No code that I can come up with, or that any AI can come up with, works! It’s not documented ANYWHERE. I DON’T want to just analyze text in an image. Chat/completions my ass. Messages? Pfffft! F***ing ChatGPT-4-turbo is out of date and has no freaking clue what to tell me. I dare you to find an endpoint that works. Every AI out there hasn’t a clue. I’ve been running in circles upon circles upon circles!
HELP!!
I should be able to just take a photo, upload it via the API (converted to base64), and get a response. But nada.

PS: I need solutions in Swift because I’m using Xcode. Ty!
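
For reference, here’s roughly what I’ve been attempting, pieced together from the vision guide: a chat/completions request with the photo sent as a base64 data URL. The model name and payload shape are my best guesses from the docs, so treat this as a sketch of my attempt, not working sample code:

```swift
import Foundation

// Rough sketch of what I've been attempting: a chat/completions request with
// the photo sent as a base64 data URL, per the vision guide. The model name
// ("gpt-4-turbo") and payload shape are my guesses from the docs.
func describeImage(_ imageData: Data, apiKey: String) async throws -> String {
    let base64 = imageData.base64EncodedString()

    // Message content mixes a text part and an image part.
    let textPart: [String: Any] = [
        "type": "text",
        "text": "What's in this image?"
    ]
    let imagePart: [String: Any] = [
        "type": "image_url",
        "image_url": ["url": "data:image/jpeg;base64,\(base64)"]
    ]
    let userMessage: [String: Any] = ["role": "user", "content": [textPart, imagePart]]
    let body: [String: Any] = [
        "model": "gpt-4-turbo",
        "messages": [userMessage],
        "max_tokens": 300
    ]

    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, _) = try await URLSession.shared.data(for: request)

    // Pull out choices[0].message.content; real error handling omitted.
    guard
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any],
        let choices = json["choices"] as? [[String: Any]],
        let message = choices.first?["message"] as? [String: Any],
        let content = message["content"] as? String
    else {
        throw URLError(.cannotParseResponse)
    }
    return content
}
```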

No, you shouldn’t.

Please read the documentation: https://platform.openai.com/docs/guides/vision

What does this say? Do you interpret this differently?
“Creating image input content.
Message content can contain either external image URLs or File IDs uploaded via the File API. Only models with Vision support can accept image input. Supported image content types include png, jpg, gif, and webp. When creating image files, pass purpose='vision' to allow you to later download and display the input content. Currently, there is a 100GB limit per organization and 10GB for user in organization”
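
Concretely, that passage boils down to: upload the image through the Files API with purpose set to "vision", get back a file ID, then reference that ID as image content in a message to a vision-capable model. Since you asked for Swift, here is a minimal sketch of the upload step; the field names come from the Files API reference and the image_file shape from the Assistants message docs, so verify them against the current documentation rather than taking this as official sample code:

```swift
import Foundation

// Sketch of the quoted flow: upload an image via the Files API with
// purpose "vision", get back a file ID, then reference that ID as image
// content in an Assistants message. Field names per the Files API reference;
// verify against the current docs.
func uploadImageForVision(_ imageData: Data, filename: String, apiKey: String) async throws -> String {
    let boundary = "Boundary-\(UUID().uuidString)"
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/files")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

    // Multipart body: a "purpose" field set to "vision", plus the image file itself.
    var body = Data()
    body.append("--\(boundary)\r\n".data(using: .utf8)!)
    body.append("Content-Disposition: form-data; name=\"purpose\"\r\n\r\n".data(using: .utf8)!)
    body.append("vision\r\n".data(using: .utf8)!)
    body.append("--\(boundary)\r\n".data(using: .utf8)!)
    body.append("Content-Disposition: form-data; name=\"file\"; filename=\"\(filename)\"\r\n".data(using: .utf8)!)
    body.append("Content-Type: image/png\r\n\r\n".data(using: .utf8)!)
    body.append(imageData)
    body.append("\r\n--\(boundary)--\r\n".data(using: .utf8)!)
    request.httpBody = body

    let (data, _) = try await URLSession.shared.data(for: request)
    guard
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any],
        let fileID = json["id"] as? String
    else {
        throw URLError(.cannotParseResponse)
    }

    // The returned ID is what goes into the Assistants message content,
    // roughly: { "type": "image_file", "image_file": { "file_id": fileID } }
    return fileID
}
```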

What does this tell you? It says you have to stay on top of features that haven’t had a big announcement yet and that require reading the documentation as a whole.

Especially when citing documentation that was committed just 13 hours ago.

You don’t even mention that you are trying to use the Assistants API, where 24 hours ago the documentation would have said “vision is not supported”.
