Best Approach For Analyzing Imagery (Written Paper)

Hi there,

I have an AI Feedback Generator that I would like to provide feedback on image uploads (written text).

What is the best approach + endpoint for this use case?

Should I extract the text from the image, then send the text to the Chat Completions endpoint? Or use the Assistant endpoint to achieve this?

Thank you for your recommendations.

Welcome to the community!

You might try OCR first then send the text to Chat Completion. The vision models are getting a lot better, though, so one of the newer ones might work.

Have you tried anything yet?

If you search the forums, you should find some relevant information.