How can I retrieve data from a PDF that was created from an image captured by a camera?

Is there a way to retrieve text from a PDF created from a camera-captured image using the Assistants API?

Unfortunately there’s no way to do this at the moment – we don’t parse images in documents yet.


I would recommend some preprocessing: use a library to extract each page, convert it to an image, and feed that image to GPT Vision to get the text. That said, I imagine a local OCR process would work as well and be cheaper.
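For the local OCR route, here is a rough sketch of what I mean. The thread doesn't name any libraries, so `pdf2image` (which needs Poppler installed) and `pytesseract` (which needs the Tesseract binary) are just my assumptions; the function names are made up for illustration, and the `render` and `ocr` parameters are there so either step can be swapped out, for example for a call to a vision model.

```python
from typing import Callable, Optional


def _render_with_pdf2image(pdf_path: str, dpi: int) -> list:
    # Assumption: pdf2image is installed and Poppler is on the PATH.
    # Imported lazily so this module still loads without the dependency.
    from pdf2image import convert_from_path

    # Returns one PIL image per PDF page.
    return convert_from_path(pdf_path, dpi=dpi)


def _ocr_with_tesseract(image) -> str:
    # Assumption: pytesseract is installed and the Tesseract binary is available.
    import pytesseract

    return pytesseract.image_to_string(image)


def pdf_to_text(
    pdf_path: str,
    dpi: int = 300,
    render: Optional[Callable[[str, int], list]] = None,
    ocr: Optional[Callable[[object], str]] = None,
) -> str:
    """Render each page of a scanned PDF to an image and OCR it.

    `render` and `ocr` are hypothetical hooks: by default they use
    pdf2image and pytesseract, but any callables with the same shape work.
    """
    render = render or _render_with_pdf2image
    ocr = ocr or _ocr_with_tesseract

    pages = render(pdf_path, dpi)
    # Strip surrounding whitespace from each page and separate pages
    # with a blank line.
    texts = [ocr(page).strip() for page in pages]
    return "\n\n".join(texts)
```

Calling `pdf_to_text("scan.pdf")` would then return the recognized text of every page, which you can pass to the Assistants API as plain text instead of uploading the image-only PDF. A higher `dpi` usually helps OCR accuracy on camera captures, at the cost of speed.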


Until recently, I could upload images to Poe using the GPT-4 model and get responses, but as of yesterday this is no longer possible.
The same thing happens when using the API via Open WebUI.