GPT4 OCR/Image Recognition

Dear Reader, for a few weeks now it has been possible to use GPT-4V (vision). It looks like an alternative to OCR, or maybe it is OCR. Is it possible to use this functionality through my GPT-4 API access?

Unfortunately we haven’t heard anything regarding vision on the API.

I’d be super interested to see how they will handle billing.

That said, I don’t think it will be a drop-in replacement for OCR at this point. These LMMs are pretty inattentive readers, although that could maybe be overcome agentically, for example by having the model re-read and check its own output.
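To make the "agentic" idea a bit more concrete, here is a rough sketch of one way it might look: read the same page several times and keep, line by line, whatever most passes agree on. Everything here is an assumption for illustration. `read_page` is a hypothetical callable standing in for whatever model call you end up with (there is no vision API to point at yet), and `consensus_read` is just a name I made up.

```python
from collections import Counter
from typing import Callable, List


def consensus_read(read_page: Callable[[], str], passes: int = 3) -> List[str]:
    """Read the same page several times and vote line by line.

    `read_page` is a hypothetical stand-in for a model-backed reader that
    returns the page text as a single string. Nothing here calls a real API.
    """
    if passes < 1:
        raise ValueError("passes must be at least 1")

    # Collect one transcript (a list of lines) per pass.
    transcripts = [read_page().splitlines() for _ in range(passes)]
    longest = max(len(t) for t in transcripts)

    result = []
    for i in range(longest):
        # Only passes that produced a line at this position get a vote.
        candidates = [t[i].strip() for t in transcripts if i < len(t)]
        # Keep the most common reading; ties go to the earliest pass.
        winner, _count = Counter(candidates).most_common(1)[0]
        result.append(winner)
    return result
```

This only helps with random slips, not with errors the model makes consistently, and it assumes the passes line up line for line. It also multiplies the cost by the number of passes, which matters given the billing question above.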

GPT-4 has been trained to refuse outright OCR requests, although you can see its skill in reading when you work around this. However, there is OCR software that doesn’t take a quarter-million-dollar stack of servers to save you typing in a page.