I am testing some OCR image-to-text parsing with GPT-4o using both the ChatGPT UI and the OpenAI `chat.completions.create` endpoint. I have a few questions I would like to get your help and input on.
- I am trying to understand why the ChatGPT UI performs much better at extracting information from images correctly. When I use the same GPT-4o model through the OpenAI `chat.completions.create` endpoint, I encounter many errors and random pieces of information that are not present in the image. (A simplified version of the call I am making is sketched below.)
- My assumption is that this discrepancy is related to the parameters of `chat.completions.create`, such as `frequency_penalty`, `temperature`, `top_p`, and `max_tokens`. Is there documentation describing the settings the ChatGPT UI uses when it calls GPT-4o? (The second sketch below shows the kind of values I have been experimenting with.)
- Is GPT-4o the right pick for OCR-type tasks?
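
For context, here is a minimal sketch of how I am sending images to the endpoint. The prompt, file path, and `max_tokens` value are placeholders rather than my exact values:

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Read the image and embed it as a base64 data URL (path is a placeholder)
with open("sample_invoice.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all text from this image verbatim."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)
```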
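
And this is the kind of parameter combination I have been trying to make the extraction more deterministic; these values are my own guesses, not documented ChatGPT defaults:

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,        # same text + image payload as above
    temperature=0,            # minimize sampling randomness
    top_p=1,
    frequency_penalty=0,      # penalties target repetitive prose, so left off for extraction
    presence_penalty=0,
    max_tokens=2048,          # headroom so longer extractions are not truncated
)
```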