I wanted to extract information from invoice using GPT-4o, which can be image or PDF

I have an API that accepts image or PDF invoice and extract the information from it and respond in json format.

In my previous implementation i’ve used Azure document processors (invoice) for extraction and Open AI API for customizing the response.

but now i wanted to switch fully to openAI api.

is there an API that can support this?

2 Likes

I’ve found this File uploads FAQ | OpenAI Help Center article, saying the API version for file upload will be available soon.

the article is posted a week ago, but please feel free to share if there is any latest news about it.

Welcome @bisratx

If the goal is to simply extract info from a pdf invoice, you can do it with chat completions API using vision capability.

Just convert the uploaded PDF doc’s pages into image files with supported format and consume them over the vision modality to extract info you want.

1 Like

I am sorry, but is that an assumption or did you run a 600 page AWS invoice through it and got all the right values?

1 Like

converting the pdf to an image may add load time to the API which will make the time the same as the first approach