OpenAI API for image text extraction

Hi guys,
Can I use current OpenAI API to upload jpeg or PDF file and extract contextual data in JSON format.
In our case we have scanned purchase bills which need to be parsed into our local database.

Hi and welcome to the Developer Forum!

Sounds like a task for Assistants, you can find more here:

Assistants API does not allow to access the processed file:
“Not allowed to download files of purpose: assistants”

Assistants is an API calling system, you can process data in any form and way you like, what files are you trying to access?

This is the path Im following:

curl …/v1/files
-H “Authorization: Bearer {API_KEY}”
-F purpose=“assistants”
-F file=“@b2.pdf

curl …/v1/assistants
-u :{API_KEY}
-H ‘Content-Type: application/json’
-H ‘OpenAI-Beta: assistants=v1’
-d ‘{
“instructions”: “…”,
“tools”: [{“type”: “code_interpreter”}],
“model”: “gpt-4-1106-preview”,
“file_ids”: [“file-ID”]

curl …/v1/files/FILE-ID/content
-H “Authorization: Bearer {API_KEY}”

“error”: {
“message”: “Not allowed to download files of purpose: assistants”,
“type”: “invalid_request_error”,
“param”: null,
“code”: null

Note that the Assistants API does not currently support image inputs.

You can find more on the OpenAI GPT-4-Vision docs page…

Hope this helps.

Hi Guys,
I upload .docx file than i retrieve with file id both endpoints working fine but when i try retrieve file content the response i am getting is:
Note i upload file with purpose assistants

    "error": {
        "message": "Not allowed to download files of purpose: assistants",
        "type": "invalid_request_error",
        "param": null,
        "code": null