Hi all,
I would like to know if, when I upload a file or PDF containing both text and i
Thanks in advance!mages, OpenAI’s vector store creates embeddings for the images as well, or just the text? Later, I plan to use this assistant with a file search tool and GPT-4, which has vision capabilities.
Hi @sekharmuni003 ! I believe so - there is a vector representation of the image, and an image to OpenAI models is really just a sequence of tokens.
1 Like
sorry, my post is not clearly formated.
any way what i want to know is:
i want to know if i upload a file or pdf with text and images in it, will the openai’s vector store creates embeddings to images as well?
because later i want this assiatant to use file search tool and query with images too.
1 Like
No worries @sekharmuni003 !
So the docs state that there is no image parsing under the hood right now, but “it will be coming soon”.
I am not sure if this will be helpful to you, but what we’ve done in the past is actually have a separate job where we:
- Convert PDF pages to images
- Send the images to Vision API as base64 encodings together with a comprehensive system prompt where we essentially extract certain types of information (e.g. charts and tables converted to Yaml representations)
- Upload those together with raw PDF into Assistants API
1 Like
again, very thankful for your input @platypus !
2 Likes