Knowledge base - Image handling

Hi. What is the best way to upload multiple image files to a custom GPT and have the GPT retrieve them?

I have tried the following but with very little success:

  • All images are JPEGs, and the file names are labelled with what each image represents, e.g. “Kitchen Fridge”
  • Images were zipped into one file and uploaded into the GPT (however, the zipped file disappears every time after a successful upload)
  • I then created a Word doc with a table of the various images, and labelled and described each image
  • Then I asked the GPT to retrieve an image from the file, and it mostly gets it wrong or just retrieves all the images
  • It seems to use a Matplotlib chart to retrieve the images

Any guidance on the best way to upload multiple images into a custom GPT and how to retrieve the images?
Thanks

What is it specifically you are looking to achieve?

There should not be a problem with the upload, but I am not sure I fully understand what you mean by “retrieve” in this context. Can you elaborate on the purpose of your GPT and the role of the images?

Also, if you could share instructions that you use and example prompts, that would be helpful.

Images cannot be used as a source of knowledge in a GPT, nor uploaded when you are creating it for continuous, wasteful vision tasks.

Vision can only be used on files attached to a user message in the course of using the chatbot.

If there was something an image said, it would be better said as text.

Hm. I was able to upload pictures into the knowledge base and prompt the custom GPT to return a description of the images in the role of the GPT user (note: this only worked when code interpreter was enabled).

I am building a custom GPT to act as our ‘Home Knowledge Base and Advisor’ - essentially I have provided it with various documents where I kept a record of maintenance or new installations that were done in our house, as well as landscaping improvements. We have a large garden, so I also provided it with names and descriptions of various plants/trees. I also want to upload pictures of the various home appliances, plants, trees, etc., and anything house-related that is interesting.

Amongst other things I want the GPT to be able to retrieve and show me images of whatever I uploaded e.g. I will ask it:

  1. what model fridge do we have in the back kitchen?
  2. this fridge is not dispensing ice, search for a user manual and help me troubleshoot it
  3. now show me a picture of the fridge from the uploaded images

It responds to questions 1 and 2 very well, but cannot retrieve a simple image from its uploaded docs.

I uploaded the images as part of a Word doc and labelled them within the Word doc. However, the GPT struggles to correctly identify an image and then just retrieves all images.
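For what it's worth, the labelling could also be kept as a plain-text manifest rather than a Word table, which is easier to search with code. A minimal sketch, assuming a hypothetical CSV with `file` and `description` columns; the file names and descriptions below are invented for illustration and nothing here is confirmed GPT behaviour:

```python
import csv
import io

# Hypothetical manifest contents; in practice this would be a CSV file
# kept alongside the images. All names and descriptions are made up.
MANIFEST_CSV = """file,description
kitchen_fridge.jpg,Fridge in the back kitchen
garden_oak.jpg,Oak tree near the garden gate
laundry_washer.jpg,Washing machine in the laundry room
"""

def load_manifest(text):
    """Parse the CSV text into a list of {'file': ..., 'description': ...} rows."""
    return list(csv.DictReader(io.StringIO(text)))

def find_images(manifest, query):
    """Return file names whose description contains every word of the query."""
    words = query.lower().split()
    return [
        row["file"]
        for row in manifest
        if all(word in row["description"].lower() for word in words)
    ]

manifest = load_manifest(MANIFEST_CSV)
print(find_images(manifest, "fridge"))
```

Because every query word must appear in the description, a vague request returns nothing instead of every image, which is the failure described above.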

I am assuming you are using the ChatGPT client and not via OpenAI?

OpenAI has an assistant feature which allows document uploads. You can configure it however you want with prompts to perform specific tasks etc.

This one indexes your uploads in a vector store and manages documents really well.
It costs money though; usage is billed in tokens.

Sorry I’m new to this so not sure if I understand you. I have a ChatGPT Plus account (paid) and am using it to build my own custom GPTs where I can provide custom instructions, load docs, etc. It works very well, except I haven’t figured out how to make it handle multiple uploaded images well.

Not sure what you mean by “via OpenAI”?

I believe they’re talking about the API… see (quickstart…) Basically, you build your own “ChatGPT”… It’s a lot more work, but it gives you more control too.

Have you turned code interpreter on in your Custom GPT when you set it up?
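
With code interpreter on, the GPT can run Python against the uploaded files, so a request like "show me the fridge" comes down to picking the right file by name. A rough sketch of that step, assuming the JPEGs are uploaded individually with descriptive names such as "Kitchen Fridge.jpg"; the function name and the 0.4 cutoff are my own choices, not anything the GPT is known to do:

```python
import difflib
from pathlib import Path

def best_image_match(query, image_dir):
    """Return the path of the JPEG whose file name best matches the query, or None."""
    # Map a normalised label ("kitchen fridge") to the file it came from.
    candidates = {
        p.stem.lower().replace("_", " "): p
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg"}
    }
    # Fuzzy-match the request against the labels; the cutoff is an arbitrary guess.
    hits = difflib.get_close_matches(query.lower(), list(candidates), n=1, cutoff=0.4)
    return candidates[hits[0]] if hits else None
```

Calling `best_image_match("kitchen fridge", image_dir)` returns a single path or `None`, never the whole set. Where the uploaded files actually live inside the sandbox (often reported as `/mnt/data`) is an assumption you would need to check.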