Custom GPT to load a hosted picture and request a new picture

djsapsan · January 21, 2024, 4:46pm

Hi.
I’m building custom GPT that should be able to comment on pictures and make new pictures on demand.

I’ve tried to solve this in different ways. Currently I’m hosting a small server and sharing a static picture. Everyone can see that picture and I can share the link for everyone if anyone wants(it uses self-signed cert for HTTPS).

My local test API calls and direct link in the browser are working perfectly fine.

On the other hand, GPT refuses to see it or use API in any way.

The same is for API calls from custom GPT actions. My server doesn’t even detect requests.

I can give additional information if needed.
Can someone explain why it’s not working?

_j · January 21, 2024, 4:58pm

The “error browsing” may be further “untrusted” actions or untrusted http document types. You’ll need a verified builder domain, schema, authentication, privacy policy, etc. and the ability to then publish to “everyone” to then make those custom actions.

Then the tool “browsing” is going to scrape page data.

I still shouldn’t expect that ChatGPT vision can “see” the contents of an image; The only way that an image can be perceived is to have it be part of a user message to the GPT-4 model that supports computer vision. There is no other path of loading or analyzing images except on demand of the user (or an API user placing those images into a user role message).

https://platform.openai.com/docs/actions/introduction

djsapsan · January 21, 2024, 5:43pm

I think you are right. I assumed it could see images, as I used GPT to browse other sites, but it was actually only text.
So maybe I can use GPT assistants for this? I doubt that I will use it, as it has additional pricing

djsapsan · January 21, 2024, 9:49pm

@_j , so now I’ve added an API for text. But GPT still can’t see it:

In my browser:

On the server:

My schema is the following:


openapi: 3.0.0
info:
  title: Screenshot API
  description: API to interact with a server that provides a static image and allows taking screenshots.
  version: 1.0.0
servers:
  - url: https://server
    description: Main server hosting the pictures and chat functionality.
paths:
  /chat:
    get:
      operationId: getChat
      summary: Retrieves the latest chat messages.
      description: Returns the latest chat messages at the specified URL.
      responses:
        "200":
          description: Latest chat messages.
          content:
            text:
              schema:
                type: string

Topic		Replies	Views
GPT can't use API with actions GPT builders gpt-4 , api	5	749	February 2, 2024
Problems with a response image from a GPT Action Plugins / Actions builders gpt-4 , chatgpt , api , chatgpt-plugin	1	992	January 23, 2024
How to Get ChatGPT to Describe Images Served by My Server? Plugins / Actions builders gpt-4 , plugin-development , gpts	0	780	December 23, 2023
Actions that return images for inline viewing Plugins / Actions builders chatgpt	0	50	January 3, 2025
How to modify schema of custom GPT action to send an image file with post request? Plugins / Actions builders plugin-development , openapi , chatgpt-plugin , actions	17	6082	February 2, 2024

Custom GPT to load a hosted picture and request a new picture

Related topics