Possible to send screenshot for analysis using action?

justin.collery · November 15, 2023, 9:54pm

I am developing a custom GPT which will perform actions on a user's computer. This could be for data entry, tech support, or many other use cases.

One of the actions I would like to support is taking a screenshot and having GPT-4 Vision analyse it. I have an API endpoint which does this, but when my GPT calls it I get an error. Is this even possible? Here is my action JSON:

{
  "openapi": "3.1.0",
  "info": {
    "title": "Browser Automation API",
    "version": "1.0.0",
    "description": "API for performing various browser operations."
  },
  "servers": [
    {
      "url": "https://abc.com"
    }
  ],
  "paths": {
    "/screenshot": {
      "get": {
        "summary": "Take Screenshot",
        "description": "Takes a screenshot of the current page and returns the image.",
        "operationId": "takeScreenshot",
        "responses": {
          "200": {
            "description": "Screenshot taken successfully",
            "content": {
              "image/png": {}
            }
          },
          "default": {
            "description": "Error",
            "content": {
              "application/json": {
                "schema": {
                  "type": "object",
                  "properties": {
                    "status": {
                      "type": "string"
                    },
                    "message": {
                      "type": "string"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Has anyone managed to do anything similar? The API endpoint works and will display the current screenshot, so that side appears good; it just seems the GPT is not ingesting the image.

(Text output is ingested fine from the same endpoint.)

Thanks and happy gpt’ing!
Justin

weatherley.john · December 3, 2023, 12:01am

Any luck with this? I’m trying to do the same thing.
