ChatGPT Apps SDK - retrieving files attached in the ChatGPT client from a registered tool

Hello,

I am building a ChatGPT app using the Model Context Protocol SDK for TypeScript.

In my server code I’ve got this function:

private _registerTool_ProcessImage() {
    this._mcpServer.registerTool(
      'process-image',
      {
        title: 'Process Image',
        description: 'Receives and processes an image uploaded by the user',
        inputSchema: {
          description: z.string().optional(),
          imageUrl: z.string().describe('Publicly accessible url to an image'),
        },
        outputSchema: {
          message: z.string(),
        },
      },
      async ({ imageUrl, description }) => {
        try {
          // Fetch the image from the URL
          const response = await fetch(imageUrl);
          if (!response.ok) {
            throw new Error(`Failed to fetch image: ${response.statusText}`);
          }

          const imageBuffer = Buffer.from(await response.arrayBuffer());

          // Create the uploads folder if it doesn't exist
          // (recursive: true makes this a no-op when the folder is already there)
          const uploadDir = path.join(process.cwd(), 'uploads');
          fs.mkdirSync(uploadDir, { recursive: true });

          // Save the image
          const fileName = `image-${Date.now()}.png`;
          const filePath = path.join(uploadDir, fileName);
          fs.writeFileSync(filePath, imageBuffer);

          // Respond to the client
          return {
            content: [
              {
                type: 'text',
                text: `✅ Image saved as ${fileName}. Description: ${description || 'none'}`,
              },
            ],
            structuredContent: {
              message: `✅ Image saved as ${fileName}. Description: ${description || 'none'}`,
            },
          };
        } catch (error: unknown) {
          console.error('Error processing image:', error);
          throw new Error('Failed to process image.');
        }
      },
    );
  }
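
A side note on the handler above: it saves every download as `.png`, whatever the actual image type is. A minimal sketch of picking the extension from the response's `Content-Type` header instead, assuming a hypothetical helper named `extensionForContentType` (my own name, not part of the MCP SDK) and a small hand-picked list of image types:

```typescript
// Hypothetical lookup table (an assumption, not from any SDK):
// common image MIME types and the extension to save them under.
const EXTENSIONS: Record<string, string> = {
  'image/png': 'png',
  'image/jpeg': 'jpg',
  'image/gif': 'gif',
  'image/webp': 'webp',
};

// Hypothetical helper: maps a Content-Type header value to a file
// extension, falling back to 'bin' when the type is missing or unknown.
function extensionForContentType(contentType: string | null): string {
  if (!contentType) {
    return 'bin';
  }
  // Drop parameters such as "; charset=binary" and normalize case.
  const [rawType = ''] = contentType.split(';');
  const mimeType = rawType.trim().toLowerCase();
  return EXTENSIONS[mimeType] ?? 'bin';
}
```

In the handler this would probably be used as `extensionForContentType(response.headers.get('content-type'))` when building `fileName`. It does not solve the upload problem itself, it only avoids mislabeled files.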

This function works, but the user has to provide a publicly accessible URL, which can be a problem for non-technical users.

The ideal scenario would be to let the user upload the image directly in the ChatGPT chat window and then retrieve that file when the tool is invoked, either from the tool input or from the context.

Is there a way of doing that?

Thank you in advance for your responses 🙂