Dear OpenAI Team,
First of all, thank you for the amazing product you’ve built with ChatGPT. As a daily user, I truly appreciate its capabilities, including the support for image input, which is an incredibly powerful feature.
However, I’d like to propose a new feature that I believe would significantly enhance the usability and user experience of image-based interactions: the ability to annotate images directly within the ChatGPT interface before sending them.
Problem:
Currently, if I want to highlight or annotate something on an image (e.g., draw a circle or an arrow, or erase part of it), I have to do this in a third-party tool before uploading the image. This adds friction and slows down the workflow, especially when I just need to quickly point something out visually for ChatGPT to analyze.
Proposed Feature:
Add a basic image annotation layer in the ChatGPT interface (for example, in the web version), allowing users to draw directly on top of the image before uploading it. This could include:
- Freehand drawing
- Arrows or lines
- Circles/rectangles
- Basic text markers
Implementation Suggestion:
To avoid interfering with the original image processing pipeline, the system could treat user annotations as a separate metadata layer (a marker map):
- The original image remains unchanged and is sent as-is.
- A second data layer stores the user-generated annotations (coordinates, shapes, labels, etc.).
- When ChatGPT processes the image, it also receives the marker map to understand what the user is pointing to or emphasizing.
This way, the internal vision model logic wouldn’t need to be changed; it would simply receive additional context about what the user is pointing to, which is immensely useful for precise communication.
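To make the marker map idea more concrete, here is a minimal Python sketch of how I imagine it. Everything in it is my own assumption for illustration: the names `Marker`, `MarkerMap`, and `build_payload`, the JSON layout, and the choice to store coordinates as fractions of the image size are hypothetical and do not describe any existing ChatGPT or OpenAI API.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Tuple


@dataclass
class Marker:
    """One user annotation (hypothetical structure)."""
    kind: str                          # e.g. "freehand", "arrow", "circle", "rectangle", "text"
    points: List[Tuple[float, float]]  # (x, y) pairs as fractions of image width/height
    label: str = ""                    # optional text attached to the marker


@dataclass
class MarkerMap:
    """Annotation layer kept separate from the image pixels."""
    image_width: int
    image_height: int
    markers: List[Marker] = field(default_factory=list)

    def add(self, kind: str, pixel_points: List[Tuple[int, int]], label: str = "") -> None:
        # Convert pixel coordinates to fractions so the map stays valid
        # even if the image is resized later in the pipeline.
        normalized = [
            (round(x / self.image_width, 4), round(y / self.image_height, 4))
            for x, y in pixel_points
        ]
        self.markers.append(Marker(kind, normalized, label))

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def build_payload(image_bytes: bytes, marker_map: MarkerMap) -> dict:
    # The original image bytes are passed through untouched;
    # the annotations travel next to them as a second layer.
    return {"image": image_bytes, "marker_map": marker_map.to_json()}


# Example: the user circles the center of an 800x600 screenshot.
marker_map = MarkerMap(image_width=800, image_height=600)
marker_map.add("circle", [(400, 300)], label="Why is this button greyed out?")
payload = build_payload(b"<original image bytes>", marker_map)
```

In this sketch the image is never redrawn, so the analysis still runs on the unmodified original, and the marker map is just structured context that could be handed to the model in whatever form fits best.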
Benefits:
- Enhances clarity in visual communication.
- Saves users time by avoiding the need for external tools.
- Retains the original image for accurate analysis.
- Improves collaboration and teaching use cases.
I believe this feature would be incredibly useful not only for technical users like myself but also for educators, designers, and anyone who uses ChatGPT for visual tasks.
Thank you for considering this idea. I’d be happy to provide further details or use case examples if needed.
Warm regards!