Hey,
I recently discovered that I could pass an image to GPT-4o in the web app and prompt it to crop it square, keeping key details in the image. It responded with a URL to download the cropped image.
I’ve tried recreating this functionality using the chat completions endpoint per the vision docs, but it doesn’t work. The model either says it cannot crop images, tells me to use Photoshop, or provides a fake URL.
Does anyone know if there is an endpoint this functionality is available from?
Many thanks in advance.
It turns out this was just an agentic workflow in the web app, with the model generating appropriate and simple Python scripts to execute in the REPL. The crops were just the center of the images.
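For anyone who wants the same result locally, here is a minimal sketch of a center square crop. It is not the script the web app generated (I don't have that), just my assumption of what such a script roughly does. It assumes Pillow is installed; the function names and file paths are made up for illustration.

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def crop_center_square(src_path, dst_path):
    """Crop the image at src_path to a centered square and save it to dst_path."""
    # Imported here so the box arithmetic above works without Pillow installed.
    from PIL import Image

    with Image.open(src_path) as img:
        box = center_square_box(*img.size)  # img.size is (width, height)
        img.crop(box).save(dst_path)


if __name__ == "__main__":
    # A 1920x1080 landscape image keeps its full height and loses 420 px per side.
    print(center_square_box(1920, 1080))
    # crop_center_square("input.jpg", "square.jpg")  # hypothetical paths
```

A center crop ignores where the subject actually is, which matches what I saw: anything near the edges of the image gets cut off.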
Instead of asking GPT-4o to crop the image for you, ask it for a bounding box around the image’s main content. Tell it to respond with the x-coordinate, y-coordinate, width, and height in pixels. Once it gives you those values, you can pass them to an image processing library (e.g. PIL in Python) to crop the image to the bounding box.