Has anyone used DALL-E (or similar) to redact an image, i.e. block out text? The end-user ChatGPT seems to do this, or some semblance of it, but the same prompt sent to the DALL-E edits endpoint just returns the image unchanged. In my use case the text may not be in the same place each time, or I'd just use a stock image processor for this.
Do you have an example of what you mean? Redacted usually means placing a black bar across something.
DALL-E 3 doesn't have an edits endpoint yet, either.
You'd probably need to use Vision first to find and locate any text in the image.
Blocking out text is what I'm talking about (think automatically covering car tags in an image with a dozen cars). DALL-E 2 has an /edits endpoint that sounds like it should work, but so far I haven't seen it change the returned image at all.
The end-user version will do it in a limited manner, though it doesn't seem to be calling DALL-E, if its answers are to be believed.
Right, but for the edits endpoint to work, you have to mask the image in those spots, so you'd need to either mark them manually or maybe use GPT-4-Vision to try to spot them? Sounds interesting, but I'm not sure the tech is there yet?
ETA: You might not even need the edits endpoint if you're just redacting with a black bar… you'd still need to find what needs to be redacted, though, then just use those coordinates with ImageMagick or something to mark them…
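The black-bar route is simple once you have coordinates. A minimal sketch with Pillow instead of ImageMagick (the box coordinates are hypothetical — they'd come from whatever detector you end up using):

```python
from PIL import Image, ImageDraw

def redact(img: Image.Image, boxes) -> Image.Image:
    """Return a copy with an opaque black bar drawn over each
    (left, top, right, bottom) box."""
    out = img.convert("RGB").copy()
    draw = ImageDraw.Draw(out)
    for box in boxes:
        draw.rectangle(box, fill="black")
    return out

# Hypothetical usage -- boxes would come from your detection step:
# redact(Image.open("cars.jpg"), [(120, 340, 260, 380)]).save("redacted.jpg")
```

No AI needed for this half of the job; the hard part is still producing the boxes.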
Vision says it can't help with spotting text like that. It can describe the exact same image in great detail, but it isn't able to give me coordinates.
DALL-E 3 does not accept images as input.
DALL-E 2 on the edits endpoint allows AI infill of an area made transparent, only on square images of a supported size: 1024x1024, 512x512, or 256x256.
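So for the edits route, the redaction area has to be punched out as transparent pixels at a supported size before upload. A sketch of that prep step with Pillow (box coordinates are placeholders; the resulting image then goes to the DALL-E 2 image edits endpoint along with a prompt describing the replacement content):

```python
from PIL import Image, ImageDraw

def make_edit_input(img: Image.Image, boxes) -> Image.Image:
    """Resize to a supported square size and make each (left, top, right,
    bottom) box fully transparent -- the region DALL-E 2 will regenerate."""
    out = img.convert("RGBA").resize((1024, 1024))
    draw = ImageDraw.Draw(out)
    for box in boxes:
        # ImageDraw sets pixels directly, so alpha 0 punches a hole
        draw.rectangle(box, fill=(0, 0, 0, 0))
    return out
```

Note the boxes have to be expressed in the resized 1024x1024 coordinate space, so any detector output on the original image needs scaling first.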
GPT-4-Vision does not support grounding, which would allow returning the location of detected objects. You could use a large local or hosted model such as minigpt-v2, or Azure vision products capable of returning bounding boxes. Detecting a particular subject well (like license plates) would take dedicated tuning.
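Whichever detector you use, its output still has to be turned into rectangles for the black-bar step. A sketch of that conversion, loosely modeled on the Azure Read API's response shape, where each detected line carries an eight-number quad of corner coordinates (treat the exact field names here as an assumption to check against your service's docs):

```python
def quads_to_rects(result: dict) -> list:
    """Collapse each detected line's 8-point quad (x1,y1,...,x4,y4) into an
    axis-aligned (left, top, right, bottom) rectangle for redaction."""
    rects = []
    for page in result["analyzeResult"]["readResults"]:
        for line in page["lines"]:
            xs = line["boundingBox"][0::2]  # x coordinates of the 4 corners
            ys = line["boundingBox"][1::2]  # y coordinates of the 4 corners
            rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects
```

Collapsing the quad to its bounding rectangle over-redacts slightly on rotated text, which is usually fine for this purpose.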