This thread’s subject line just gave me an idea: DALL-E 2 could essentially be used to analyze captcha images and then manipulate them to be OCR-friendly for cases where OCR fails to produce the correct captcha. That would be a genuine use case, but it would likely be seen as a way to circumvent anti-bot validation systems.
Aside from that random thought, it would be cool to see DALL-E 2 produce text descriptions from images in the same way it produces images from text descriptions. Would that feature be considered popular enough to implement in the future? I’m not knowledgeable enough about how contrastive models such as CLIP operate to know whether there’s a simple way to just reverse the input and output to make that idea a reality, but I figured I’d put it out there in hopes that someone who is knowledgeable in that area can explain the variables at play and how feasible this would be in future iterations of DALL-E.
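For what it’s worth, here’s a toy sketch of why “reversing” CLIP isn’t a simple input/output swap. The embeddings below are made up (pure NumPy stand-ins for CLIP’s learned image and text encoders), but the mechanism is the real one: CLIP scores an image-text pair by cosine similarity in a shared embedding space, so going from an image to text means searching or ranking over candidate captions rather than running the encoder backwards.

```python
import numpy as np

def normalize(v):
    # CLIP compares unit-length embeddings, so cosine similarity
    # reduces to a dot product.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical embedding of one image (in real CLIP this comes from
# the image encoder, e.g. a ViT).
image_embedding = normalize(np.array([0.9, 0.1, 0.2]))

# Hypothetical embeddings for a few candidate captions (in real CLIP
# these come from the text encoder).
candidate_captions = {
    "a photo of a dog": normalize(np.array([0.85, 0.15, 0.25])),
    "a photo of a cat": normalize(np.array([0.10, 0.90, 0.30])),
    "a distorted captcha image": normalize(np.array([0.20, 0.30, 0.95])),
}

# "Image -> text" with a contrastive model: rank candidate texts by
# similarity to the image embedding and pick the best match.
scores = {caption: float(image_embedding @ emb)
          for caption, emb in candidate_captions.items()}
best_caption = max(scores, key=scores.get)
print(best_caption)  # the caption whose embedding is closest to the image's
```

This is roughly how CLIP does zero-shot classification; producing free-form descriptions (as actual captioning models do) needs a text generator on top, which is one reason the reversal isn’t trivial.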