Chat with images is rolling out now

N2U · October 6, 2023, 2:51pm

Actually a good idea, it would be quite interesting to see what happens.

PaulBellow · October 6, 2023, 5:22pm

After asking, someone on Discord fed it a page, and it identified it, but they didn’t try to translate it …

N2U · October 6, 2023, 5:39pm

Alright, it was a long shot

Just got done reading the research paper, and I’m really impressed so far. It’s much better than I expected.

vb · October 6, 2023, 5:39pm

After giving it a shot I can confirm that the results are not spectacular.
It mostly goes on and on about the document being medieval and takes a stab at the style (Gothic, 12th-15th century), but never really makes any interesting statements.

PS. Sharing links with images is not yet supported. @PaulBellow

PaulBellow · October 6, 2023, 5:43pm

No worries. Thanks for taking a stab at it! Was curious what it would “guess”… there have been a lot of theories over the years.

ETA: Saw another screenshot on Discord but it wouldn’t guess… It seems like it’s relying on textual stuff rather than the image… or taking what it “knows” about the image “it’s the Voynich” document but is just gathering vectorized data about the “image”? hrm…

_j · October 6, 2023, 5:53pm

It does make one wonder exactly how the prompting and context of images works in GPT-4. Can it be trained by example images? Does it have the context required to hold image data or is this processed by a different sub-model of the architecture that only returns language?

weiduan1025 · October 6, 2023, 7:29pm

I thought it would be rolled out together with DALL-E 3, until I failed to find it in my ChatGPT UI. No?

supershaneski · October 6, 2023, 11:30pm

Yeah, perhaps one needs to prod GPT-4 with some prompts and not just ask it to describe what the document is. Ask it to find patterns or whatever, and maybe it can tell us more. I think it’s like with us humans: you show someone a picture of the Mona Lisa and the person will just tell you it is the Mona Lisa. But if you tell the person about the smile, the scenery, etc., then perhaps the person might give their interpretation.

abrahamlivinus · October 7, 2023, 2:26pm

It’s something worth trying and if it locates Wally, I will be impressed.

Foxalabs · October 7, 2023, 5:26pm

N2U · October 7, 2023, 5:34pm

Bonus question: ask for the precise bounding box, then feed the results and the image to the advanced data analysis tool and ask it to draw it
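
For anyone who wants to try the drawing half of that by hand, here is a minimal sketch. It assumes the model answers with a normalized (left, top, right, bottom) box with values between 0 and 1; that reply format is an assumption, nothing in this thread confirms what GPT-4 actually returns. The helper names `to_pixel_box` and `draw_box` are made up for illustration, and the "image" is just a nested list of pixel values so the snippet runs without any image library.

```python
def to_pixel_box(box, width, height):
    """Convert a normalized (left, top, right, bottom) box to pixel coordinates.

    Assumes each value is a fraction of the image size in [0, 1].
    Coordinates are clamped so the box always stays inside the image.
    """
    left, top, right, bottom = box
    x_lo, x_hi = sorted((left, right))
    y_lo, y_hi = sorted((top, bottom))

    def clamp(value, limit):
        # Valid pixel indices run from 0 to limit - 1.
        return min(limit - 1, max(0, int(value * limit)))

    return (clamp(x_lo, width), clamp(y_lo, height),
            clamp(x_hi, width), clamp(y_hi, height))


def draw_box(pixels, pixel_box, value=1):
    """Draw a one-pixel rectangle outline onto a 2D list of pixels, in place."""
    x0, y0, x1, y1 = pixel_box
    for x in range(x0, x1 + 1):
        pixels[y0][x] = value  # top edge
        pixels[y1][x] = value  # bottom edge
    for y in range(y0, y1 + 1):
        pixels[y][x0] = value  # left edge
        pixels[y][x1] = value  # right edge
    return pixels


if __name__ == "__main__":
    width, height = 8, 6
    image = [[0] * width for _ in range(height)]
    # Hypothetical model answer: box covering the lower middle of the image.
    box = to_pixel_box((0.25, 0.5, 0.75, 1.0), width, height)
    for row in draw_box(image, box):
        print("".join("#" if p else "." for p in row))
```

With a real picture you would swap the nested list for an image library's rectangle-drawing call, but the coordinate conversion and clamping stay the same. Checking the drawn box against the image by eye is the quickest way to see whether the model's coordinates mean anything.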

anon22939549 · October 7, 2023, 8:22pm

I wonder if there’s any way to generate an attention heatmap of the image and, if so, how well that would correlate to locating Waldo?

PaulBellow · October 7, 2023, 9:20pm

Still waiting here. Anything else cool you’ve tried? Give it some history stuff?

Foxalabs · October 7, 2023, 9:24pm

I’m also waiting; that one was appropriated from Twitter.

PaulBellow · October 7, 2023, 9:28pm

Ah, thought it looked familiar!

Hope your weekend is going okay.

Foxalabs · October 7, 2023, 9:30pm

Yup, all good. Just watched Sam Altman chatting with Joe Rogan, was a fun interview.

anon22939549 · October 7, 2023, 10:33pm

Another,

anon10827405 · October 7, 2023, 11:01pm

I’ll trade my vision for your painting.

Here’s hoping for some powerful API capabilities. Bounding boxes and labels would be sweet as well.

yuriyvj · October 10, 2023, 8:11am

Still haven’t received access. Is it only on the phone app? (I checked both)

Foxalabs · October 10, 2023, 9:28am

There is currently a limited number of beta testers with access. This will be increased in time; please be patient while testing is being done.
