GPT-4o: OpenAI spring product announcements 2024

Speaking of fine-tuning…

I’d love to be able to fine-tune the 3.5-instruct model and the embedding models.

Beyond that, I’d love a gpt-4-instruct model and the ability to fine-tune it.

3 Likes

I feel like the “a little patience, please…” thing will never be released😂

2 Likes

Anyone else’s weekend just turn into “ugh I can’t wait for Monday!”

It’s like Christmas but I don’t have to play Santa.

2 Likes

A demo showing that assistants can use GPT-4 vision to perform some tasks. Kind of like what was demoed in March 2023, but with Assistants.

2 Likes

Alternate idea…

Anyone else remember that OpenAI had MuseNet way back when?

Perhaps now that Suno and Udio have released a flood of AI-generated music, OpenAI will feel more comfortable dropping their own version.

If it’s substantially better than existing options, that could be “magical.”

4 Likes

Also, you can poll the models endpoint during the video to see the strategic timing…
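If you want to try that yourself, here is a minimal sketch using the official openai Python client; the 60-second polling interval and the environment-variable key are just illustrative assumptions:

```python
import os
import time

from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

seen: set[str] = set()
while True:
    # GET /v1/models -- lists every model currently visible to this API key.
    current = {model.id for model in client.models.list()}
    for model_id in sorted(current - seen):
        print(f"new model visible: {model_id}")
    seen |= current
    time.sleep(60)  # poll once a minute during the livestream
```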

1 Like

I miss Labs for DALL·E… So many things going on! Heh… they are one of the largest companies, or getting there…

1 Like

Question is, what would be ‘magic’ that is not search, or 4.5, or agentic sub-AIs? What possibilities are there that would be deemed ‘magic’? This question assumes we’re already pretty stunned by the magic of its current capabilities. So, more magic than it already is, but not search, or 4.5, or agentic? We already have voice conversation capability - so an enhanced version? Would that be ‘magic’?

One thing OpenAI is talented at is stoking the fires of expectation! There’s one caveat to that approach - it’d better be magic…

2 Likes

My guess is it’s going to be a first-party AI-as-an-agent demo.

1 Like

I agree with you. I would expect an ear device or a microphone device or some other wearable device.

1 Like

For the love of god, please raise the rate limits for images. 200 images per day is low.

1 Like

24 hours to go.

Not aware of any actual leaks but speculation is welcome. And of course you are invited to share anything related here :wink:

5 Likes

A deal with Apple, who use the word “magic” a lot?

1 Like

I’ve got it. It’s an interspecies translator. Humpback whales will be first. Researchers will be able to wear it when they’re diving. Then house cats, to get more insights into their mania. The rollout is sea animals, then land.

Or:

The OpenAI Generative AI Virtual Reality Glasses. Where you can experience the world from the viewpoint of the person you’re talking to.

6 Likes

Following three years of deliberation, @mike1 returns with a glorious prediction.
:laughing:

Regarding the glasses: let’s start with viewing the world through the eyes of GPT-3.5: you know everything but understand nothing.

3 Likes

Thank you @vb ! I do what I can for my community :slight_smile: I’ll see you in three years, when my ‘empathy glasses’ are in full production.

Hahah - yeah, they built in a trigger that fires when someone is about to make an ass out of themselves; the chat stops and says, “based on your line of questioning, it would be careless of me not to suggest that, while I am in fact providing the answers to your questions, it would take one (1) person with an understanding of the information to ask you one (1) question to find out you’re full of shite.” Or something to that effect.

That said, while they say no search, I bet it’s a service like Tavily or something that can augment ChatGPT responses with web info. It’ll launch as a button like “include web results” or something.

And for devs it’ll be web search built into the MyGPT tool without having to build your own function for it, plus integration with the Assistants API and others. They said no search, but who knows? I think it’s what engineers need most: easy-to-use, realtime web integration. So we’ll see. I’m excited!! :laughing::laughing:

2 Likes

Another interesting idea I’ve heard swirling around is a direct voice-to-voice model with no intermediary text state.

The idea being the model would be able to understand vocal cues, e.g. emotional state from tone of voice, sarcasm, etc.

That would truly feel magical.

9 Likes

Probably a standalone GPT model, 2 for example, that is significantly smaller and more efficient for its size than 4, and is as good as 4 is currently. Maybe it will be embedded into the next Windows operating system, offering significant improvements to the way we interact with our computers.

3 Likes

This has the most evidence for what’s to be announced Monday, at least from what I can tell.

source: https://www.theinformation.com/articles/openai-develops-ai-voice-assistant-as-it-chases-google-apple

Lots of OAI staff on Twitter seem to be winking and nodding towards something like this.

This would be really interesting if they would consider an audio-in, audio-out API, but I doubt that would come before whatever this product is atm.

4 Likes

Rabbit & Humane investors be like…

Wait… WAIT…

:rofl:.

Anyways,

I’m interested in this, but I don’t see what separates it from what the ChatGPT app already offers. I like having a transcription.

2 Likes