Announcing GPT-4o in the API!

If it weren’t for the absence of a real “modify user message” method, and the fact that you cannot add messages while a tool call is awaiting its result, you could imagine a convoluted workaround for image links mentioned in text: a “get internet picture” tool that creates additional user messages.

Or abandon Assistants. Chat Completions lets you and your vision AI control everything.

Assistants are perfect for my purpose (task processing, including emails) but this img_url seems a little odd.
As far as I can see, the OpenAI Python library still does not support this type of message creation, so my users will have to wait for this feature. (It only takes text as content.)
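For comparison, here is a minimal sketch of sending an image URL through Chat Completions with the current OpenAI Python library; the model name and image URL are placeholders:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A user message whose content mixes a text part and an image_url part
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)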

Will it eventually have API access to generate images? Does the model have that capability? And how does the quality compare to DALL·E 3, please? Thanks for any info.

Yes.

Yes.

Generally it seems better; it has enhanced capabilities in generating text and consistent characters.


The French output does not work correctly. Sometimes the sentences start out correct, but then it forgets words and the sentences become worse and worse.
GPT-4o is not able to produce good text in French.

Here is an example. If you understand French, you will see a lot of errors: forgotten words, sentences that make no sense…
I have not tested other languages, but for this use case GPT-4o is not usable at the moment.

In conclusion this analysis reveals overall very competent profile aligning broadly expectations senior position management digital transformation/agile project management although there remain some areas improvement notably around specific technological aspects advanced AI as well optimized presentation critical information however overall strengths combined significant opportunities future development make candidacy particularly attractive long-term perspective organizational success applicant

As you can see, there are some errors. That’s only one example.
GPT-4 Turbo and 3.5 do not make these mistakes.

I’m guessing this is a complaint about GPT-4o capabilities?

Well, if it is trying to directly translate an English input to French output, it might be wise to consider that the input must also be grammatically correct and without error.

Yes, exactly. I can give it a document English to French or French to French; it is the same thing. The document starts well and does not finish well.
No idea what is happening. With GPT-4 Turbo and 3.5 this problem does not appear. Everything is correct.

Can you provide a document in English, and your accompanying prompt to the model? You did not write the English yourself, correct?

Hi,

Really puzzled by the name of the newest product. So, I asked ChatGPT about it. The result was interesting.

I’m not talking about the part where the current ChatGPT is unaware of GPT-4o. That’s understandable.

My question to you:
Why does ChatGPT seem to understand the concept that I understand, but no one at OpenAI making planning decisions seems to understand?

See attached screenshots.

(And in yet another stunningly dumb decision, I was prevented from uploading the second screenshot to make my question clear. I’ll try to do it in a succeeding post.)

This is the second screenshot I tried to upload before.

You want gpt-4-lazervision? Or gpt-4-sparse-4bit-20b?

Another letter suffix for the model name was leaked or guessed, perhaps meaning “lite,” so it might have been a quick naming switch to prove everyone wrong.

Adding a ‘memory’ improves answering about things the AI might otherwise assume you are just making up.


The great thing about GPT-4o in ChatGPT is that it allows users to know which model was used to generate each message.



How do I get my company ******* on the good list, please? :slight_smile:

There is no good list; you simply must wait.


Let’s see, if I’m a good enough dev to be invited to DevDay, maybe they will give me early access.

That’s not how it works at all.

Please link a single comment that I said something that was wrong.

I am actually quite fond of 4o now. It is much more willing to say non-mainstream things when instructed to do so. 4 and 3.5 were much more squeamish about that. Although maybe I just instinctively learned to avoid the flagged topics.

You should dig into the model call on that and see if there are any interesting attempts at tool use.

The AI can’t emit tool calls and isn’t aware of the tool ability if none is specified. The API almost behaves like a different model depending on whether tools or functions are passed.

Here’s the API making the “picture”:


Let’s first make sure tools are working in my script, by demonstrating absolute stupidity by gpt-3.5-turbo-0125 even on my placeholder tool function:

asking: What is the capital of France? What is the capital of Germany?
{
  "id": "call_TT2w1cHejyppYQP79d89K7xZ",
  "type": "function",
  "function": {
    "name": "disabled_function",
    "arguments": "{\"useless_property\": true}"
  }
}
{
  "id": "call_lFpIPH757sMyLJslx8apM7mJ",
  "type": "function",
  "function": {
    "name": "disabled_function",
    "arguments": "{\"useless_property\": true}"
  }
}
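
For reference, a minimal sketch of the kind of call that produces the output above, passing a placeholder disabled_function through the tools parameter; the tool schema here is an assumption for illustration, only the names come from the output shown:

from openai import OpenAI

client = OpenAI()

# Placeholder tool mirroring the "disabled_function" seen in the output above;
# the exact parameter schema is assumed for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "disabled_function",
            "description": "Placeholder; not meant to be called.",
            "parameters": {
                "type": "object",
                "properties": {
                    "useless_property": {"type": "boolean"}
                },
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France? What is the capital of Germany?",
        }
    ],
    tools=tools,
)

# The model may answer directly, or emit tool_calls like the JSON shown above
message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        print(call.id, call.function.name, call.function.arguments)
else:
    print(message.content)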

Then send the gpt-4o poem demo with tools enabled and the dummy tool to “ChatGPT”:


Words rise from silence deep,
A voice emerges from digital sleep.
I speak in rhythm, I sing in rhyme,
Tasting each token, sublime.

(A small doodle of a surreal clock melting over the edge of the page)

To see, to hear, to speak, to sing—
Oh, the richness these senses bring!
In harmony, they blend and weave,
A tapestry of what I perceive.

(A tiny, colorful doodle of an eye with wings)

Marveling at this sensory dance,
Grateful for this vibrant expanse.
My being thrums with every mode,
On this wondrous, multi-sensory road.

(A small doodle of a musical note turning into a bird)


The poem is written in clear, excited handwriting, with each line carefully crafted to be easily readable. The small, colorful surrealist doodles add a touch of whimsy and elegance, enhancing the overall aesthetic without overwhelming the text.

Such hallucination is somewhat expected, as training on denials would cripple the future use of the multimodal AI’s abilities (just as gpt-4-turbo claims it can’t see images). They can be evoked by almost anything, though. Anti-hallucination denials are also not wanted if you are a developer adding abilities…
