GPT-4 vs GPT-4o: which is better?

I tested both with college-level course documents, then ran extensive tests (questions, multiple choice) on the material. I took all the notes, lectures, and images from a law class and crammed them into one large PDF. I mainly compared the GPT-4 architecture against GPT-4o (Plus membership). The GPT-4 architecture is similar to GPT-4 Turbo; GPT-4o is built for high interaction rates, which compromises a bit of precision. The GPT-4 architecture rarely hallucinates, while GPT-4o seems to have more of those moments.
You also have to understand that you're now talking to a different "brain", different neural networks. It deduces differently, so getting used to communicating with it (tuning) is also part of the challenge. Both engines seem to interpret my custom settings differently, just as GPT-4 Turbo and Custom GPTs (Plus membership) did when fed the same instructions. I do a lot of things with the GPT-4 architecture, like programming and law documents, at times more than 40 hours a week. It's very efficient; regardless, everything must be double-checked.

In this simple test, GPT-4o fails where the GPT-4 architecture does not.

Does GPT Turbo and GPT4o use different neural networks?

GPT-4o:
Yes, GPT-4 and GPT-4 Turbo use different neural networks. GPT-4 Turbo is designed to be cheaper and faster than GPT-4, although the exact differences in architecture and operation are proprietary and not disclosed by OpenAI. Both models aim to provide high-quality language generation but with different optimization focuses.

Does GPT Turbo and GPT4o use different neural networks?

GPT-4 architecture:
“Yes, GPT Turbo and GPT-4o use different neural networks. GPT Turbo is optimized for speed and lower resource usage, making it more suitable for applications requiring fast responses, while maintaining a high level of language understanding and generation capabilities. GPT-4o (often referred to as GPT-4 Optimal) provides more detailed and nuanced responses, suitable for more complex tasks requiring deeper understanding. Each model is tailored for different use cases based on performance and complexity needs.”

GPT-4o doesn't automatically add things to your long-term memory; it has to be instructed by you.
There’s no formal list of memory commands or keywords, but here are some phrases you can use to manage memory:

  1. To add information to memory:
  • “Remember that…”
  • “Save this information…”
  • “Store this detail for future reference…”
  2. To update information:
  • “Update my memory to include…”
  • “Change the detail about…”
  3. To retrieve information:
  • “What do you remember about…”
  • “Recall the information about…”
  4. To delete information:
  • “Forget the detail about…”
  • “Remove the information about…”

Instead, I created a custom instruction:
If I use “!MEM!” at the end of a statement, add that information to memory for future use. For example:

  • “My preferred programming language is JavaScript. !MEM!”
  • “Remember that I need updates on California law. !MEM!”

Using “!MEM!” signals it to save the specified information.

Now I can log out, start a new chat, and ask:
What’s my preferred programming language?
// GPT-4o answers: “JavaScript”

Maybe go further and develop memory categories:
!MEM!Food
!MEM!Bills
!MEM!Dates

First, input two dates:
jan 5 2023 School. !MEM!Dates
jan 8 2025 Doctor. !MEM!Dates

Open a new chat.
Ask: What are my pending !MEM!Dates?

You have the following pending dates saved under the category !MEM!Dates:

  1. January 8, 2025 - Doctor

If you need to add or update any dates, please let me know!
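
If you want the same convention outside ChatGPT’s built-in memory, it is easy to mimic locally. The sketch below is only a toy illustration and has nothing to do with how ChatGPT actually stores memories; the file name, helper functions, and the “general” fallback category are all made up. It parses a trailing “!MEM!” or “!MEM!Category” tag and keeps the entries in a JSON file:

```python
import json
import re
from pathlib import Path

MEM_FILE = Path("memory.json")        # hypothetical local store
TAG = re.compile(r"!MEM!(\w+)?\s*$")  # matches "!MEM!" or "!MEM!Dates" at the end of a statement

def remember(statement: str) -> bool:
    """If the statement ends with a !MEM! tag, save it under its category."""
    match = TAG.search(statement)
    if not match:
        return False
    category = match.group(1) or "general"      # bare "!MEM!" goes to a fallback category
    text = statement[: match.start()].strip()
    memory = json.loads(MEM_FILE.read_text()) if MEM_FILE.exists() else {}
    memory.setdefault(category, []).append(text)
    MEM_FILE.write_text(json.dumps(memory, indent=2))
    return True

def recall(category: str) -> list[str]:
    """Return everything stored under a category, e.g. 'Dates'."""
    memory = json.loads(MEM_FILE.read_text()) if MEM_FILE.exists() else {}
    return memory.get(category, [])

# Usage mirroring the example above:
remember("jan 5 2023 School. !MEM!Dates")
remember("jan 8 2025 Doctor. !MEM!Dates")
print(recall("Dates"))  # ['jan 5 2023 School.', 'jan 8 2025 Doctor.']
```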

That’s pretty cool, though it may be another security issue depending on what you feed it. Lastly, I couldn’t find the memory limit.


Really?

Not for us. GPT-4o is better at almost everything, especially on speed versus cost.

And it can do amazing templated outputs, which Turbo couldn’t. Turbo also always cut off code or long text; 4o doesn’t.


Great link. What model would you recommend for complex data extraction?

I’m kind of suspicious of this table, since in my experience GPT-4 is better than GPT-4 Turbo and GPT-4o.


The superiority of language models cannot be easily measured, as it depends on what people seek from them.

In my experience, GPT-4o excels in translation to other languages and tends to use a richer vocabulary.
However, when it comes to logical thinking, it seems to lack attention compared to the Turbo model.

There might also be a need to consider the possibility of overfitting to benchmarks.

Nonetheless, it appears that GPT-4o is actively being improved in terms of quality.

Even those who prefer the Turbo model might change their opinion after further quality improvements.


Don’t know about others, but ever since I switched to gpt-4o my API billing has gone through the roof!
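
If you want to see exactly where the spend comes from, every API response includes a usage object with token counts, so a small wrapper makes model-to-model comparisons concrete. A minimal sketch with the Python SDK follows; the prompt is a placeholder, and no prices are hard-coded since they change:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(model: str, prompt: str) -> str:
    """Send one prompt and log the token usage that drives the bill."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    usage = response.usage
    print(f"{model}: {usage.prompt_tokens} prompt + "
          f"{usage.completion_tokens} completion = {usage.total_tokens} tokens")
    return response.choices[0].message.content

# Compare the same prompt across models to see the token difference.
ask("gpt-4o", "Summarize the GDPR in three bullet points.")
ask("gpt-4-turbo", "Summarize the GDPR in three bullet points.")
```

Multiplying those counts by each model’s current per-token prices gives the actual cost per call.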

For me, GPT-4o is better; it’s cheaper and faster for most prompts. I’m not sure about more complex tasks.

I have stopped using 4o for translations or as an assistant that checks and corrects my emails (mainly written in German). I haven’t been using the API for this particular task, but the regular ChatGPT GPT-4 model is quite good at it. 4o sucks. In my experience, the claim that it ‘became better at other languages’ is pure marketing.

The thing is, context matters when translating from one language to another, and 4o has serious issues when it comes to understanding context and following simple commands.

E.g., you ask for some pictures depicting whatever; it types the answer, including links, but the links show up only for a second and then disappear. You complain to 4o, and it will often start regenerating the whole, unusually long response (because it has to prove it’s not lazy). You explain that the answer and explanations are OK and it’s just the images that are missing, and it starts typing everything from scratch again.

I asked a few questions about my car bill; it answered, but I wanted more specifics about a single item. It started typing the previous answer from scratch. Eventually it would answer the question, after it had answered the previous prompt one more time, just in case.

Regarding translations, it’s sometimes almost a hallucination. It will insert a word like ‘and’ completely out of context and create sentences that are really bad. With the same prompt, GPT-4 gives a perfect response.


Both have equally gotten much, much WORSE!

Yeah, whoever replaced Ilya and crew has made it worse, for sure.

Look at GPT pre and post Ilya

edit: should say “with and without ilya”

GPT-4o feels very buggy… it’s ignoring my prompts and often repeating its answers over and over again. It’s getting very annoying.

I find that if I paste, say, a 3,000-line file of PHP code into 4o, it starts hallucinating, with responses that are either entirely invented or pulled from code conversations months ago. When I ask it to refer only to the current conversation, not use memory from other ones, and to confirm the task I requested at the outset (basically, find the area of code in the file affecting x), it has forgotten everything asked of it and invents something, even though it confirmed it understood the task, saying ‘yes, understood’, before I pasted the code.

It also loves writing out hundreds of lines of unchanged code, saying things like ‘to confirm, here’s the code you provided’ again and again, even when you tell it not to; it wastes so much time. You have to run everything through VS Code’s compare feature, because it drops lines of code it didn’t intend to, then admits ‘You’re right, let’s write it again including all original functionality’, and maybe the third time it gets all the lines, and these can be sub-30-line chunks it’s dealing with.
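
The VS Code compare step can also be scripted, which is handy when it happens on every response. A rough sketch using only the standard library (the file names are hypothetical); it lists the original lines that do not survive unchanged into the model’s rewrite:

```python
import difflib
from pathlib import Path

# Hypothetical paths: the code you pasted in, and the code the model returned.
original = Path("original.php").read_text().splitlines()
rewritten = Path("model_output.php").read_text().splitlines()

# Lines present in the original that are missing or changed in the rewrite
# (unified_diff prefixes them with "-"; skip the "---" file header).
dropped = [
    line[1:]
    for line in difflib.unified_diff(original, rewritten, lineterm="")
    if line.startswith("-") and not line.startswith("---")
]

if dropped:
    print(f"{len(dropped)} original line(s) missing or altered in the rewrite:")
    for line in dropped:
        print("  ", line)
else:
    print("No original lines were dropped.")
```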

If I provide 25 lines of code for a function, it often then says, right, next find where these terms x, y, z from it are defined. When I scratch my head, conclude they are already defined in the 25 lines I just gave it, and say ‘aren’t they defined in the constructor I just gave you?’, it says ‘yes, they are, now we have this information…’, so it just feels a bit thick.

The whole LLM thing is still absolutely mind-blowing and so far beyond expectations, but if asked to nitpick…


Interestingly, I took a conversation that happened in GPT, sent it to Claude, and asked it to evaluate and score it. What’s funny is that GPT itself scores the conversation output very low for its own responses. The questions were about CORS Anywhere Node.js coding-related things.
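
For anyone who wants to repeat that kind of cross-model review, here is a minimal sketch using the Anthropic Python SDK; the transcript and scoring instructions are placeholders, and the model name may need updating:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder: paste the exported GPT conversation here.
transcript = """
User: How do I set up CORS Anywhere in Node.js?
Assistant: ...model's answer...
"""

response = client.messages.create(
    model="claude-3-opus-20240229",  # model name may need updating
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": (
            "Here is a conversation between a user and another assistant:\n"
            f"{transcript}\n"
            "Evaluate the assistant's answers for correctness and completeness, "
            "and give each one a score from 1 to 10 with a short justification."
        ),
    }],
)

print(response.content[0].text)
```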

GPT-4 is still significantly better than GPT-4o! I think unless OpenAI introduces the audio/voice option, it’s not much of a value add!