Not getting best answer from GPT models

Hello. We have built a wrapper chat assistant on top of GPT models. Everything is good except for one thing: some people we've spoken to say that ChatGPT gives better answers with the same models.
Now I would like to know how we can also get better answers. I suspect we should use the temperature, top_p, and frequency_penalty parameters. If so, how, and with which values?

Thanks !

  1. What model are you using?
  2. What is meant by “better answers”?

I have gpt-4o, 4o-mini, 4 turbo, and o1-mini. “Better answers” means “ChatGPT solves their problem.” For example: my friend asked ChatGPT to write a blog post and to mention specific keywords in it, and it did.
We tried the same thing in my chat, but it did not mention the keywords.

chatgpt-4o-latest is the model used by ChatGPT. If you have a large system prompt, get rid of it.
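To make the suggestion concrete, here is a minimal sketch (not the poster's actual code) of calling chatgpt-4o-latest with a deliberately small system prompt, using the official `openai` Python SDK. The helper only assembles the request kwargs; the actual API call is left commented out because it needs an API key.

```python
def build_request(user_message: str) -> dict:
    """Assemble Chat Completions kwargs with a minimal system prompt."""
    return {
        "model": "chatgpt-4o-latest",  # the chat-optimized model ChatGPT uses
        "messages": [
            # Keep the system prompt short; a large one can override the
            # model's default conversational style.
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

# Uncomment to actually send the request (requires OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(**build_request("tell me why a^2+b^2=c^2"))
# print(resp.choices[0].message.content)
```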


It doesn’t support tool use, unfortunately.

I’m not sure what exactly you’re looking for. ChatGPT doesn’t use tools for writing blog posts either unless you’re referring to search or actions.

Based on your link, I’m assuming you are trying to build an auto-blog-posting AI. You would be looking at agentic systems, not a single model.

You can use chatgpt-4o-latest for conversational abilities and other powerful models for background tasks like research.

Just look at these two images, one from ChatGPT and one from my chatbot, both using gpt-4o models. As you can see, ChatGPT gave a much better and more detailed answer to this prompt: tell me why a^2+b^2=c^2.
I hope I was able to explain the problem.


You’re not using chatgpt-4o-latest in the example below.

In either case, you can probably just ask the model to be more verbose and in-depth in its responses. This isn’t something you should configure at the system level, though; instead, let the user configure it.

I tried that model; it gives the same result. chatgpt-4o-latest points to gpt-4o, so it doesn’t matter which model name you write in the code. The question is why ChatGPT gives better answers. I assume something should be done in the config or the system prompt. The prompt I gave is just one example.

chatgpt-4o-latest does not point to gpt-4o. In your own screenshot you will see that it specifically says “GPT-4o used in ChatGPT”.

If someone says that ChatGPT is better than your model, then the closest thing you can do is use their model, and then let the user adjust their prompt.

I’m not sure what kind of answer you’re looking for here. Yes, adjust your prompt; yes, adjust your settings. If this is a service you’re offering, though, you should give these controls to the users.
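As a hypothetical sketch of what "giving these controls to the users" could look like: accept user-supplied sampling parameters and clamp them to the ranges the Chat Completions API documents (temperature 0–2, top_p 0–1, frequency_penalty -2–2), then merge them into the request.

```python
def sampling_settings(temperature: float = 1.0,
                      top_p: float = 1.0,
                      frequency_penalty: float = 0.0) -> dict:
    """Clamp user-chosen sampling parameters to the documented API ranges."""
    return {
        "temperature": max(0.0, min(2.0, temperature)),               # 0 to 2
        "top_p": max(0.0, min(1.0, top_p)),                           # 0 to 1
        "frequency_penalty": max(-2.0, min(2.0, frequency_penalty)),  # -2 to 2
    }

# These kwargs can then be spread into client.chat.completions.create(...)
# alongside the model and messages.
```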


chatgpt-4o-latest points to gpt-4o. I am trying to find out why the same model gives better answers in ChatGPT.

I don’t think you’re fully understanding what the docs are trying to say.

chatgpt-4o-latest points to the same model used by ChatGPT. It’s a derivative of gpt-4o designed to be more communicative.

I don’t understand why you think it would simply point to gpt-4o; why would it exist, then?

Your initial post was “my users say ChatGPT is better”. I am saying that the first step toward matching ChatGPT is to use its model.

The first step is to use the same model that ChatGPT does. At least try it out and see the differences.

I can maybe understand because the docs really don’t highlight the differences, but here is a Twitter post about it:


Michelle Pokrass (@michpokrass) on X

@aidan_mclau @OpenAIDevs chatgpt-4o-latest will track our 4o model in chatgpt, and is a chat-optimized model. our model from last week (gpt-4o-2024-08-06) is optimized for api usage (eg. function calling, instruction following) so we generally recommend that!

chatgpt-4o-latest is the chat-optimized model used by ChatGPT.
gpt-4o-[] is optimized for API usage.


Could it be that over time they have converged? Maybe. But it’s a good first step toward matching the quality of ChatGPT.


Based on both your topic title and the example images you provided, it seems like you’re looking for a preference-based response from the model. I say this because terms like “better” and “best” are subjective and depend on individual preferences. What one person finds well-structured and effective may not be as appealing to someone else.

With that in mind, if your preferred structured format aligns with the image you shared, I agree with @anon10827405 regarding the importance of giving clear, specific instructions to the model to ensure a structured output. Personally, I achieve this by explicitly defining how responses should be formatted and by creating a knowledge base that includes a reference template for the model to follow. In my experience, this approach generally yields more consistent and structured responses compared to allowing the model to generate replies without strict formatting guidelines.

That aside, you may want to consider providing your user base with a preferred prompting structure or guide. This can help ensure they receive “better” or more effective and structured responses compared to the outputs they are currently getting.
