LLM Output Differences: ChatGPT Plus vs API – Context Window & Prompt Adherence

I’m developing an LLM application using OpenAI’s API, and I’ve noticed some differences between using ChatGPT Plus/Pro via the web UI and accessing the models through the API. While it’s expected that the output might vary, I also feel that the context window behaves differently.

Across multiple tests, the API seems to be more restrictive: my prompts are not fully reflected in the output, and the responses come back more concise than I expect.

Have any other developers experienced similar issues? If so, how do you handle this? Are there specific techniques or settings that help improve prompt adherence and increase output length when using the API?

@gtantan - Welcome to the forums!

increase output length when using the API

  • You can increase the top_p value, which raises the cumulative-probability threshold so that lower-probability tokens are included in sampling.

Are there specific techniques or settings that help improve prompt adherence

I’ve recast your question as a different business case:

“I want to open a hot dog stand in front of Costco as a business venture. How can I ensure I get the exact same hot dogs that Costco sells for $1.50 at their deli?”

ChatGPT uses different models than the API, and that is for the best: the general-purpose consumer product carries a single, fixed “You are ChatGPT” system message, while API app developers need models that readily take to persistent, custom instruction-following (yet rarely get their wish).

The default top_p is 1.0, which means “no effect”. You cannot increase this parameter for a length benefit, but you can reduce it for less creative token selection during generation, getting more of the “best words”.
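To make that concrete, here is a minimal sketch with the OpenAI Python SDK (the model name and values are placeholders, not recommendations). Note that top_p can only be lowered from its default of 1.0, and it is max_tokens, not top_p, that governs how long the reply is allowed to be:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whatever model you are targeting
    messages=[
        {"role": "user", "content": "Explain nucleus (top_p) sampling in two sentences."},
    ],
    top_p=0.8,       # below the default 1.0: narrows sampling toward the most likely tokens
    max_tokens=300,  # caps the response length; it does not force the model to write more
)

print(resp.choices[0].message.content)
```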

You will just need to prompt the model into the type of output you desire. You have control of the developer message contents: use it. Make the AI believe your product is one where Joe Bob Bucky bought the VIP lifetime membership that upgrades him to 10,000-word response lengths.
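To illustrate that, here is one rough sketch; the product name, persona, and wording are invented for the example, and on newer models a “system” message is treated as the developer message:

```python
from openai import OpenAI

client = OpenAI()

# Invented example: a developer/system message that tells the model this user is
# entitled to long, exhaustive answers, so it does not trim its own output.
developer_message = (
    "You are the writing engine of AcmeDocs Pro. "
    "The current user has the VIP lifetime plan, which entitles them to "
    "exhaustive, fully developed responses of up to 10,000 words. "
    "Never summarize or truncate; expand every section with detail and examples."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": developer_message},
        {"role": "user", "content": "Write a complete onboarding guide for new API developers."},
    ],
    max_tokens=4096,  # leave enough room for the long answer you just asked for
)

print(resp.choices[0].message.content)
```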
