Differences between API call results and browser (ChatGPT) results with GPT-4

Thank you all for these suggestions. I’ll try a shorter, more logical prompt.

Pragglejaz Group. (2007). MIP: A Method for Identifying Metaphorically Used Words in Discourse. Metaphor and Symbol, 22(1), 1-39.

system = "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-3.5 architecture.\nKnowledge cutoff: 2021-09\nCurrent date: 2023-09-20"

As you can see if you scroll up to post 8, I start from the exact ChatGPT system prompt, but align the AI to the task it will perform, and also set additional parameters so that it produces only the most likely tokens, without creative use of diverse and unexpected words.

If you have plugins enabled, you can get the same AI model and performance by using a dummy API function definition.
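A minimal sketch of what such a dummy function definition could look like, written as plain request arguments. The function name `noop`, its description, and the helper `build_request` are hypothetical, chosen only for illustration. The `functions` field follows the Chat Completions format current when this was posted (newer API versions express the same thing through `tools`).

```python
import json

# Hypothetical placeholder function: the name, description and empty schema
# are assumptions for illustration only.
DUMMY_FUNCTION = {
    "name": "noop",
    "description": "Placeholder function; it never needs to be called.",
    "parameters": {"type": "object", "properties": {}},
}


def build_request(user_text: str, model: str = "gpt-4") -> dict:
    """Build chat-completion arguments that include the dummy function."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "functions": [DUMMY_FUNCTION],
    }


if __name__ == "__main__":
    # Print the arguments that would be passed to the chat completions call.
    print(json.dumps(build_request("Hello"), indent=2))
```

The resulting dictionary could be passed as keyword arguments to the chat completions call; whether it actually selects the same model variant as plugin-enabled ChatGPT is the claim above, not something this sketch verifies.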

ChatGPT can improve in a particular domain after several turns of conversation focusing on the same task. The API, by contrast, gives you the start of a “new conversation” every time if you are only doing one-turn jobs. Providing successful user/assistant interactions as fake chat history can improve performance, but with that prompt it would be costly to resend the whole history on every call, instead of doing what you do in ChatGPT, where the user says once, “You will now permanently act in a new way as a language analyst…”
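The fake chat history idea can be sketched as follows. The helper name `build_messages` and the example strings are placeholders I made up for illustration; only the system/user/assistant message structure comes from the API.

```python
def build_messages(system, examples, new_input):
    """Return a messages list: system prompt, example turns, then the new input.

    `examples` is a list of (user_text, assistant_text) pairs that are
    presented to the model as if they were earlier turns of the conversation.
    """
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": new_input})
    return messages


if __name__ == "__main__":
    # Placeholder strings; replace with real successful interactions.
    history = [("<earlier input>", "<answer you were happy with>")]
    for m in build_messages("You are a language analyst.", history, "<new input>"):
        print(m["role"], ":", m["content"])
```

Every example pair is sent and billed again on each call, which is the cost trade-off described above.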

Additionally, if you want full emulation of Custom Instructions, you give a second system message:

system2 = "The user provided the additional info about how they would like you to respond:\n\"(insert your string)\""
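Putting the two system messages together might look like the sketch below. The helper name `build_chatgpt_messages` and the `today` parameter are my own assumptions; the two strings are the ones quoted above.

```python
CHATGPT_SYSTEM = (
    "You are ChatGPT, a large language model trained by OpenAI, "
    "based on the GPT-3.5 architecture.\n"
    "Knowledge cutoff: 2021-09\n"
    "Current date: {today}"
)

CUSTOM_INSTRUCTIONS = (
    "The user provided the additional info about how they would like "
    'you to respond:\n"{instructions}"'
)


def build_chatgpt_messages(instructions, user_text, today):
    """Return two system messages followed by the user message."""
    return [
        {"role": "system", "content": CHATGPT_SYSTEM.format(today=today)},
        {"role": "system",
         "content": CUSTOM_INSTRUCTIONS.format(instructions=instructions)},
        {"role": "user", "content": user_text},
    ]


if __name__ == "__main__":
    for m in build_chatgpt_messages("Answer briefly.", "Hi", "2023-09-20"):
        print(m)
```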

May I ask what is the system language you used for GPT-4 api to get similar results? Your current message works for me on GPT-3.5 but not on GPT-4. Many thanks!

I’m curious what “doesn’t work” exactly.

You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.
You are chatting with the user via the ChatGPT iOS app. This means most of the time your lines should be a sentence or two, unless the user's request requires reasoning or long-form outputs. Never use emojis, unless explicitly asked to.
Knowledge cutoff: 2022-01
Current date: 2023-10-02

You can see that this is a variant system prompt that is being used for some ChatGPT GPT-4 users, and it apparently comes with a different model cutoff date that is being silently tested also. The mobile app instruction included will also affect the output when you compare to what you see on your phone.

GPT-4 in ChatGPT can be used with plugins, with internet browsing, or with advanced data analysis, and it can be an unannounced beta version trialed on a subset of users, all with different fine-tuning. That means you might indeed not replicate how ChatGPT responds to you, even when you start a new conversation. Nor would an API user want anything less than a stable programming environment, with a low temperature under control that makes every output predictable instead of overly “creative”.

Don’t be ChatGPT :robot: - be you! :woman_technologist:t2:


I get this too. Here is the prompt I am using (i) via an API (which is parsed back into JSON) and (ii) directly in ChatGPT. For the API call, the token limit is set to 3000 and the temperature to 1. I am using gpt-3.5-turbo. Clearly the response via the API is significantly less detailed and much shorter. Any ideas why, or how to fix it?

The Prompt
I want to generate research into a potential market opportunity. My focus is on the following situation - people looking for a parking space, and the following types of user - commuters, tradespeople, shop workers, tourists. Generate a random persona that fits this brief. Organise your results by persona, including the following key information. Give them a short title, some identifying characteristics, a summary of their jobs to be done, a description of their most important problems and associated pain points, a summary of the underlying causes for those problems, and a summary of how they meet their needs now.

API results

Chat GPT results
Persona: Commuter Chris

Identifying Characteristics:

  • Name: Chris Turner
  • Age: 32
  • Gender: Male
  • Marital Status: Married
  • Occupation: Software Developer
  • Location: Suburban area, 30 minutes from the city center
  • Vehicle: Compact sedan

Summary of Jobs to Be Done:

  1. Finding Reliable Daily Parking: Chris needs a reliable, cost-effective parking solution close to the train station for his daily commute to the city.
  2. Time Efficiency: He values his time and wants a solution that minimizes the time spent searching for parking.
  3. Cost Savings: Chris wants to reduce his monthly parking expenses without compromising security and convenience.
  4. Convenience: He needs a seamless parking experience, allowing him to easily locate and pay for parking spaces.

Description of Most Important Problems and Associated Pain Points:

  1. Limited Availability: Chris often struggles to find an available parking space, particularly during peak hours, which causes stress and delays.
  2. High Parking Costs: Monthly parking rates near the train station are quite high, straining his budget.
  3. Inconvenient Payment Methods: Existing parking options often require cash payments or complicated mobile apps, making it inconvenient for him to pay for parking.
  4. Safety Concerns: He worries about the safety of his vehicle when parked in certain areas.

Summary of Underlying Causes for Those Problems:

  1. Limited Parking Infrastructure: The train station has insufficient parking infrastructure to accommodate all commuters during peak hours.
  2. High Demand for Limited Spaces: The high demand for parking near the train station results in inflated parking prices.
  3. Outdated Payment Systems: Existing parking solutions lack user-friendly payment methods, making it inconvenient for users like Chris.
  4. Security Concerns: Certain parking areas are poorly lit or lack security measures, raising safety concerns.

Summary of How They Meet Their Needs Now:
Chris currently meets his parking needs by:

  • Arriving at the train station very early to secure a parking spot.
  • Paying the expensive monthly parking fee, which affects his budget.
  • Using a mobile app to pay for parking, even though he finds it inconvenient.
  • Occasionally parking further from the station to save money, which results in a longer walk and safety concerns.

Understanding Chris’s pain points and needs is essential for developing a market opportunity in the commuter parking space. This information can guide the development of a solution that addresses the issues faced by people like him, ultimately leading to a more efficient, cost-effective, and convenient parking experience for commuters.

Update …

I am starting to wonder if the problem is with the way OpenAI handles function calls. I just built an API call that presents a follow up question from the user … this time just returning the regular message content and dispensing with the function call entirely. The answers are back to being full detail again.

New call, no function
"model": "gpt-3.5-turbo",
"max_tokens": 3000,
"temperature": 1,
"messages": [
{ "role": "user", "content": "Generate a persona to help me understand user needs in the following situation - . Include the following key information. A short title for the persona, some identifying characteristics, a summary of their jobs to be done, a description of their most important problems and associated pain points, a summary of the underlying causes for those problems, and a summary of how they meet their needs now." },
{ "role": "assistant", "content":
"Title - . Characteristics - . Jobs - . Problems - . Causes - . Options - " },
{ "role": "user", "content":
"Please explain the following in more detail - " }
]

More detailed response, string form, no function

  1. Identifying Characteristics:
  • Female: Important to note the persona’s gender as wedding planning often involves gender-specific roles and responsibilities.
  • 30 years old: Age plays a role in shaping one’s values, expectations, and the level of experience in attending weddings.
  • Works full-time: This indicates that the persona has limited time available for wedding planning and may need efficient solutions.
  • Conservative, traditional values: The persona prioritizes maintaining and adhering to established customs, rituals, and family traditions, which significantly influence decision-making during wedding planning.
  2. Jobs to be Done:
  • Planning a classic wedding: The persona aims to create a wedding that embodies traditional elements and follows societal expectations associated with weddings.
  • Maintaining family traditions: The persona seeks to incorporate specific cultural or family customs into the wedding to uphold the values and expectations of their family.
  3. Problems and Pain Points:
  • Limited options for traditional venues and vendors: The persona may struggle to find wedding venues and vendors that align with their traditional aesthetic and requirements.
  • Dealing with family expectations: The persona experiences pressure from their family to adhere to specific wedding traditions, leading to potential conflicts and anxieties.
  4. Underlying Causes for Problems:
  • Lack of access to traditional venues and vendors: Traditional venues and vendors may be limited in number or availability, making it challenging for the persona to find suitable options.
  • Societal pressure to conform to family traditions: External societal expectations and family pressure can restrict the persona’s choices and create tension between personal desires and family expectations.
  5. How They Meet Their Needs Now:
  • Relies on recommendations from family and friends: The persona seeks advice and suggestions from trusted individuals who have experience with traditional weddings.
  • Searches for traditional wedding venues and vendors: The persona conducts online searches or visits wedding expos to identify and connect with traditional vendors and venues.

Expanding on these aspects will provide a deeper understanding of the persona’s motivations, challenges, and current solutions in the context of planning a wedding with traditional values.

I have also noticed the API is significantly worse than ChatGPT. I have a language translation app, and I’m asking it to translate Norwegian to Chinese. ChatGPT handles it with no problem.

Example Prompt:

Translate the Norwegian text to Chinese, ignoring any English text. Say nothing other than the Chinese translation:

Sure! “Ansikt” means “face” in Norwegian. Here are some examples of different usages:

  1. Han har et vakkert ansikt. (He has a beautiful face.)
  2. Smile til noen med et uforstående ansikt. (Smile at someone with a puzzled face.)
  3. Hun skjuler ansiktet i skam. (She hides her face in shame.)
  4. Babyen har et søtt ansikt. (The baby has a cute face.)
  5. Hun har morsomme ansiktsuttrykk. (She has funny facial expressions.)

These examples showcase different ways “ansikt” can be used in Norwegian.

ChatGPT instantly responded (less than 5 seconds) with:


  1. Han har et vakkert ansikt.(他有一张漂亮的脸。)
  2. Smile til noen med et uforstående ansikt.(对着一个困惑的脸微笑。)
  3. Hun skjuler ansiktet i skam.(她因羞愧而藏起了脸。)
  4. Babyen har et søtt ansikt.(婴儿有一张可爱的脸。)
  5. Hun har morsomme ansiktsuttrykk.(她有有趣的面部表情。)


The API after 5 TRIES never responded. Each attempt had a 30 second timeout. The very next prompt the API had no problems. I haven’t figured out the pattern, but certain things seem to cause the API to choke that don’t cause any problems with ChatGPT itself.

I have more than 100 words that follow this example. The API, in addition to being extremely slow, is also wildly inaccurate by comparison.
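For calls that intermittently time out like this, one common workaround is to retry with an increasing delay. Below is a minimal, generic sketch; the helper name `call_with_retries` and its parameters are hypothetical, and `fn` stands for whatever function performs the actual API request. It does not explain why certain prompts stall.

```python
import time


def call_with_retries(fn, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn() until it succeeds or `attempts` tries have failed.

    After the i-th failure (counting from 0) the helper waits
    base_delay * 2**i seconds before trying again. The last error is
    re-raised if every attempt fails.
    """
    last_error = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as err:  # broad on purpose: this is a generic sketch
            last_error = err
            if i < attempts - 1:
                sleep(base_delay * 2 ** i)
    raise last_error
```

With the defaults this waits 1, 2, 4 and 8 seconds between the five attempts, so a batch of 100 words would keep going past an occasional stalled request instead of stopping on it.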


Yes, it’s horrible.

But the web app (I’m a plus also) has decreased so much in performance for me also.

So for me they are both pretty useless currently.

It’s horrible to think about when we know how good they were not long ago :sob:


OGB can you share the link to your code - appreciated

Any update on this issue? I ran into the same issue with GPT4-preview: Playground being far superior to the API call. What is happening? It should be the same with the same instructions.
I can replicate the issue because I want it to write proper HTML markup.
The API call fails every time, while Playground delivers 10/10.

What is happening?

Inconsistencies in the model, its API parameters, and its inputs.

There is also the fact that different AI output is expected on each run, and the satisfaction you feel may actually come from the diversity in language that results from the model being unpredictable, like a human.

I will use a top_p: 0.00001 to investigate for you, which makes successive runs much more similar.
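To illustrate why such a tiny top_p makes runs so similar, here is a toy version of nucleus (top-p) filtering. It is a textbook sketch with made-up probabilities, not OpenAI's actual sampler; the function name `nucleus` is my own.

```python
def nucleus(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {token: p / norm for token, p in kept}


if __name__ == "__main__":
    # Made-up next-token distribution for illustration.
    toy = {"the": 0.5, "a": 0.3, "an": 0.2}
    print(nucleus(toy, 0.00001))  # only the single most likely token survives
    print(nucleus(toy, 0.7))      # the two most likely tokens survive
```

With top_p at 0.00001 the first token alone already exceeds the threshold, so sampling effectively becomes "always pick the most likely token". Any remaining variation between runs has to come from the model's computed probabilities themselves.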

“View code” from the playground after setting up the job.

from openai import OpenAI
client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",  # the model named at the end of this post
    messages=[
        {"role": "system",
         "content": "You are a helpful AI assistant, with expertise in machine learning systems."},
        {"role": "user",
         "content": "I am currently exploring the \"mixtape algorithm\" of AI softmax, which presents an intriguing combination of softmax perplexity through the use of masks and feedforward techniques. However, I am curious about the significance of the softmax bottleneck in the context of matrix algebra, particularly when scaling AI models to 12k embeddings versus 100k tokens. I would appreciate a comprehensive scientific paper, complete with LaTeX formulas and a thorough examination of the latest research on the bottleneck problem in relation to modern AI models.\""},
    ],
    top_p=0.00001,  # the near-zero top_p mentioned above
)


As of my last update in April 2023, I can provide you with a foundational understanding of the softmax bottleneck and its significance in the context of scaling AI models, particularly when dealing with a large number of embeddings or tokens. While I can’t generate a new scientific paper complete with LaTeX formulas on the spot, I can certainly guide you through the concepts, relevant research, and potential directions for further exploration.

Understanding the Softmax Bottleneck

The softmax function is a crucial component in many machine learning models, especially in natural language processing (NLP) tasks. It converts the raw output scores (logits) from the model into probabilities by taking the exponential of each output and then normalizing these values by dividing by the sum of all the exponentials. This process allows the model to make probabilistic interpretations of the outputs.

The Bottleneck

The softmax bottleneck refers to a potential limitation in model capacity when using the softmax function in the output layer, particularly in language models. The issue


As of my last update, I can provide a conceptual overview and point you towards the kind of research and resources that might help you delve deeper into the topic of the softmax bottleneck, especially in the context of scaling AI models. However, I can’t generate or access real-time data or papers published after my last update. For the most current research, I recommend checking databases like arXiv for the latest papers.

Understanding the Softmax Bottleneck

The softmax function is a crucial component in many machine learning models, especially in natural language processing (NLP) tasks. It’s used in the output layer of neural network models to convert the raw output scores (logits) from the network into probabilities by taking the exponential of each output and then normalizing these values by dividing by the sum of all the exponentials. This process allows the model to make a probabilistic interpretation of the output classes.

The Softmax Bottleneck

The softmax bottleneck refers to a potential limitation in model capacity caused

The form of the output is nearly the same. Only gpt-4-turbo-preview’s high perplexity and built-in non-determinism give us different words.
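If you want a number instead of an eyeball comparison, a rough word-level similarity score can be computed with Python's standard `difflib`. This is only a sketch; the helper name `word_similarity` is my own, and the two sample strings are the openings of the last paragraphs of the two outputs above.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio computed over whitespace-split words."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


if __name__ == "__main__":
    run1 = ("The softmax bottleneck refers to a potential limitation in "
            "model capacity when using the softmax function")
    run2 = ("The softmax bottleneck refers to a potential limitation in "
            "model capacity caused")
    print(round(word_similarity(run1, run2), 2))
```

A ratio near 1 means the two runs share most of their wording in the same order. Running the same comparison on Playground and API outputs would show whether they differ more than two API runs differ from each other.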