Thank you all for these suggestions; I'll try a shorter, more logical prompt.
Pragglejaz Group. (2007). MIP: A Method for Identifying Metaphorically Used Words in Discourse. Metaphor and Symbol, 22(1), 1-39.
system = "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-3.5 architecture.\nKnowledge cutoff: 2021-09\nCurrent date: 2023-09-20"
If you scroll up to post 8, you can see I follow this idea of using the exact ChatGPT system prompt, but align the AI to the task it will perform, and also set sampling parameters so it produces only the most likely tokens instead of creative, diverse, and unexpected word choices.
If you have plugins enabled in ChatGPT, you can get the same AI model and performance over the API by including a dummy function definition.
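For example, here is a minimal sketch of that trick; the do_nothing function and its schema are placeholders I made up for illustration:

from openai import OpenAI

client = OpenAI()

# A dummy function definition; merely including it routes the request to the
# function-calling-enabled model variant. It is never meant to be invoked.
dummy_function = {
    "name": "do_nothing",
    "description": "Placeholder function that should never be called.",
    "parameters": {"type": "object", "properties": {}},
}

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    functions=[dummy_function],
    function_call="none",  # tell the model never to call the dummy
)
print(response.choices[0].message.content)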
ChatGPT can improve in a particular domain after several turns of conversation focused on the same task. The API, by contrast, gives you a brand-new "conversation" on every call if you are only doing one-turn jobs. Providing successful user/assistant interactions as fake chat history can improve performance, but with a long prompt it would be costly to resend everything each call, instead of doing what you do in ChatGPT, where the user says "You will now permanently act in a new way as a language analyst…"
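Here is a minimal sketch of that few-shot idea; the example turns are placeholders standing in for your own previously successful exchanges:

from openai import OpenAI

client = OpenAI()

# Seed the "conversation" with a fabricated successful exchange so the model
# imitates it on the real request. The turns below are made-up placeholders.
messages = [
    {"role": "system", "content": "You are a language analyst."},
    {"role": "user", "content": "Analyze: 'Time is money.'"},
    {"role": "assistant", "content": "Metaphor: TIME IS A RESOURCE. The verb sense is..."},
    {"role": "user", "content": "Analyze: 'He attacked my argument.'"},
]

response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)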
Additionally, if you want full emulation of Custom Instructions, you give a second system message:
system2 = "The user provided the additional info about how they would like you to respond:\n\"(insert your string)\""
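Putting the two together, a sketch of the full emulation (the custom-instructions text is a placeholder you would replace with your own):

system = (
    "You are ChatGPT, a large language model trained by OpenAI, "
    "based on the GPT-3.5 architecture.\n"
    "Knowledge cutoff: 2021-09\nCurrent date: 2023-09-20"
)
system2 = (
    "The user provided the additional info about how they would like you to respond:\n"
    "\"Respond tersely and cite sources.\""  # placeholder custom instructions
)

messages = [
    {"role": "system", "content": system},
    {"role": "system", "content": system2},
    {"role": "user", "content": "Hello!"},
]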
May I ask what system prompt you used with the GPT-4 API to get similar results? Your current message works for me on GPT-3.5 but not on GPT-4. Many thanks!
I’m curious what “doesn’t work” exactly.
You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.
You are chatting with the user via the ChatGPT iOS app. This means most of the time your lines should be a sentence or two, unless the user's request requires reasoning or long-form outputs. Never use emojis, unless explicitly asked to.
Knowledge cutoff: 2022-01
Current date: 2023-10-02
You can see that this is a variant system prompt being used for some ChatGPT GPT-4 users, and it apparently comes with a different knowledge-cutoff date that is also being silently tested. The mobile-app instruction it includes will likewise affect the output when you compare against what you see on your phone.
That GPT-4 can be used with plugins, with internet browsing, or with advanced data analysis, or can even be an unannounced beta version trialed on a subset of users, all with different fine-tuning, means you indeed might not replicate how ChatGPT responds to you, even when you start a new conversation. Nor would an API user want less than a stable programming environment, with a low, controlled temperature that makes every output predictable instead of overly "creative".
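For instance, a sketch of sampling settings that trade creativity for repeatability; the seed parameter is best-effort and only honored by newer API models, and the values here are just illustrative:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Classify this sentence: ..."}],
    temperature=0,  # always take (nearly) the most likely token
    top_p=1,
    seed=42,        # best-effort reproducibility across runs on newer models
    max_tokens=500,
)
print(response.choices[0].message.content)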
Don’t be ChatGPT - be you!
I get this too. Here is the prompt I am using (i) via the API (where the response is parsed back into JSON) and (ii) directly in ChatGPT. For the API call, the token count is set to 3000 and the temperature to 1. I am using gpt-3.5-turbo. Clearly the response via the API is significantly less detailed and way shorter. Any ideas why, or how to fix it?
The Prompt
I want to generate research into a potential market opportunity. My focus is on the following situation - people looking for a parking space, and the following types of user - commuters, tradespeople, shop workers, tourists. Generate a random persona that fits this brief. Organise your results by persona, including the following key information. Give them a short title, some identifying characteristics, a summary of their jobs to be done, a description of their most important problems and associated pain points, a summary of the underlying causes for those problems, and a summary of how they meet their needs now.
API results
ChatGPT results
Persona: Commuter Chris
Identifying Characteristics:
Summary of Jobs to Be Done:
Description of Most Important Problems and Associated Pain Points:
Summary of Underlying Causes for Those Problems:
Summary of How They Meet Their Needs Now:
Chris currently meets his parking needs by:
Understanding Chris’s pain points and needs is essential for developing a market opportunity in the commuter parking space. This information can guide the development of a solution that addresses the issues faced by people like him, ultimately leading to a more efficient, cost-effective, and convenient parking experience for commuters.
Update …
I am starting to wonder if the problem is with the way OpenAI handles function calls. I just built an API call that presents a follow-up question from the user, this time returning the regular message content and dispensing with the function call entirely. The answers are back to full detail again.
New call, no function
{
  "model": "gpt-3.5-turbo",
  "max_tokens": 3000,
  "temperature": 1,
  "messages": [
    { "role": "user", "content": "Generate a persona to help me understand user needs in the following situation - . Include the following key information. A short title for the persona, some identifying characteristics, a summary of their jobs to be done, a description of their most important problems and associated pain points, a summary of the underlying causes for those problems, and a summary of how they meet their needs now." },
    { "role": "assistant", "content": "Title - . Characteristics - . Jobs - . Problems - . Causes - . Options - " },
    { "role": "user", "content": "Please explain the following in more detail - " }
  ]
}
More detailed response, string form, no function
Expanding on these aspects will provide a deeper understanding of the persona’s motivations, challenges, and current solutions in the context of planning a wedding with traditional values.
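For comparison, here is a sketch of what changes when a function is attached: the answer comes back as function-call arguments instead of plain message content, which (as this post suggests) tends to be terser. The record_persona schema is a made-up placeholder:

import json

from openai import OpenAI

client = OpenAI()

# Hypothetical schema that forces the persona into structured arguments.
persona_fn = {
    "name": "record_persona",
    "description": "Record a generated persona.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "problems": {"type": "string"},
        },
        "required": ["title", "problems"],
    },
}

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Generate a persona to help me understand user needs..."}],
    functions=[persona_fn],
    function_call={"name": "record_persona"},  # force the structured path
)

msg = response.choices[0].message
if msg.function_call:                     # structured (often terser) output
    print(json.loads(msg.function_call.arguments))
else:                                     # plain message content
    print(msg.content)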
I have also noticed the API is significantly worse than ChatGPT. I have a language-translation app, and I'm asking it to translate Norwegian to Chinese. ChatGPT handles it no problem.
Example Prompt:
Translate the Norwegian text to Chinese, ignoring any English text. Say nothing other than the Chinese translation:
Sure! “Ansikt” means “face” in Norwegian. Here are some examples of different usages:
- Han har et vakkert ansikt. (He has a beautiful face.)
- Smile til noen med et uforstående ansikt. (Smile at someone with a puzzled face.)
- Hun skjuler ansiktet i skam. (She hides her face in shame.)
- Babyen har et søtt ansikt. (The baby has a cute face.)
- Hun har morsomme ansiktsuttrykk. (She has funny facial expressions.)
These examples showcase different ways “ansikt” can be used in Norwegian.
ChatGPT instantly responded (less than 5 seconds) with:
当然!“Ansikt”在挪威语中意为“脸”。以下是不同用法的一些示例:
- Han har et vakkert ansikt.(他有一张漂亮的脸。)
- Smile til noen med et uforstående ansikt.(对着一个困惑的脸微笑。)
- Hun skjuler ansiktet i skam.(她因羞愧而藏起了脸。)
- Babyen har et søtt ansikt.(婴儿有一张可爱的脸。)
- Hun har morsomme ansiktsuttrykk.(她有有趣的面部表情。)
这些示例展示了“ansikt”在挪威语中的不同用法。
After 5 tries, the API never responded; each attempt hit a 30-second timeout. The very next prompt, the API had no problems. I haven't figured out the pattern, but certain inputs seem to make the API choke in ways that cause no problem in ChatGPT itself.
I have more than 100 words that follow this example. The API, in addition to being extremely slow by comparison, is also wildly inaccurate.
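One mitigation worth sketching: raise the client timeout and retry budget, and stream the output so slow generations still show progress. The parameter values are only examples:

from openai import OpenAI

# Longer per-request timeout plus automatic retries for flaky calls.
client = OpenAI(timeout=60.0, max_retries=3)

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Translate the Norwegian text to Chinese..."}],
    stream=True,  # tokens arrive as they are generated
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")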
Yes, it’s horrible.
But the web app (I'm a Plus subscriber too) has also decreased so much in performance for me.
So for me they are both pretty useless currently.
It's horrible to think about when we know how good they were not long ago.
OGB, can you share the link to your code? Appreciated.
Any update on this issue? I ran into the same issue with GPT-4-preview: the Playground is far superior to the API call. What is happening? It should be the same with the same instructions.
I can replicate the issue because I want it to write proper HTML markup.
The API call fails every time, while the Playground delivers 10/10.
What is happening?
Inconsistencies in the model, its API parameters, and its inputs.
Then there is also the fact that different AI output is expected on each run, and the satisfaction you feel may actually come from the diversity of language caused by the model being unpredictable, like a human.
I will use top_p: 0.00001 to investigate for you, which makes successive runs much more similar.
“View code” from the playground after setting up the job.
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful AI assistant, with expertise in machine learning systems."
        },
        {
            "role": "user",
            "content": "I am currently exploring the \"mixtape algorithm\" of AI softmax, which presents an intriguing combination of softmax perplexity through the use of masks and feedforward techniques. However, I am curious about the significance of the softmax bottleneck in the context of matrix algebra, particularly when scaling AI models to 12k embeddings versus 100k tokens. I would appreciate a comprehensive scientific paper, complete with LaTeX formulas and a thorough examination of the latest research on the bottleneck problem in relation to modern AI models."
        }
    ],
    temperature=1,
    max_tokens=200,
    top_p=0.00001,  # near-deterministic: only the most likely token survives nucleus sampling
    frequency_penalty=0,
    presence_penalty=0
)
print(response.choices[0].message.content)
As of my last update in April 2023, I can provide you with a foundational understanding of the softmax bottleneck and its significance in the context of scaling AI models, particularly when dealing with a large number of embeddings or tokens. While I can’t generate a new scientific paper complete with LaTeX formulas on the spot, I can certainly guide you through the concepts, relevant research, and potential directions for further exploration.
Understanding the Softmax Bottleneck
The softmax function is a crucial component in many machine learning models, especially in natural language processing (NLP) tasks. It converts the raw output scores (logits) from the model into probabilities by taking the exponential of each output and then normalizing these values by dividing by the sum of all the exponentials. This process allows the model to make probabilistic interpretations of the outputs.
The Bottleneck
The softmax bottleneck refers to a potential limitation in model capacity when using the softmax function in the output layer, particularly in language models. The issue
As of my last update, I can provide a conceptual overview and point you towards the kind of research and resources that might help you delve deeper into the topic of the softmax bottleneck, especially in the context of scaling AI models. However, I can’t generate or access real-time data or papers published after my last update. For the most current research, I recommend checking databases like arXiv for the latest papers.
Understanding the Softmax Bottleneck
The softmax function is a crucial component in many machine learning models, especially in natural language processing (NLP) tasks. It’s used in the output layer of neural network models to convert the raw output scores (logits) from the network into probabilities by taking the exponential of each output and then normalizing these values by dividing by the sum of all the exponentials. This process allows the model to make a probabilistic interpretation of the output classes.
The Softmax Bottleneck
The softmax bottleneck refers to a potential limitation in model capacity caused
The form of the output is nearly the same; only gpt-4-turbo-preview's high perplexity and built-in non-determinism give us different words.
I use ChatGPT a lot for translation (English ↔ Chinese). There is a huge difference between the website and the API (with GPT-4o).
Did you find any solution?
I am sending the following prompt via the API. When I validate the data, it is very inaccurate. When I send the same prompt via the web, the returned result is accurate. I am using the same OpenAI model for both. The format isn't the issue; it is the data that is returned.
Generate 5 upcoming realistic Football events for the University of Florida with actual venues and dates that start on or after 2025-03-02 and sort them by date in ascending order. Return a JSON array:
[{
  "id": "FTBLL-###",
  "sport": "Football",
  "homeTeam": "University of Florida",
  "awayTeam": "Team Name",
  "date": "YYYY-MM-DD",
  "time": "H:MM AM/PM",
  "location": "Venue",
  "isHome": true/false
}]
The first JSON array is via API. The second is via web.
First JSON array:
{
  "rawResponses": {
    "Football": "[\n  {\n    \"id\": \"FOOTBALL-001\",\n    \"sport\": \"Football\",\n    \"homeTeam\": \"University of Florida\",\n    \"awayTeam\": \"University of Miami\",\n    \"date\": \"2025-09-07\",\n    \"time\": \"3:30 PM\",\n    \"location\": \"Ben Hill Griffin Stadium\",\n    \"isHome\": true\n  },\n  {\n    \"id\": \"FOOTBALL-002\",\n    \"sport\": \"Football\",\n    \"homeTeam\": \"University of Florida\",\n    \"awayTeam\": \"LSU\",\n    \"date\": \"2025-09-21\",\n    \"time\": \"7:00 PM\",\n    \"location\": \"Ben Hill Griffin Stadium\",\n    \"isHome\": true\n  },\n  {\n    \"id\": \"FOOTBALL-003\",\n    \"sport\": \"Football\",\n    \"homeTeam\": \"University of Florida\",\n    \"awayTeam\": \"Georgia\",\n    \"date\": \"2025-10-28\",\n    \"time\": \"3:30 PM\",\n    \"location\": \"TIAA Bank Field\",\n    \"isHome\": false\n  },\n  {\n    \"id\": \"FOOTBALL-004\",\n    \"sport\": \"Football\",\n    \"homeTeam\": \"University of Florida\",\n    \"awayTeam\": \"Tennessee\",\n    \"date\": \"2025-11-04\",\n    \"time\": \"12:00 PM\",\n    \"location\": \"Ben Hill Griffin Stadium\",\n    \"isHome\": true\n  },\n  {\n    \"id\": \"FOOTBALL-005\",\n    \"sport\": \"Football\",\n    \"homeTeam\": \"University of Florida\",\n    \"awayTeam\": \"Florida State\",\n    \"date\": \"2025-11-25\",\n    \"time\": \"7:30 PM\",\n    \"location\": \"Ben Hill Griffin Stadium\",\n    \"isHome\": true\n  }\n]"
  },
  "timestamp": "2025-03-02T16:55:05.043Z",
  "school": "University of Florida"
}
Second JSON array:
[
  {
    "id": "FTBLL-001",
    "sport": "Football",
    "homeTeam": "University of Florida",
    "awayTeam": "LIU",
    "date": "2025-08-30",
    "time": "TBA",
    "location": "Ben Hill Griffin Stadium, Gainesville, FL",
    "isHome": true
  },
  {
    "id": "FTBLL-002",
    "sport": "Football",
    "homeTeam": "University of Florida",
    "awayTeam": "South Florida",
    "date": "2025-09-06",
    "time": "TBA",
    "location": "Ben Hill Griffin Stadium, Gainesville, FL",
    "isHome": true
  },
  {
    "id": "FTBLL-003",
    "sport": "Football",
    "homeTeam": "LSU",
    "awayTeam": "University of Florida",
    "date": "2025-09-13",
    "time": "TBA",
    "location": "Tiger Stadium, Baton Rouge, LA",
    "isHome": false
  },
  {
    "id": "FTBLL-004",
    "sport": "Football",
    "homeTeam": "Miami",
    "awayTeam": "University of Florida",
    "date": "2025-09-20",
    "time": "TBA",
    "location": "Hard Rock Stadium, Miami Gardens, FL",
    "isHome": false
  },
  {
    "id": "FTBLL-005",
    "sport": "Football",
    "homeTeam": "University of Florida",
    "awayTeam": "Texas",
    "date": "2025-10-04",
    "time": "TBA",
    "location": "Ben Hill Griffin Stadium, Gainesville, FL",
    "isHome": true
  }
]
ChatGPT has access to an internet search tool that allows it to obtain recent information.
You must ground the API models yourself: provide internet search as a function backed by external utilities you supply, or retrieve knowledge through a service and inject it into the prompt as search results or semantic context.
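Here is a minimal sketch of that grounding pattern; search_web is a hypothetical function you would implement against a real search service:

import json

from openai import OpenAI

client = OpenAI()

def search_web(query: str) -> str:
    """Hypothetical helper: query your own search backend, return text snippets."""
    raise NotImplementedError("wire this up to a real search service")

tools = [{
    "type": "function",
    "function": {
        "name": "search_web",
        "description": "Search the internet for current information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "List University of Florida football games after 2025-03-02."}]
response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

msg = response.choices[0].message
if msg.tool_calls:
    call = msg.tool_calls[0]
    result = search_web(**json.loads(call.function.arguments))
    # Feed the retrieved text back so the model answers from grounded data.
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(response.choices[0].message.content)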
The AI models have a knowledge cutoff, at the most recent late 2023. AI models are created by a long training process that involves gathering information and then training on that fixed body of knowledge.
The API is for developing your own applications using the base models. ChatGPT is an end-user product, like the ones you can build.