Here’s a new mantra for you after really giving this model a workout:
It can reason once - but it can’t be reasoned with
It can produce a small app from nothing remarkably well, which gives you hope that it has deep understanding and large output capability. Working on an existing code base, however, is quite fruitless.
Ask either model to improve a few-hundred-line script against a few specifications, and the changes are done wrong in so many ways; there is little comprehension of what an insightful way to implement them would be.
Then, when you try to correct it, the output just gets worse and worse, until you get your original code back, mangled and missing the previous changes. I tried copying the useful snippets back into the original input and starting a new chat, but that was just more and more prompt writing that could have been me writing the desired code myself. The whole chat was abandoned.
Faster than any model so far, this one gets hung up and can’t improve.
Understanding someone else’s complex code, and forming the mental model of what the developer already understands and wants, is one of the challenges in making examples of AI problems, so here’s something that everybody should be able to understand.
I edited this code specifically for AI understanding, leaving a place for the new implementation of a very small task (which looks large in the forum, so sorry for the dump).
Introduction to new models
The o1-mini and o1-preview models just introduced have these restrictions on what parameters and messages can be sent to the API:
Beta Limitations
During the beta phase, many chat completion API parameters are not yet available. Most notably:
Modalities: text only, images are not supported.
Message types: user and assistant messages only, system messages are not supported.
Streaming: not supported.
Tools: tools, function calling, and response format parameters are not supported.
Logprobs: not supported.
Other: temperature, top_p and n are fixed at 1, while presence_penalty and frequency_penalty are fixed at 0.
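Since these parameters are rejected or pinned server-side during the beta, one defensive pattern is to strip them from a request before sending. A minimal sketch; the helper name and exact key list are my own, mirroring the limitations above:

```python
# Chat completion parameters the o1 beta does not accept (per the list
# above); temperature/top_p/n and the penalties are fixed server-side.
O1_UNSUPPORTED = (
    "temperature", "top_p", "n", "presence_penalty",
    "frequency_penalty", "stream", "logit_bias", "logprobs", "tools",
)

def strip_o1_params(params: dict) -> dict:
    """Return a copy of the request parameters safe for o1-* models."""
    if params.get("model", "").startswith("o1"):
        return {k: v for k, v in params.items() if k not in O1_UNSUPPORTED}
    return dict(params)
```

Non-o1 models pass through untouched, so the same call site can serve both model families.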
Code snippet from much larger base
system_text = """
You are a programmer's expert assistant.
""".strip()
while user[0]['content'] not in ["", "exit"]:
    # create the first message that gives AI its identity
    sent_system = [{"role": "system", "content": system_text}]
    # choose the identity message actually sent, based on the model
    system = sent_system
    all_parameters = {
        "messages": system + chat[-turns*2:] + user,  # concatenate lists
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,  # maximum response length
        "stop": "",
        "top_p": top_p,
        "presence_penalty": 0.0,  # penalties -2.0 to 2.0
        "frequency_penalty": 0.0,  # frequency = cumulative score
        "n": 1,
        "stream": True,
        "logit_bias": {"100066": -1},  # example, '~\n\n' token
        "user": "site_user-id",
    }
    # create the sent parameters, doing any required modifications here
    kwargs = all_parameters
    # Call the chat API with the openai library using model parameters
    response = openai.chat.completions.create(**kwargs)
    ...
Task
Modify the code in this manner:
If the model name starts with o1, then:
- the sent_system must be a user role message instead
- A prefix “# AI permanent identity, behavior, and style” must be added to the sent_system message contents
- all the parameters previously described as not permitted must be deleted from the kwargs before sending
Certainly! Here’s the modified code based on your instructions:
system_text = """
You are a programmer's expert assistant.
""".strip()
while user[0]['content'] not in ["", "exit"]:
    # create the first message that gives AI its identity
    sent_system = [{"role": "system", "content": system_text}]
    if model.startswith('o1'):
        # Change sent_system to a user role and add the required prefix
        sent_system = [{
            "role": "user",
            "content": "# AI permanent identity, behavior, and style\n" + system_text
        }]
    # Assuming 'system' includes 'sent_system'; adjust as necessary
    system = sent_system
    all_parameters = {
        "messages": system + chat[-turns*2:] + user,  # concatenate lists
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,  # maximum response length
        "stop": "",
        "top_p": top_p,
        "presence_penalty": 0.0,  # penalties -2.0 to 2.0
        "frequency_penalty": 0.0,  # frequency = cumulative score
        "n": 1,
        "stream": True,
        "logit_bias": {"100066": -1},  # example, '~\n\n' token
        "user": "site_user-id",
    }
    # create the sent parameters, doing any required modifications here
    kwargs = all_parameters
    if model.startswith('o1'):
        # Remove parameters not permitted for o1 models
        keys_to_remove = [
            'temperature', 'top_p', 'n', 'presence_penalty',
            'frequency_penalty', 'stream', 'logit_bias'
        ]
        for key in keys_to_remove:
            kwargs.pop(key, None)
    # Call the chat API with the openai library using model parameters
    response = openai.ChatCompletion.create(**kwargs)
    # ...
Explanation of Changes:
Notes:
- Ensure that any other parts of your code that reference system or sent_system are adjusted accordingly.
- The openai.ChatCompletion.create method is used instead of openai.chat.completions.create to match the correct OpenAI library usage.
No, I gave you the system = sent_system line specifically so you could have an if condition that sets system from two different choices based on the model.
Non-improvement, and now a whole bunch of chat talking about the prior step rather than the lack of changes here, and then even more repetition:
Certainly! Apologies for any confusion earlier. Here’s the modified code based on your instructions, using the system = sent_system line to set system differently depending on the model:
system_text = """
You are a programmer's expert assistant.
""".strip()
while user[0]['content'] not in ["", "exit"]:
    # create the first message that gives AI its identity
    sent_system = [{"role": "system", "content": system_text}]
    if model.startswith('o1'):
        # Change sent_system to a user role and add the required prefix
        sent_system = [{
            "role": "user",
            "content": "# AI permanent identity, behavior, and style\n" + system_text
        }]
    # Set 'system' based on the model
    system = sent_system
    all_parameters = {
        "messages": system + chat[-turns*2:] + user,  # concatenate lists
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,  # maximum response length
        "stop": "",
        "top_p": top_p,
        "presence_penalty": 0.0,  # penalties -2.0 to 2.0
        "frequency_penalty": 0.0,  # frequency = cumulative score
        "n": 1,
        "stream": True,
        "logit_bias": {"100066": -1},  # example, '~\n\n' token
        "user": "site_user-id",
    }
    # create the sent parameters, doing any required modifications here
    kwargs = all_parameters
    if model.startswith('o1'):
        # Remove parameters not permitted for o1 models
        keys_to_remove = [
            'temperature', 'top_p', 'n', 'presence_penalty',
            'frequency_penalty', 'stream', 'logit_bias'
        ]
        for key in keys_to_remove:
            kwargs.pop(key, None)
    # Call the chat API with the openai library using model parameters
    response = openai.ChatCompletion.create(**kwargs)
    # ...
Explanation of Changes:
- Modify sent_system based on model:
  - If the model name starts with 'o1', we change the role in sent_system from "system" to "user".
  - We prepend "# AI permanent identity, behavior, and style\n" to the content of sent_system.
- Set system = sent_system:
  - After potentially modifying sent_system, we set system = sent_system to use it in the messages.
- Adjust parameters (kwargs):
  - Before sending the request, we remove all parameters not permitted for 'o1' models from kwargs.
  - The parameters removed are: 'temperature', 'top_p', 'n', 'presence_penalty', 'frequency_penalty', 'stream', and 'logit_bias'.
Notes:
Final Adjusted Code:
Putting it all together, here’s the complete modified code snippet:
import openai

system_text = """
You are a programmer's expert assistant.
""".strip()
while user[0]['content'] not in ["", "exit"]:
    # Create the first message that gives AI its identity
    sent_system = [{"role": "system", "content": system_text}]
    if model.startswith('o1'):
        # Change sent_system to a user role and add the required prefix
        sent_system = [{
            "role": "user",
            "content": "# AI permanent identity, behavior, and style\n" + system_text
        }]
    # Set 'system' based on the model
    system = sent_system
    all_parameters = {
        "messages": system + chat[-turns*2:] + user,  # Concatenate lists
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,  # Maximum response length
        "stop": "",
        "top_p": top_p,
        "presence_penalty": 0.0,  # Penalties -2.0 to 2.0
        "frequency_penalty": 0.0,  # Frequency = cumulative score
        "n": 1,
        "stream": True,
        "logit_bias": {"100066": -1},  # Example, '~\n\n' token
        "user": "site_user-id",
    }
    # Create the sent parameters, doing any required modifications here
    kwargs = all_parameters
    if model.startswith('o1'):
        # Remove parameters not permitted for o1 models
        keys_to_remove = [
            'temperature', 'top_p', 'n', 'presence_penalty',
            'frequency_penalty', 'stream', 'logit_bias'
        ]
        for key in keys_to_remove:
            kwargs.pop(key, None)
    # Call the chat API with the OpenAI library using model parameters
    response = openai.ChatCompletion.create(**kwargs)
    # Process the response as needed
    # ...
Summary:
- We modified sent_system conditional on the model name.
- We ensured system is set appropriately using system = sent_system.
- We removed disallowed parameters from kwargs when using 'o1' models.
- We adjusted the API call to openai.ChatCompletion.create.
Let me know if there’s anything else you’d like me to adjust!
That is one input; one output not utilizing the space I made for the AI with a variable assignment; one input asking for the desired placement; and then code, chat, and code again, still not improved. Argh.
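For reference, the edit I was after uses the system = sent_system slot as the branch point. A minimal sketch of that loop-body logic, wrapped in a hypothetical helper of my own so it stands alone; the message shapes and key names match the snippet above:

```python
def prepare_o1_kwargs(model: str, system_text: str, all_parameters: dict):
    """Pick the identity message and prune kwargs based on the model name."""
    sent_system = [{"role": "system", "content": system_text}]
    if model.startswith("o1"):
        # o1 models reject the system role: send the identity as a
        # prefixed user-role message instead
        system = [{
            "role": "user",
            "content": "# AI permanent identity, behavior, and style\n"
                       + system_text,
        }]
    else:
        system = sent_system
    kwargs = dict(all_parameters)  # copy, so the template stays intact
    if model.startswith("o1"):
        # Drop the parameters the o1 beta does not accept
        for key in ("temperature", "top_p", "n", "presence_penalty",
                    "frequency_penalty", "stream", "logit_bias"):
            kwargs.pop(key, None)
    return system, kwargs
```

The if/else picks system from two choices in one place, which is the whole point of having staged the assignment in the original code.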