Stupid Engineers! Decided to upgrade my production env!

Apparently my environment at work needed upgrading, so apparently I now need to update our chat app, which is live, and which they didn’t tell me about until after they did it. I’m trying to find the answer to a problem that seems idiotically elusive.

I get that you decided to make radical changes to the OpenAI client (why, I have no idea, and why it’s not backwards compatible is just sloppy work).

So my code now throws a “friendly” error telling me to visit some migration guide. I go to the GitHub repo and read the README as suggested.

I need a simple answer… I don’t want to send the client a list of messages, I want to send it my prompt as I have from day one… nowhere in the documentation does it show how to simply send a prompt!

This is my code. This is what has been working for almost a year. Why oh why!

    def _count_tokens(self, text):
        # Simple token count function
        return len(openai.Completion.create(
            engine="text-davinci-003",
            prompt=text,
            max_tokens=0
        )["choices"][0]["logprobs"]["tokens"])

    def compile_prompt(self, prompt):
        character_data = self.mongo_handler.get_character_data(self.characterID)
        system_message = self.mongo_handler.get_system_message(self.characterID)
        
        # Ensure system_message doesn't exceed max_system_prompt_tokens
        system_message_tokens = self._count_tokens(system_message)
        if system_message_tokens > self.max_system_prompt_tokens:
            raise ValueError(f"System message exceeds {self.max_system_prompt_tokens} tokens")

        message_history = self.mongo_handler.get_message_history(self.sessionID)
        
        # Limit the message history
        limited_message_history = message_history[-self.message_history_limit:]

        full_prompt = f"{system_message}\n\n"
        for message in limited_message_history:
            full_prompt += f"{message}\n"
        full_prompt += f"User: {prompt}"

        return full_prompt

    def invoke(self, prompt):
        full_prompt = self.compile_prompt(prompt)
        
        try:
            response = openai.Completion.create(
                engine=self.model,
                prompt=full_prompt,
                **self.optional_params
            )
            result = response.choices[0].text.strip()
            
            if not result:
                raise ValueError("Empty response from primary API")
        
        except (openai.error.OpenAIError, ValueError) as e:
            if self.fallback_api_key and self.fallback_api_endpoint:
                openai.api_key = self.fallback_api_key
                openai.api_base = self.fallback_api_endpoint

                try:
                    response = openai.Completion.create(
                        engine=self.fallback_model,
                        prompt=full_prompt,
                        **self.optional_params
                    )
                    result = response.choices[0].text.strip()
                except openai.error.OpenAIError as fallback_error:
                    raise RuntimeError(f"Both primary and fallback API calls failed: {fallback_error}")
            else:
                raise RuntimeError(f"Primary API call failed and no fallback API configured: {e}")

        self.mongo_handler.add_message_to_history(self.sessionID, "User", prompt)
        self.mongo_handler.add_message_to_history(self.sessionID, "AI", result)
        
        return {
            "response": result,
            "characterID": self.characterID,
            "sessionID": self.sessionID
        }

You can try using the gpt-3.5-turbo-instruct model, which is the latest/greatest “Completion”-style model. The whole GPT-4 family is ChatML, or “Chat Completion”, only. Note that Completions is deemed “Legacy”, so 3.5 might be the last model in that format…
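
If you go that route, here is a rough sketch of what the call looks like with the new v1.x client — assuming the openai>=1.0 library is what got installed; the prompt text and max_tokens value are just placeholders, not from your code:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Legacy-style "Completion" call in the v1.x client: prompt in, text out.
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt="Once upon a time",
        max_tokens=100,
    )
    print(response.choices[0].text.strip())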

Hope this helps.

Hi,
Yeah, the issue seems to be with the library itself. I send it a
prompt=full_prompt

prompt value and it returns an exception that says I can only add “Messages”, which in my understanding are JSON objects, and my database contains strings. So the way I’m looking at this, I have to reformat my entire payload to be compatible with “messages”, which seems… dumbfounding at best. They couldn’t have removed this ability, could they?

chat.completions.create() is not backwards compatible with Completion.create(), so where did this function go?

This is a bit old, but they break it down here…

Basically, instead of sending a single prompt, you send a System prompt and user / assistant pairs. It’s a bit more complicated, but it gives you a bit more control.

What version of the library are you using?

Do you have more code you can share around the prompt bit?

Also…

Moving from Completions to Chat Completions in the OpenAI API

How to migrate from the legacy OpenAI Completions API to Chat Completions

Chat Completions is the standard API to use with OpenAI’s latest models. You can learn about getting started with it using our text generation developer guide.

Prompts to Messages

To have a more interactive and dynamic conversation with our models, you can use messages in chat format instead of the legacy prompt style used with Completions.

Here’s how it works:

  • Instead of sending a single string as your prompt, you send a list of messages as your input.

  • Each message in the list has two properties: role and content.

    • The ‘role’ can take one of three values: ‘system’, ‘user’, or ‘assistant’.

    • The ‘content’ contains the text of the message from the role.

  • The system instruction can give high-level instructions for the conversation.

  • The messages are processed in the order they appear in the list, and the assistant responds accordingly.

Even basic Completions requests can be completed through Chat Completions, as you can see below:

Then: 'prompt': 'tell me a joke'
Now: 'messages': [{'role': 'user', 'content': 'tell me a joke'}]

Now, it is easier than ever to have back-and-forths with the model by extending the list of messages in the conversation.

'messages': [{'role': 'user', 'content': 'tell me a joke'},
             {'role': 'assistant', 'content': 'why did the chicken cross the road'},
             {'role': 'user', 'content': "I don't know, why did the chicken cross the road"}]

System Instructions

You can also use a system-level instruction to guide the model’s behavior throughout the conversation. For example, using a system instruction and a message like this:

'messages': [{'role': 'system', 'content': 'You are an assistant that speaks like Shakespeare.'},
             {'role': 'user', 'content': 'tell me a joke'}]

will result in something like

{...
 'message': {'role': 'assistant',
             'content': 'Why did the chicken cross the road? To get to the other side, but verily, the other side was full of peril and danger, so it quickly returned from whence it came, forsooth!'}
...}

If you want to explore options that don’t involve having to manage the message conversation history yourself, check out the Assistants API.

https://help.openai.com/en/articles/7042661-moving-from-completions-to-chat-completions-in-the-openai-api
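
To make that concrete in code, here is a minimal sketch of the Shakespeare example above using the current v1.x Python client — assuming openai>=1.0 is installed; the model name is just an example:

    from openai import OpenAI

    client = OpenAI()

    # Chat Completions: a list of role/content messages instead of one prompt string.
    response = client.chat.completions.create(
        model="gpt-4o",  # any chat-capable model works here
        messages=[
            {"role": "system", "content": "You are an assistant that speaks like Shakespeare."},
            {"role": "user", "content": "tell me a joke"},
        ],
    )
    print(response.choices[0].message.content)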

Well, this is the first issue. See, I compile my string prompt by concatenating many strings, so I can use it for the token safety net above.

    def compile_prompt(self, prompt):
        character_data = self.mongo_handler.get_character_data(self.characterID)
        system_message = self.mongo_handler.get_system_message(self.characterID)
        
        # Ensure system_message doesn't exceed max_system_prompt_tokens
        system_message_tokens = self._count_tokens(system_message)
        if system_message_tokens > self.max_system_prompt_tokens:
            raise ValueError(f"System message exceeds {self.max_system_prompt_tokens} tokens")

        message_history = self.mongo_handler.get_message_history(self.sessionID)
        
        # Limit the message history
        limited_message_history = message_history[-self.message_history_limit:]

        full_prompt = f"{system_message}\n\n"
        for message in limited_message_history:
            full_prompt += f"{message}\n"
        full_prompt += f"User: {prompt}"

        return full_prompt

Next, this grabs strings… first the guidelines, then the message history, and last it appends the user’s input.

So here is how my Mongo DB handler pulls these strings.

    def get_message_history(self, sessionID):
        # Return the stored message list for this session, or an empty list.
        session = self.sessions.find_one({"character.SessionID": sessionID})
        return session["messages"] if session else []

    def add_message_to_history(self, sessionID, sender, message):
        # Append a message to the session document, creating the document if needed.
        self.sessions.update_one(
            {"character.SessionID": sessionID},
            {"$push": {"messages": {"sender": sender, "message": message}}},
            upsert=True
        )

So basically it’s a single document with many, many strings, and it sounds like I would have to modify my DB to store objects instead (or create the objects on the fly).

Yeah, I would just create the objects on the fly.

One system prompt for “instructions”

Then user/assistant pairs. The user content would be the user’s message, and the assistant content would be the LLM’s response.

It’s a bit more work, but you get to use the new GPT-4o, and it does work better in some instances.
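
For example, given the sender/message entries your Mongo handler already pushes, something roughly like this could build the messages list on the fly. This is only a sketch — the helper name build_messages and the mapping of "AI" to the assistant role are my guesses based on the code you posted:

    def build_messages(self, prompt):
        # Hypothetical helper: assemble a Chat Completions "messages" list
        # from the same data compile_prompt() already pulls.
        system_message = self.mongo_handler.get_system_message(self.characterID)
        history = self.mongo_handler.get_message_history(self.sessionID)
        limited_history = history[-self.message_history_limit:]

        messages = [{"role": "system", "content": system_message}]
        for entry in limited_history:
            if isinstance(entry, dict):
                # add_message_to_history() stores {"sender": ..., "message": ...}.
                role = "assistant" if entry.get("sender") == "AI" else "user"
                content = entry.get("message", "")
            else:
                # Older entries that are plain strings carry no sender info;
                # treat them as user text.
                role = "user"
                content = str(entry)
            messages.append({"role": role, "content": content})

        messages.append({"role": "user", "content": prompt})
        return messages

Then invoke() would pass messages=self.build_messages(prompt) to chat.completions.create() instead of prompt=full_prompt.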

Have we nudged you enough in the right direction?

I guess so. Just not the thing I wanted to do on a Sunday. This was all based on v0.28 of the client. I guess it seems more progressive, and I probably should have kept up and rolled an update. Instead, now we have a dead app site, with people screaming.

I can see the appeal as well… I’m guessing the payload costs more due to all the extra JSON stuff.

I mean, as a short-term fix, try changing the model to GPT-3.5-turbo-instruct. That’ll get you up and running and give you time to make the changes to ChatML.

Or run the old library until you fix it on a dev server?

Weird they didn’t let you know they were updating the library! It’s been through quite a few changes.

Good luck!

You’re just charged for the tokens in the content part…

            engine="text-davinci-003",

Just change to gpt-3.5-turbo-instruct…

They sent an email a while ago about text-davinci-003 going away. Weird that model is working at all?
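
On the token-counting side: if that Completion call only exists to feed your safety net, you could drop the API round trip entirely and count tokens locally with the tiktoken package. That's not something from your code, just a suggestion — a minimal sketch:

    import tiktoken

    def count_tokens(text, model="gpt-3.5-turbo"):
        # Count tokens locally using the tokenizer that matches the given model.
        encoding = tiktoken.encoding_for_model(model)
        return len(encoding.encode(text))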

It was working until 6am AEST *shrugs*. Well, I should check that… maybe it was falling back to my OpenRouter model; interesting point you made. Maybe I wouldn’t have noticed if this update wasn’t pushed. Ironically, a token count of 0 is less than the safety net, and if davinci was giving a bad response, it would have just tried the other model.

As for not telling me, they did mention a “security update”, but in my mind that wasn’t going to bring code-killing changes! :wink:
