Batching with ChatCompletion is not possible like it was with Completion

Hey,
Following the March 1st release of the ChatGPT API, I want to perform batching as explained in the OpenAI API docs (Example with batching).

As it currently stands, there seems to be no option to do so.
I am aiming to send multiple prompts and receive multiple answers, one for each prompt, as if they were in separate conversations.

e.g.

import openai

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {'role': 'user', 'content': 'this is prompt 1'},
        {'role': 'user', 'content': 'this is prompt 2'},
    ]
)

I get back:

...
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Sorry, there is no context or information provided for either prompt 1 or prompt 2. Can you please provide more information?",
        "role": "assistant"
      }
    }
  ]
...

However, I would like ChatGPT to look at every message separately, not as part of the same conversation.

Thanks

6 Likes

Hi @yotam.martin

What’s stopping you from making multiple async calls to the chat completion endpoint for respective prompts/conversations?
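For reference, here is a minimal sketch of that approach. It assumes the openai Python SDK ≥ 0.27 (which exposes the async ChatCompletion.acreate) and that OPENAI_API_KEY is set in the environment; the prompts and model are placeholders.

import asyncio
import openai

prompts = ['this is prompt 1', 'this is prompt 2']

async def ask(prompt):
    # Each prompt gets its own, independent conversation
    resp = await openai.ChatCompletion.acreate(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': prompt}],
    )
    return resp['choices'][0]['message']['content']

async def main():
    # Fire all requests concurrently; answers come back in prompt order
    return await asyncio.gather(*(ask(p) for p in prompts))

answers = asyncio.run(main())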

1 Like

Hi @sps
If we look at the Chat limits (Pay-as-you-go users, after 48 hours):
•3,500 RPM
•90,000 TPM*

For prompts shorter than 90,000 / 3,500 tokens, it makes sense to batch prompts together to make full use of the limit caps.
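To put rough numbers on that, a back-of-the-envelope sketch using the limits quoted above:

tpm_limit = 90_000   # tokens per minute
rpm_limit = 3_500    # requests per minute

# Below this average size per request, the RPM cap is hit before the TPM cap,
# so packing several prompts into one request uses the token budget more fully.
breakeven_tokens_per_request = tpm_limit / rpm_limit
print(breakeven_tokens_per_request)  # ≈ 25.7 tokens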

3 Likes

Batching won’t be feasible given how the chat completion endpoint works.

It would still eat into the TPM limit and could affect the quality and length of the conversations that have been batched into a single call.

6 Likes

Yes @sps, but OpenAI has made the hilariously bad product decision to steer all of its GPT users towards the ChatCompletion API.

At 1/10th the price, all GPT users should and will be writing their own best attempts at utility classes to get around ChatCompletion’s ugly abstraction. But batching (and n) warrant a better approach than “dispatch many requests asynchronously”.

4 Likes

Hi @yotam.martin, thanks for pointing this out. I’m suffering from the same issue at the moment, and here’s the workaround I discovered, which I hope will be helpful to you. My problem is using text completion to auto-code hundreds of thousands of responses (sentences or paragraphs) against a specific code, e.g. does this response describe ‘xxx’ or ‘yyy’? Previously, with the Completions endpoint, I could batch prompts like this:

openai.Completion.create(
  model="text-davinci-003",  # or whichever completion model you were using
  prompt=[
        'this is prompt 1',
        'this is prompt 2',
    ]
)

To make it work with ChatGPT, I adjusted it to:

  model="gpt-3.5-turbo",
  max_tokens=256,
  messages=[
        {'role': 'user', 'content': 'here is the background, and what i want to achive'},
        {'role': 'user', 'content': 'here are the xxxreponses list:
              1:sentence1.
              2:sentence2.
              3:sentence1.
              4:sentence2.
              5:sentence1.
              ....
              100:sentence2.'},
       {'role': 'user', 'content': 'Please determine whether each sentence relates to xxx. Your response should take relevant details from the background, the response, and the label. The output should only contain the sentence index number and the short answer yes or no.'}
    ]
)

The output is:

1. Yes
2. No
3. No
4. No
...
100. No

And that’s what I want! :upside_down_face:
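For anyone wanting to automate this workaround, here is a rough sketch of how the numbered block could be built and the yes/no answers parsed back out. The background text, model, and regex are illustrative assumptions, not taken from the post above, and an API key is assumed to be configured.

import re
import openai

sentences = ['sentence1', 'sentence2', 'sentence3']  # the responses to classify

# Build one numbered block so the model can answer per index
numbered = '\n'.join(f'{i + 1}: {s}' for i, s in enumerate(sentences))

resp = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    max_tokens=256,
    messages=[
        {'role': 'user', 'content': 'here is the background, and what I want to achieve'},
        {'role': 'user', 'content': f'here are the xxx responses list:\n{numbered}'},
        {'role': 'user', 'content': 'The output should only contain the sentence index number and the short answer yes or no.'},
    ],
)

# Parse lines like '1. Yes' / '2. No' into {index: bool}
text = resp['choices'][0]['message']['content']
labels = {
    int(m.group(1)): m.group(2).lower() == 'yes'
    for m in re.finditer(r'(\d+)[.:]\s*(yes|no)', text, re.IGNORECASE)
}
print(labels)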

6 Likes

Brilliant! I’ve been running classification tasks like this one by one. This is very helpful.

Is there really no clear drop-in replacement for batching in ChatCompletion?

2 Likes

I have tried using threading in Python. Sometimes it produces the results quickly, other times it stalls indefinitely.
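In case it helps, here is a sketch of the thread-pool variant with an explicit per-request timeout, so one stalled call doesn’t hold up collecting the rest. The 60-second timeout and worker count are arbitrary assumptions.

import concurrent.futures as cf
import openai

prompts = ['this is prompt 1', 'this is prompt 2']

def ask(prompt):
    resp = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': prompt}],
    )
    return resp['choices'][0]['message']['content']

with cf.ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(ask, p) for p in prompts]
    results = []
    for f in futures:
        try:
            # Stop waiting for this answer after 60 s; the worker thread itself
            # still runs to completion, but the loop moves on.
            results.append(f.result(timeout=60))
        except cf.TimeoutError:
            results.append(None)

print(results)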

Necro’ing this - looks like batching is not currently supported for chat endpoints? Doesn’t sound like anyone’s figured out a step that we’re missing?

I can’t really imagine a technical barrier to it when it’s used as a substitute for the original Completion, so it seems really weird. Still hoping someone comes up with a clean way to do it.

UPDATE: @yotam.martin here’s how you can do batching with ChatCompletion

5 Likes

@sps This seems like more of a workaround than a solution. Giving ChatGPT a prompt like: “Complete the following list of prompts and reply with a list of outputs” is very different than sending these as independent requests. I expect, for example, that if I sent a batch of stories in this format, each hundreds of words, and asked ChatGPT to complete them all, the stories would bleed together. This doesn’t seem very different from simply including these prompts as separate messages in a chat history.

This is a big deal for researchers like myself trying to study ChatGPT and GPT-4, and I expect for people building applications as well. If different prompts/completions contaminate each other, that affects my analysis of model behavior. For small experiments, I call the completions API with hundreds of unique, but relatively short prompts, and thousands of prompts with bigger experiments. It appears that I must call the ChatCompletions API hundreds or thousands of times, for what takes a single API call and 5-10 seconds in the Completions API. Maybe there are some tricks that I haven’t figured out, but in my initial testing, it looks like ChatCompletions takes longer than that to respond to a single prompt.

A huge +1 from me for batching to be added to the ChatCompletions API.

1 Like

Yes, it absolutely is a workaround. I have mentioned the possible limitations at the end as well.

You could try using reliableGPT for this - a Python package to handle batch calls to OpenAI.