Batching with ChatCompletion Endpoint

sps · April 3, 2023, 6:29pm

Explanation

The promptsArray contains all the prompts that will be processed in a batch, as individual elements of a string array.
The promptsArray is then converted to a string using JSON.stringify() and stored in stringifiedPromptsArray, which will be used as the content of the user’s message.
batchInstruction is a system message that directs the chat completion model to complete every prompt in stringifiedPromptsArray and return an array of completions.
The chat completion is obtained from the response and converted back into an array of strings using json.loads(). The individual completions can then be easily accessed from batchCompletion

Output:

['Hello world, from', 'How are you B', 'I am fine. W', 'The  fifth planet from the Sun is ']
ChatGPT: 
['Hello world, from Earth', 'How are you Bob', 'I am fine. What about you?', 'The fifth planet from the Sun is Jupiter']

Limitations

The max_tokens doesn’t control the max_tokens for individual prompts; instead max_tokens limits the total amount of tokens per request.
Length of one completion can influence other completions in the batch. If one completion is longer than expected, other completions may get truncated, even the array might not turn out to be valid.

Topic		Replies	Views
Batching with ChatCompletion not possible like it was in Completion API	17	21071	December 13, 2023
Batching prompts still being recommended despite not Documentation api	5	1648	January 23, 2024
Batch requests with Chat Completion API chatgpt	1	3383	November 29, 2023
Parallelise calls to the API - is it possible and how? API	13	34100	December 13, 2023
Efficient Processing of Multiple Complex Prompts with GPT-4 API gpt-4 , api , beginner-help	13	11455	November 7, 2023

Batching with ChatCompletion Endpoint

Explanation

Output:

Limitations

Related Topics