Is there a way to prevent the gpt-3.5-turbo API from returning content in chunks?

I have been testing the gpt-3.5-turbo API and my cURL requests all come back in chunks, with each work of the response content in a different chunk. Is there a way to have it all come in at once? Thanks

I’m not sure I understand. Is it not returning this, as the docs describe?

{
 'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
 'object': 'chat.completion',
 'created': 1677649420,
 'model': 'gpt-3.5-turbo',
 'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
 'choices': [
   {
    'message': {
      'role': 'assistant',
      'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers.'},
    'finish_reason': 'stop',
    'index': 0
   }
  ]
}

Edit: Looks like the stream option is it:

You need to check if the streaming setting is set to true

If you want it all at once, set the value to false

1 Like

Thanks, I’ve looked everywhere but can’t find the streaming option. Do you have any pointers where it is? Thanks!!

Reference:

OpenAI API: Chat, create

2 Likes