OpenAI thread run API incredibly slow

I am using the OpenAI thread run API, and it takes anywhere from 20-45 seconds to get a response. In the Playground, when I run the same prompt against an assistant with an identical config, it takes at most 5 seconds to generate the full response.

Assistant config:

{
  model: 'gpt-4-turbo',
  temperature: 0.8,
  top_p: 0.2,
}

Thread run method snippet:

  const run = await openai.beta.threads.runs.createAndPoll(thread.id, { assistant_id: assistant.id, max_prompt_tokens: maxPromptTokens })

  if (run.last_error) {
    throw new Error(run.last_error.message)
  }

  const messages = await openai.beta.threads.messages.list(thread.id, { run_id: run.id })
  const tokenUsageStats = run.usage

  const latestMessage = messages.data.pop()

  if (latestMessage.content[0].type === 'text') {
    const { text } = latestMessage.content[0]
    const { annotations } = text
    const citations = []

    let index = 0
    for (let annotation of annotations) {
      if (ignoreCitations) {
        text.value = text.value.replace(annotation.text, '')
      } else {
        const { file_citation } = annotation
        if (file_citation) {
          const citedFile = await openai.files.retrieve(file_citation.file_id)
          citations.push(`[${index}]${citedFile.filename}`)
        }
        index++
      }
    }

    return { response: text.value, tokenUsageStats, threadRun: run }
  }

I have also tried other models like gpt-3.5-turbo and gpt-4o, but it still takes that long.

Will interacting with the API always take this long, even though running the same thing through the Playground takes significantly less time?

traceroute

might be your friend…

Also, maybe it would be good to configure the SDK's HTTP agent with keepAlive: true to fight connection-setup latency?


@srijans

could you try connecting this way:

const axios = require('axios');
const https = require('https');

const agent = new https.Agent({
  keepAlive: true
});

const instance = axios.create({
  httpsAgent: agent
});

const betaHeaders = {
  'Authorization': `Bearer YOUR_API_KEY`,
  'Content-Type': 'application/json',
  'OpenAI-Beta': 'assistants=v2' // the Assistants endpoints require this beta header
};

// Create the run on the existing thread
let run = (await instance.post(`https://api.openai.com/v1/threads/${thread.id}/runs`, {
  assistant_id: assistant.id,
  max_prompt_tokens: maxPromptTokens
}, { headers: betaHeaders })).data;

// Poll until the run reaches a terminal status -- creating a run only
// queues it; the SDK's createAndPoll does this loop for you
while (run.status === 'queued' || run.status === 'in_progress') {
  await new Promise(resolve => setTimeout(resolve, 1000));
  run = (await instance.get(`https://api.openai.com/v1/threads/${thread.id}/runs/${run.id}`, {
    headers: betaHeaders
  })).data;
}

if (run.last_error) {
  throw new Error(run.last_error.message);
}

const messagesResponse = await instance.get(`https://api.openai.com/v1/threads/${thread.id}/messages`, {
  params: { run_id: run.id },
  headers: betaHeaders
});

const messages = messagesResponse.data;
const tokenUsageStats = run.usage;

const latestMessage = messages.data.pop();

if (latestMessage.content[0].type === 'text') {
  const { text } = latestMessage.content[0];
  const { annotations } = text;
  const citations = [];

  let index = 0;
  for (const annotation of annotations) {
    if (ignoreCitations) {
      text.value = text.value.replace(annotation.text, '');
    } else {
      const { file_citation } = annotation;
      if (file_citation) {
        const citedFileResponse = await instance.get(`https://api.openai.com/v1/files/${file_citation.file_id}`, {
          headers: { 'Authorization': `Bearer YOUR_API_KEY` }
        });
        const citedFile = citedFileResponse.data;
        citations.push(`[${index}]${citedFile.filename}`);
      }
      index++;
    }
  }

  return { response: text.value, tokenUsageStats, threadRun: run };
}

Where are you located, btw? I've found that connecting from my local machine in Germany is exceptionally slower (>10 seconds) than from a server located in the USA.

Maybe you can try spinning up a cloud instance over there and testing again.

Using a VPN might also be an option. It could be something your government is doing, e.g. they might want to know what you are doing with AI – am I getting too paranoid?
