Optimizing Assistant costs for my business's WhatsApp users

I am building an OpenAI Assistant that answers questions from my business's users through WhatsApp. I am using gpt-4-turbo, which is supposed to cost less than gpt-4, but I am seeing fairly high costs even though I am still only testing, and I don't know why.

My setup: I manage the assistant solely by its ID, I keep an independent thread per user (saved in Firebase, fetched if it exists or created if not), and I have a waiting queue so only one message reaches the chatbot at a time.

How can I reduce these costs? Do the runs have to be closed? From what I understand, a thread and a run close automatically, but I don't know whether closing them manually saves costs. If it does, I would like an example of how to close them once the chatbot has responded. I already use the fewest functions possible for the bot to work correctly.

The backend now has a way to view threads as well. If you enable it, you can see each thread and its cost (tokens in and out). I find that very helpful.

Do you mean the "view all threads" option? Where can that be found?

  1. Reduce the size of instructions
  2. Reduce the file sizes
  3. Reduce the function calling parameters
  4. Downgrade to GPT-3.5
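On point 4, note that you don't necessarily have to downgrade the whole assistant: run creation in the Assistants API accepts a `model` override, so you could route only some traffic to a cheaper model. A minimal sketch, assuming a hypothetical helper name:

```javascript
// Sketch: override the assistant's model per run so cheaper traffic can
// use GPT-3.5 while keeping the same assistant. buildRunParams is a
// hypothetical helper; the `model` override on run creation is real.
function buildRunParams(assistantId, useCheapModel) {
  const params = { assistant_id: assistantId };
  if (useCheapModel) {
    params.model = "gpt-3.5-turbo"; // per-run model override
  }
  return params;
}

// usage (inside an async function):
// const run = await openai.beta.threads.runs.create(
//   threadId,
//   buildRunParams("asst_...", true)
// );
```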

You don't "manually close" threads. I believe there is a misconception here about how the Assistants framework works.

I also operate a WhatsApp chatbot. One thing that works for me (if I understand you correctly) is:

This is a great solution. People in chat apps tend to send batches of messages like

“Hmmm…idk”
“I guess maybe this”
“Or that”

So great move. I have done the same (waiting a specified period and then grouping it all together).
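For reference, the grouping can be as simple as a per-user buffer with a quiet-period timer. This is a rough sketch; the flush delay and handler signature are assumptions about your setup:

```javascript
// Buffer incoming messages per user and flush them as one combined
// message after a quiet period (3 s by default).
function createBatcher(handle, delayMs = 3000) {
  const buffers = new Map(); // phoneNumber -> { texts, timer }
  return function push(phoneNumber, text) {
    let entry = buffers.get(phoneNumber);
    if (!entry) {
      entry = { texts: [], timer: null };
      buffers.set(phoneNumber, entry);
    }
    entry.texts.push(text);
    clearTimeout(entry.timer); // restart the quiet-period timer
    entry.timer = setTimeout(() => {
      buffers.delete(phoneNumber);
      handle(phoneNumber, entry.texts.join("\n")); // one grouped message
    }, delayMs);
  };
}

// usage: const push = createBatcher((phone, text) => bot(text, thread, phone));
```

This way a burst of three short messages costs one run instead of three.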

Other than that, and what has already been mentioned, there's not much left to do.

Would it be possible for you to help me check whether my code is well structured? This is the part where the questions arrive after the queue. I also found another problem: if in one chat I write "hello, my name is Steve", then in another chat with a totally different thread, the bot sometimes already knows my name ("Steve"), even though it was said in another chat with another thread.


const openai = new OpenAI({
  apiKey: process.env.APIOPENAI,
});

async function getAssistantResponse(userQuestion, thread, phoneNumber) {
  global.scrapeCourseDetails = scrapeCourseDetails;
  global.getThemes = getThemes;
  global.searchCourses = searchCourses;
  global.relatedCourse = relatedCourse;
  global.priceCourse = priceCourse;
  global.flagUserAsCallRequired = flagUserAsCallRequired;
  global.flagKanbanInterest = flagKanbanInterest;
  global.flagKanbanDiscussion = flagKanbanDiscussion;
  global.flagKanbanDesicion = flagKanbanDesicion;
  global.conversionCurrency = conversionCurrency;
  global.requiresIntervention = requiresIntervention;

  await openai.beta.threads.messages.create(thread, {
    role: "user",
    content: userQuestion,
  });

  const run = await openai.beta.threads.runs.create(thread, {
    assistant_id: "assistant id",
  });

  let runStatus = await openai.beta.threads.runs.retrieve(thread, run.id);

  console.log("ASSISTANT ID: " + run.assistant_id);

  while (runStatus.status !== "completed") {
    await new Promise((resolve) => setTimeout(resolve, 2000));
    runStatus = await openai.beta.threads.runs.retrieve(thread, run.id);

    while (runStatus.status === "in_progress") {
      console.log("Waiting for response");
      await new Promise((resolve) => setTimeout(resolve, 2000));
      runStatus = await openai.beta.threads.runs.retrieve(thread, run.id);
    }

    if (runStatus.status === "requires_action") {
      console.log(
        "Function required: " +
          JSON.stringify(
            runStatus.required_action.submit_tool_outputs.tool_calls[0].function
              .name
          )
      );

      const toolCalls =
        runStatus.required_action.submit_tool_outputs.tool_calls;
      const toolOutputs = [];

      for (const toolCall of toolCalls) {
        const functionName = toolCall.function.name;

        const args = JSON.parse(toolCall.function.arguments);
        // If the function is flagKanbanInterest, pass it the phoneNumber
        if (
          functionName === "flagKanbanInterest" ||
          "flagKanbanDiscussion" ||
          "flagUserAsCallRequired" ||
          "requiresIntervention"
        ) {
          args.phoneNumber = phoneNumber;
        }

        const output = await global[functionName].apply(null, [args]);

        toolOutputs.push({
          tool_call_id: toolCall.id,
          output: output,
        });
      }

      await openai.beta.threads.runs.submitToolOutputs(thread, run.id, {
        tool_outputs: toolOutputs,
      });
      continue;
    }
  }

  const messages = await openai.beta.threads.messages.list(thread);
  const lastMessageForRun = messages.data
    .filter(
      (message) => message.run_id === run.id && message.role === "assistant"
    )
    .pop();

  return lastMessageForRun.content[0].text.value;
}

async function bot(question, thread, phoneNumber) {
  try {
    const userQuestion = question;
    const response = await getAssistantResponse(
      userQuestion,
      thread,
      phoneNumber
    );
    return {
      response: response,
      availableBot: true,
    };
  } catch (error) {
    console.error(error);
  }
}
// bot test
module.exports = bot;


What are you using to host this?

A huge red flag for me is the global variables. Depending on what host you’re using this could leave behind artifacts and cause a huge pile-up of strange issues.

Ideally you want your function to not have side-effects caused by other functions/variables.

Based on your code it doesn't really seem to be the issue, but since I cannot see how `question` is created, I just figured that's where I would start.

For troubleshooting you can go to the playground and view the thread to see where the Assistant learned the name “Steve”.

https://platform.openai.com/playground?assistant={assistant_id}&mode=assistant&thread={thread_id}


Go to Settings → Organization, and under Threads make sure to pick the right option.

Then they will show in the main side bar where Assistants are also shown!

How do you create your threads? That would be my guess as to where this 'knowledge' leak happens: using the same thread twice. Once you start looking at the threads in the backend, I think you will solve that problem easily.
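To make sure a thread is never shared between users, the Firebase lookup can be written as a strict get-or-create keyed by phone number. A sketch with the store and thread creation injected, since I can't see your Firebase code (`store` and `createThread` are stand-ins for your wrapper and for `openai.beta.threads.create()`):

```javascript
// One thread per user: look up the thread id by phone number and only
// create a new thread when none exists. `store` stands in for the
// Firebase wrapper and `createThread` for openai.beta.threads.create();
// both are assumptions about your setup.
async function getOrCreateThread(store, phoneNumber, createThread) {
  const existing = await store.get(phoneNumber);
  if (existing) return existing; // same user -> same thread, never shared
  const thread = await createThread();
  await store.set(phoneNumber, thread.id);
  return thread.id;
}
```

If two users can ever map to the same key (e.g. a normalized vs. raw phone number), that would explain "Steve" leaking between chats.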

Oh and your function call check should be

if (functionName === "flagKanbanInterest" || functionName === "flagKanbanDiscussion" || functionName === "flagUserAsCallRequired" || functionName === "requiresIntervention")

(But you could also drop the whole check and simply always inject the phone number.)
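Either way, pulling that logic into a small helper keeps the tool-call loop readable. A sketch using the function names from the original code:

```javascript
// Functions that need the caller's phone number injected (names taken
// from the original code).
const NEEDS_PHONE = new Set([
  "flagKanbanInterest",
  "flagKanbanDiscussion",
  "flagUserAsCallRequired",
  "requiresIntervention",
]);

// Parse the tool-call arguments and inject the phone number only for
// the functions that need it.
function prepareArgs(functionName, rawArgs, phoneNumber) {
  const args = JSON.parse(rawArgs);
  if (NEEDS_PHONE.has(functionName)) {
    args.phoneNumber = phoneNumber;
  }
  return args;
}

// usage inside the loop:
// const args = prepareArgs(toolCall.function.name,
//                          toolCall.function.arguments, phoneNumber);
```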

Thanks for sharing.

Although it says that it's available in the API, I cannot see any documentation on it.

What code did you use?

This is in platform.openai.com, where you have the Assistants as well. Once you go to Settings and enable thread visibility, you can see them from the main sidebar menu.


Yeah, I see that; I was just wondering if you knew how to call it via the API. Thanks for showing it, though. I did not know.

I don't think there is a 'list threads' option at the moment. You can only retrieve a known thread. BUT, now that you can do it in the backend, I'm sure it won't be long before it shows up in the API?