Hello, I’m developing a web app that uses Assistants API v2 streaming with the Node SDK, and I need to add a thread truncation strategy by setting max_prompt_tokens and max_completion_tokens so I can manage token costs in my threads efficiently.
I’ve written a cloud function that runs a thread I’ve already created, but I can’t figure out where to add the truncation settings in my code:
const run = openai.beta.threads.runs.createAndStream(threadId, {
  assistant_id: assistantId,
});
for await (const event of run) {
  console.log("Event received:", event.event);
  if (event.event === "thread.message.delta") {
    console.log("Event thread.message.delta received:", event.data);
    const chunk = event.data.delta.content?.[0];
    ...
Do I need to add:

max_prompt_tokens?: number | null;
max_completion_tokens?: number | null;

after “assistant_id: assistantId,”, or should I set them earlier, when I add my message to the thread that I create beforehand and pass as the “threadId” parameter to my cloud function?
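In other words, I’m wondering whether something like the following would work. This is just a sketch of the run payload I think I need — the numeric values are placeholders, and I’m not certain that createAndStream accepts these run-level options or that truncation_strategy should be shaped this way:

```javascript
// Sketch of the run options I believe belong on the run, not the thread.
// All values below are placeholders, not recommendations.
const runParams = {
  assistant_id: "asst_placeholder",
  max_prompt_tokens: 4000,      // cap on tokens fed into the model per run
  max_completion_tokens: 1000,  // cap on tokens the run may generate
  truncation_strategy: {
    type: "last_messages",      // keep only the most recent messages
    last_messages: 10,          // placeholder count
  },
};

// I would then pass this object as the second argument:
// const run = openai.beta.threads.runs.createAndStream(threadId, runParams);
console.log(Object.keys(runParams).join(","));
```

If these options really are run-level, I assume I would not need to change anything when importing my message into the thread beforehand.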
Thank you for your help