How can I retrieve the earliest input and output data based on the existing assistant and thread ID?

Hello everyone, could you please let me know how to retrieve the earliest input and output data based on the existing assistant and thread ID?

I previously created an assistant and entered numerous QA pairs, resulting in nearly 4 million tokens in the entire thread. I want to see my earliest input, but the playground displays “Additional items available through the API.”

Could you guide me on how to write an API call to retrieve the earliest input and output results? Usually, it only shows User and Assistant.

Additionally, how is it possible for a single thread to handle over 4 million tokens? Isn’t GPT-4 limited to 128K tokens?

At what token total should a thread be cleared to maintain optimal performance?

Thank you.

Set the order parameter to asc so that the oldest messages are returned first.

  const threadMessages = await openai.beta.threads.messages.list(
    "thread_abc123",
    {
      order: "asc"
    }
  );

  console.log(threadMessages.data);

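As for the 4M tokens: the thread stores every message, but the model never sees all of them at once. That’s the advantage of the Assistants API. It manages the context for you, truncating older messages so that each run fits within the model’s context window.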
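
One way to picture "managing the context" is a thread that keeps everything while only the newest messages that fit a token budget are sent to the model. The sketch below is a toy illustration of that idea, not OpenAI's actual truncation strategy; the function name `fit_to_context` and the token counts are made up for the example.

```python
from typing import List, Tuple


def fit_to_context(messages: List[Tuple[str, int]], budget: int) -> List[Tuple[str, int]]:
    """Keep the newest messages whose combined token count fits the budget.

    Each message is a (text, token_count) pair, ordered oldest first.
    """
    kept: List[Tuple[str, int]] = []
    used = 0
    # Walk from the newest message backwards and stop at the first one that does not fit.
    for text, tokens in reversed(messages):
        if used + tokens > budget:
            break
        kept.append((text, tokens))
        used += tokens
    kept.reverse()  # restore oldest-first order
    return kept


# Hypothetical thread: 180K tokens stored, but only 128K may reach the model.
thread = [("q1", 60_000), ("a1", 50_000), ("q2", 40_000), ("a2", 30_000)]
window = fit_to_context(thread, budget=128_000)
print([text for text, _ in window])  # prints ['a1', 'q2', 'a2']
```

The full thread stays retrievable through the API even though the oldest message, q1, no longer fits in the window.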

Thank you for your prompt response, and the solution was surprisingly simple and effective. However, I still have a few questions:

  1. Do I only need to provide the thread ID, and not the assistant ID or API key?
  2. Is the code you provided executable in Python, or do I need to install any additional packages? Also, what does the variable “asc” represent?
  3. This is the code I asked GPT-4 for separately. Could you let me know if it’s okay?

import openai

# Read the API key from key.txt
with open('key.txt', 'r') as f:
    OPENAI_API_KEY = f.readline().strip()

# Initialize the OpenAI client
client = openai.OpenAI(api_key=OPENAI_API_KEY)

# Define the assistant and thread IDs
assistant_id = "your_assistant_id"  # replace with your assistant ID
thread_id = "your_thread_id"        # replace with your thread ID

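def get_earliest_message(client, thread_id):
    # Ask for the oldest message first. The default order is "desc" (newest first)
    # and one call returns only a single page, so messages[-1] is not reliable.
    messages = client.beta.threads.messages.list(
        thread_id=thread_id, order="asc", limit=1
    ).data

    # Check whether the thread has any messages
    if not messages:
        print("No messages found in the thread.")
        return None

    # With ascending order, the earliest message is the first item
    earliest_message = messages[0]

    return earliest_message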

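def get_earliest_assistant_response(client, thread_id):
    # List messages oldest first; iterating the result fetches further pages as needed
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="asc")

    # Find the first message with the assistant role
    # (messages are objects, so use msg.role rather than msg['role'])
    assistant_response = next((msg for msg in messages if msg.role == 'assistant'), None)

    return assistant_response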

# Get the earliest message
earliest_message = get_earliest_message(client, thread_id)

# Get the earliest assistant response
earliest_assistant_response = get_earliest_assistant_response(client, thread_id)

def text_of(message):
    # A message's content is a list of parts; keep only the text parts
    return "\n".join(part.text.value for part in message.content if part.type == "text")

# Write the results to text files
if earliest_message:
    with open('earliest_message.txt', 'w', encoding='utf-8') as f:
        f.write(f"Earliest message role: {earliest_message.role}\n")
        f.write(f"Earliest message content: {text_of(earliest_message)}\n")
    print("Earliest message has been written to earliest_message.txt")

if earliest_assistant_response:
    with open('earliest_assistant_response.txt', 'w', encoding='utf-8') as f:
        f.write(f"Earliest assistant response:\n{text_of(earliest_assistant_response)}\n")
    print("Earliest assistant response has been written to earliest_assistant_response.txt")

Thank you.

  1. You only need the thread ID to retrieve the messages. Threads are independent of assistants, so the assistant ID is not required. You do still need your OpenAI API key, as with any API call.

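  2. The snippet I posted is JavaScript (Node.js), not Python. OpenAI also has an official Python library; you can use it, and make sure it is up to date. “asc” is not a variable; it is a value meaning ascending. By default, messages are returned in “desc” (descending) order, which is why you see the newest messages first.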

  3. I don’t work with Python, so I cannot verify it.
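
For reference, a rough Python counterpart of the ascending-order call could look like the sketch below. It assumes a recent version of the official openai package, where messages.list accepts order and limit and the returned page can be iterated to fetch later pages automatically. The helper names (iter_messages_oldest_first, message_text, first_with_role) are made up for this example and are not part of the library.

```python
from typing import Any, Iterator, Optional


def iter_messages_oldest_first(client: Any, thread_id: str) -> Iterator[Any]:
    """Yield every message in the thread, oldest first."""
    # order="asc" sorts by creation time ascending; the API default is "desc".
    page = client.beta.threads.messages.list(thread_id=thread_id, order="asc", limit=100)
    # Iterating the result lets the SDK request further pages as needed.
    yield from page


def message_text(message: Any) -> str:
    """Join the text parts of a message, skipping non-text parts such as images."""
    return "\n".join(
        part.text.value for part in message.content if part.type == "text"
    )


def first_with_role(client: Any, thread_id: str, role: str) -> Optional[Any]:
    """Return the oldest message with the given role ("user" or "assistant"), or None."""
    return next(
        (m for m in iter_messages_oldest_first(client, thread_id) if m.role == role),
        None,
    )
```

To try it, create the client with `from openai import OpenAI` and `client = OpenAI()`, which reads the key from the OPENAI_API_KEY environment variable, then call `first_with_role(client, "thread_abc123", "user")` and pass the result to `message_text`. No assistant ID is involved at any point.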