It’s not a misconception that OpenAI pagination is based on the createdAt timestamp - this is stated directly in their API reference.
https://platform.openai.com/docs/api-reference/messages/listMessages
order (string, optional, defaults to desc): Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
You have alluded to a relational field that provides the internal order. It might very well exist, but the API reference does not expose it, and it certainly isn’t provided to the client for client-side sorting. It is reasonable to assume that if the ListMessages endpoint returns a list of messages, someone might want to process or sort that list. If it were returned as a linked list of some sort, that might be useful, but I don’t think it is. It’s returned as a simple array.
I’ve confirmed in the following experiment that the ListMessages endpoint does ‘seem’ to return the elements in the right order, so maybe there is an internal sorting field on the server. However, the only field we have for client-side sorting is createdAt, so as soon as someone sorts on it, elements often end up out of sequence when two or more share the same timestamp. I’ve replicated it with this code.
Note - each call uses the await operator, which waits for the API operation to complete before the next one starts. This is TCP/HTTP, not UDP fire-and-forget - if await did not correctly indicate the completion of an API call, it would be chaos. AddMessage is not like CreateRun, which spawns an external process that we have to monitor via further calls. As soon as the method call completes, the message exists in the thread with its final createdAt value and its final text content - we get the message id back from each call.
// 21 sequential calls; each message's content is the client-side tick count at send time.
for (var i = 0; i < 21; i++)
{
    await openAI.AddMessageToThread(DateTime.Now.Ticks.ToString(), this.OAI_ThreadId);
}
var allMessagesNew = await this.GetMessages();
// Client-side sort on the only ordering field the API exposes.
allMessagesNew.Data = allMessagesNew.Data.OrderBy(e => e.CreatedAt).ToList();
// Keep only the 21 messages added above.
var last21 = allMessagesNew.Data.TakeLast(21);
foreach (var msg in last21)
{
    Console.WriteLine($"{msg.CreatedAt} -> {msg.Content.First().Text.Value}");
}
Without this line:
allMessagesNew.Data = allMessagesNew.Data.OrderBy(e => e.CreatedAt).ToList();
the messages are in sequence when the client receives them. However, as soon as client-side code sorts them, we get out-of-sequence elements. In fact, once the messages are out of sequence, there is nothing we can order by that would get them back into the correct order - we would need to request them from the server again. Correct me if I am wrong - is there something the client can sort on besides CreatedAt?
ChatGPT helped analyse the output to find the out-of-sequence elements.
Basically, wherever ‘True’ is present, that row was out of sequence with the one preceding it. Arguably the one before it should be ‘True’ as well, but blame ChatGPT for that.
Index,CreatedAt (Unix Seconds),Metadata Timestamp (Ticks),Out of Sequence
0,2025-02-16 00:48:55+00:00,2025-02-16 00:48:55.686698+00:00,False
1,2025-02-16 00:48:56+00:00,2025-02-16 00:48:56.488032+00:00,False
2,2025-02-16 00:48:56+00:00,2025-02-16 00:48:56.078016+00:00,True
3,2025-02-16 00:48:57+00:00,2025-02-16 00:48:57.715402+00:00,False
4,2025-02-16 00:48:57+00:00,2025-02-16 00:48:57.217054+00:00,True
5,2025-02-16 00:48:57+00:00,2025-02-16 00:48:56.896362+00:00,True
6,2025-02-16 00:48:58+00:00,2025-02-16 00:48:58.535435+00:00,False
7,2025-02-16 00:48:58+00:00,2025-02-16 00:48:58.124789+00:00,True
8,2025-02-16 00:49:00+00:00,2025-02-16 00:48:58.944913+00:00,False
10,2025-02-16 00:49:01+00:00,2025-02-16 00:49:01.300892+00:00,False
9,2025-02-16 00:49:01+00:00,2025-02-16 00:49:01.709664+00:00,False
11,2025-02-16 00:49:02+00:00,2025-02-16 00:49:02.426264+00:00,False
12,2025-02-16 00:49:02+00:00,2025-02-16 00:49:02.119255+00:00,True
13,2025-02-16 00:49:03+00:00,2025-02-16 00:49:03.559056+00:00,False
14,2025-02-16 00:49:03+00:00,2025-02-16 00:49:03.149332+00:00,True
15,2025-02-16 00:49:03+00:00,2025-02-16 00:49:02.835359+00:00,True
16,2025-02-16 00:49:04+00:00,2025-02-16 00:49:04.577891+00:00,False
17,2025-02-16 00:49:04+00:00,2025-02-16 00:49:04.269784+00:00,True
18,2025-02-16 00:49:04+00:00,2025-02-16 00:49:03.861926+00:00,True
19,2025-02-16 00:49:05+00:00,2025-02-16 00:49:05.193953+00:00,False
20,2025-02-16 00:49:05+00:00,2025-02-16 00:49:04.894189+00:00,True
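For anyone who wants to reproduce the ‘Out of Sequence’ column without ChatGPT, here is a minimal sketch of the check as I understand it: a row is flagged when its client-side tick value is lower than the tick value of the row before it. SequenceCheck and FlagOutOfSequence are hypothetical names, and the input is assumed to be the tick values parsed from the message content, already in the order produced by the CreatedAt sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceCheck
{
    // Returns one flag per element. A flag is true when that element's tick value
    // is lower than the previous element's, i.e. it was sent earlier but sorted later.
    // The first element has no predecessor and is never flagged.
    public static List<bool> FlagOutOfSequence(IReadOnlyList<long> ticksInSortedOrder)
    {
        var flags = new List<bool>(ticksInSortedOrder.Count);
        for (var i = 0; i < ticksInSortedOrder.Count; i++)
        {
            flags.Add(i > 0 && ticksInSortedOrder[i] < ticksInSortedOrder[i - 1]);
        }
        return flags;
    }
}
```

Like the ChatGPT version, this only flags the later row of a swapped pair, not the row before it.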
I will concede that if I rely on the ordering provided by ListMessages, there is a good chance the order will be correct. However, I do need to do processing client side, which is vulnerable to the lack of a reliable indexing field. I suppose I could annotate the messages as soon as they are received from the ListMessages endpoint, preserving each one’s index in its metadata, and then sort on that metadata field. This is merely a workaround.
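A rough sketch of that workaround, under some assumptions: Msg is a hypothetical stand-in for the SDK’s message type (only CreatedAt mirrors a real API field), and the list is assumed to arrive in the default desc order, newest first. One thing that may be relevant: LINQ’s OrderBy is documented as a stable sort, so elements with equal keys keep their incoming relative order. If the incoming list is newest-first, an ascending sort on CreatedAt alone would leave ties in reversed order, which looks consistent with the pattern in the table above. Using the server position as a tiebreaker avoids that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the SDK's message type.
public record Msg(string Id, long CreatedAt, string Text);

public static class MessageOrdering
{
    // Assumes serverDesc is exactly as returned by the server with order=desc
    // (newest first). Captures each message's server position before any sorting,
    // then sorts ascending by CreatedAt and breaks ties by descending server
    // position, so messages sharing a timestamp come out oldest first.
    public static List<Msg> ToChronological(IReadOnlyList<Msg> serverDesc)
    {
        return serverDesc
            .Select((m, i) => (Msg: m, ServerIndex: i))
            .OrderBy(x => x.Msg.CreatedAt)
            .ThenByDescending(x => x.ServerIndex)
            .Select(x => x.Msg)
            .ToList();
    }
}
```

This only helps if the index is captured before the list is reordered for the first time. If the list was requested with order=asc instead, the tiebreaker would need to be ThenBy rather than ThenByDescending.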
You have asked why I am doing bulk additions to a thread.
For context - in my project I need to give the assistant contextual information that helps shape its response, but I do not want to store that persistently in the thread, because I don’t want it to contaminate the text displayed to the user or be exposed to the assistant in future runs. E.g. it contains miscellaneous context the user shouldn’t see, but which the assistant needs for ‘that’ run.
I call these ephemeral messages - I used to delete these messages after the run was completed. However it was risky to add these messages to the main thread because a system error or interruption could result in the messages being left inside the main thread and never cleaned out.
An example of an ephemeral message is a reminder to the assistant not to do certain things, or to format the data in specific ways. These reminder messages are unsightly and do not need to be repeated multiple times. They are most valuable right at the end of the message chain, so we add them at the end, run the thread, then delete the messages we don’t care about. This is how I ‘used’ to do it.
The new approach is to spawn a completely separate thread, give it a subset of the data from the main thread, run on that thread, transfer the information I need to the main thread, and then discard the throwaway thread. I still add the ephemeral messages, but I am less concerned about cleanup because I am discarding the whole thread.
At the end of the run on the throw away thread, I usually extract two messages from it and then transplant them into the main thread.
The two messages I usually end up adding to the main thread are the ‘user’ message that triggered the interaction (with some modifications) and the ‘response’ message the assistant provided in the throwaway thread.
If I add these to the main thread without an artificial delay, there is a good chance of timestamp duplication. If they get the same createdAt timestamp, then any further call I make to OrderBy(e => e.CreatedAt) runs the risk of putting the entries with duplicate timestamps out of order.
So while the order is correct when it comes from the server, it’s definitely a risk when sorting on the client, and there is nothing the API exposes to provide reliable ordering on the client side - unless I’ve missed it in the documentation somewhere.