@nikunj This problem is still unaddressed, two months later. The Assistant is still returning source tags that have no corresponding annotations.
In the examples below, I have redacted (with [...redacted...]) parts of the content to protect my company’s confidential information, but you can use the thread_id and message_id to locate this data internally to help you debug.
In a recent case, the source tag 【7†source】 appeared in the output and the annotations field was empty, like this:
{
  "assistant_id": "asst_o5CHcKj8uZSNg567sjRs2E6B",
  "content": [
    {
      "text": {
        "annotations": [],
        "value": "[...redacted...] \u30107\u2020source\u3011."
      },
      "type": "text"
    }
  ],
  "created_at": 1704743170,
  "file_ids": [],
  "id": "msg_XREhks8a35w2OQ5UmMPZ1orn",
  "metadata": {},
  "object": "thread.message",
  "role": "assistant",
  "run_id": "run_wt8cYaz9IU3mdatBY1jfjnc2",
  "thread_id": "thread_6ajZAX71o06kh6DUtsAS622x"
}
In another case, several source tags appeared in the output, but the annotations field contained annotations for only some of them. The content contains the source tags 【9†source】, 【10†source】, 【11†source】, 【12†source】, and 【13†source】, but there is no annotation for 【13†source】. Also, two different annotations appear for 【10†source】, with the same quote but different start_index and end_index:
{
  "assistant_id": "asst_o5CHcKj8uZSNg567sjRs2E6B",
  "content": [
    {
      "text": {
        "annotations": [
          {
            "end_index": 377,
            "file_citation": {
              "file_id": "file-WDCeoz9qzoqqYkudP4InRiNc",
              "quote": "Simulating [...redacted...] code"
            },
            "start_index": 366,
            "text": "\u301011\u2020source\u3011",
            "type": "file_citation"
          },
          {
            "end_index": 751,
            "file_citation": {
              "file_id": "file-WDCeoz9qzoqqYkudP4InRiNc",
              "quote": "Introduction [...redacted...] problem"
            },
            "start_index": 740,
            "text": "\u301010\u2020source\u3011",
            "type": "file_citation"
          },
          {
            "end_index": 1065,
            "file_citation": {
              "file_id": "file-WDCeoz9qzoqqYkudP4InRiNc",
              "quote": "What's [...redacted...] n"
            },
            "start_index": 1054,
            "text": "\u301012\u2020source\u3011",
            "type": "file_citation"
          },
          {
            "end_index": 1625,
            "file_citation": {
              "file_id": "file-WDCeoz9qzoqqYkudP4InRiNc",
              "quote": "Post [...redacted...] option"
            },
            "start_index": 1615,
            "text": "\u30109\u2020source\u3011",
            "type": "file_citation"
          },
          {
            "end_index": 2237,
            "file_citation": {
              "file_id": "file-WDCeoz9qzoqqYkudP4InRiNc",
              "quote": "Introduction [...redacted...] problem"
            },
            "start_index": 2226,
            "text": "\u301010\u2020source\u3011",
            "type": "file_citation"
          }
        ],
        "value": "To test [...redacted...] campaigns\u301011\u2020source\u3011.\n\n [...redacted...] group\u301010\u2020source\u3011.\n\n [...redacted...] code\u301012\u2020source\u3011.\n\n [...redacted...] option\u301013\u2020source\u3011.\n\n [...redacted...] donations\u30109\u2020source\u3011. \n\n [...redacted...] done\u301010\u2020source\u3011. [...redacted...] methodologies."
      },
      "type": "text"
    }
  ],
  "created_at": 1704743379,
  "file_ids": [],
  "id": "msg_68nMqZJVUMPu6UD3Q0H0Wph4",
  "metadata": {},
  "object": "thread.message",
  "role": "assistant",
  "run_id": "run_7A0ICERIl9SWBca1OgbIDidy",
  "thread_id": "thread_nrrXCqnFZEfJDxVngiuyelKV"
}
This looks pretty wrong to me. Please investigate this problem and let us know what you find. Thank you!
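In case it helps anyone reproduce this, here is a minimal sketch of how messages like the ones above could be checked for source tags that lack an annotation. It assumes the message has already been loaded as a plain Python dict with the same shape as the JSON above; the function name `unannotated_tags` is my own and not part of any SDK. It compares tags by their text rather than by start_index/end_index, since the indices refer to the unredacted content.

```python
import re
from collections import Counter

# Matches source tags such as 【7†source】 (U+3010, digits, U+2020, "source", U+3011).
SOURCE_TAG = re.compile(r"\u3010\d+\u2020source\u3011")


def unannotated_tags(message: dict) -> dict:
    """Return {tag: count} for source tags that appear in the text more
    often than they appear in the annotations list."""
    missing = Counter()
    for part in message.get("content", []):
        if part.get("type") != "text":
            continue
        text = part["text"]
        in_value = Counter(SOURCE_TAG.findall(text["value"]))
        in_annotations = Counter(a["text"] for a in text.get("annotations", []))
        # Counter subtraction keeps only tags with a positive remainder.
        missing += in_value - in_annotations
    return dict(missing)


# Trimmed-down copy of the first message above (only the fields the check needs).
example = {
    "content": [
        {
            "text": {
                "annotations": [],
                "value": "[...redacted...] \u30107\u2020source\u3011.",
            },
            "type": "text",
        }
    ]
}

print(unannotated_tags(example))
```

For the first message this reports one unannotated 【7†source】 tag. Run against the second message, the same check should report only 【13†source】, since 【10†source】 occurs twice in the text and has two annotations.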