Assistants seems to struggle citing multiple sources with Retrieval

evankozliner · November 26, 2023, 3:45am

Hey everyone

I’m testing out assistants for an app idea I had, but I’m struggling to have it cite multiple sources even when prompted. Instead it always seems to cite a single source, multiple times.

I’ve been testing with some copy/pasted articles off the internet.

I’ve uploaded 2 files, both about a recent story involving Derek Chauvin.

Instructions: You are journalist gpt. You summarize news into single paragraph briefs and cite your sources.

Model: gpt-4-1106-preview

Retrieval (check) - 2 articles (I could copy the text if that is helpful but I uploaded them both and explictly hit “save” on the UI)

Prompt: “Summarize recent articles about Derek Chauvin. Cite multiple sources if possible”

Have others had any luck getting gpt to cite multiple sources? Any model

Thanks!

Update: I peaked into the API response via the logs and I do see them both in there actually - i guess it just doesn’t always print them out!

Foxalabs · November 26, 2023, 3:54am

Hi and welcome to the Developer Forum!

Could you post a code snippet of your API calls and any setup/post processing code you have?

jr.2509 · November 26, 2023, 7:10am

Hi Evan - Just a thought from a non-technical perspective if still relevant: I would also expand your instructions and clarify that you expect the assistant to draw on multiple files for the task, you may even want to explicitly reference the file names in the instructions. I know you do this with the prompt already but the instructions can have quite a material impact on how your assistant executes tasks.

evankozliner · November 26, 2023, 5:12pm

Oh I was just using the playground - no code involved yet.

evankozliner · November 26, 2023, 5:13pm

Nice, I will clarify that. It actually does seem to draw on multiple files, it’s just not clear where in the files it’s drawing from. Will post another question soon haha

paul.fishwick · November 26, 2023, 5:20pm

If by “multiple sources”, you mean for the bot to cite sources not present in the two uploaded files, you might try accessing the web from the chat session. The LLM by itself is not likely to provide any sources. I also tried Tavily, and this does a good job by using web search + GPT summarization.

evankozliner · November 26, 2023, 5:25pm

Neat I’ll try Tavily! I posted a new question titled [Overcoming many small files using Assistants Retrieval ] (can’t post links here)

paul.fishwick · November 26, 2023, 5:57pm

Try this - it worked for me. Requires the OPENAI KEY and a TAVILY KEY (free for up to 1000 requests): How to build an OpenAI Assistant with Internet browsing | by Assaf Elovic | Nov, 2023 | Medium

Topic		Replies	Views
Overcoming many small files using Assistants Retrieval API assistants	2	1607	November 26, 2023
Assistants with both "retrieval" and "function" API assistants-api	23	7648	February 27, 2024
How do I force the assistant to read all the content in the file being used for retrieval API api , rag , assistants-api	1	3570	December 5, 2023
Mapping assistants API annotations back to the location in the source file API assistants , assistants-api	5	2942	September 20, 2024
New "Assistants" API a potential replacement for low level "RAG" style content generation? API	9	8604	March 4, 2024

Assistants seems to struggle citing multiple sources with Retrieval

Related topics