Why does my assistant find the right answer from file on Playground but not via API?

fyodor · November 10, 2023, 5:35am

I uploaded a file to the assistant. I tried a question in the Playground and got the right answer. I tried the same question via the API and it doesn’t find the answer. The file ID was passed with the message.

Also, why do we need 10 steps when we had 1 step before with chat completions? Is there really a need to create a thread, add a message to it, run the thread, retrieve the run, and list the messages of the thread, instead of simply calling an assistant like you would a chat completion?

This seems like a spectacular amount of conceptual and practical overhead for something that could be incredibly simple.

I came to Assistants to move away from Pinecone and doing my own embeddings but so far Assistants seem the more complicated process to use.

jochenschultz · November 10, 2023, 5:52am

These are called agents.
I guess things like chain of thought are built in there.

You can think of it like thinking. When I ask you how to get me some tomatoes from a supermarket, you would probably answer: drive to the supermarket, buy them, and come back.

But what you really want is:

check if you have money
dress yourself up
check if you got your keys…
etc…

Pinecone can be used in that process, e.g. to identify subprocesses and then fetch the pre-saved knowledge graph on how to do them from a graph DB.

Pure similarity search is not really what you need from an assistant.

fyodor · November 10, 2023, 6:04am

They should have called them agents, which is what literally everyone else calls them. I guess a wasted half-day is not too bad. Back to Pinecone and standard RAG for me.

If smart people didn’t have the natural tendency they have to turn a 1 step process into a 10 step process, and to make a complicated thing instead of a simple thing, we would be much further ahead. It feels like they got people from AWS to design the new API endpoints and architecture, whereas before we had simple, easily understandable and implementable calls.

Agents do not work well. They are immature. RAG works well, and there is no simple plug-and-play implementation of RAG at developer pricing: somewhere where you upload your files, and simply call your LLM with the file attached and get an answer.

Do you reckon others need it? I don’t mean chatbase and so on. I mean a dev endpoint with dev endpoint pricing that does that. It maybe accepts a couple of optional params like how many chunks you want to provide to the LLM and the rest is automated, like pdf.ai for devs.

Thoughts?

jochenschultz · November 10, 2023, 6:15am

I mean it was developer day and you have to use the marketing train as long as it runs…

fyodor · November 10, 2023, 6:16am

It seems to have run 4 days for me.

Anyway, what have the Romans ever done for us? https://www.youtube.com/watch?v=Qc7HmhrgTuQ

pondin6666 · November 10, 2023, 7:11am

In my experience, using the Assistants API with retrieval did work.

But in my test case, the uploaded file_id was bound to the assistant object, not passed with a thread message for a newly created run.
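As a rough sketch of what "bound to the assistant object" can look like with the `openai` Python SDK as it stood in November 2023 (v1 beta, `retrieval` tool): the function name, the default model, the instructions text, and the `manual.pdf` file name are my own assumptions for illustration, and parameter names may differ in later API versions. The client is passed in rather than constructed here.

```python
def create_retrieval_assistant(client, file_obj, model="gpt-4-1106-preview"):
    """Upload a file and attach it to a new assistant (not to a single message).

    `client` is assumed to be an `openai.OpenAI()` instance and `file_obj`
    a file opened in binary mode. The model name is only an example.
    """
    # purpose="assistants" marks the upload as usable by assistant tools.
    uploaded = client.files.create(file=file_obj, purpose="assistants")

    # The file id goes on the assistant itself, together with the retrieval
    # tool, so every run of this assistant can search the file.
    return client.beta.assistants.create(
        model=model,
        instructions="Answer questions using the attached file.",
        tools=[{"type": "retrieval"}],
        file_ids=[uploaded.id],
    )

# Usage (needs the `openai` package and an API key; "manual.pdf" is hypothetical):
#   from openai import OpenAI
#   with open("manual.pdf", "rb") as f:
#       assistant = create_retrieval_assistant(OpenAI(), f)
```

If the Playground works and the API call does not, comparing where the file is attached in each case (assistant vs. message) is a reasonable first check.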

An assistant seems to be a predefined service unit equipped with suitable tools, and a thread seems to be a chat session. They do not belong to each other; they are independent.

You can create several assistants for different purposes, and more than one can be used in a single thread.
For example, create a run in an existing thread with assistant A, append new user input, then create a new run with another assistant.
And of course, an assistant can be used in different threads.

This is convenient because there is no need to handle chat history now: just keep using the same thread_id for a user’s ongoing chat session, and OpenAI will take care of the session context for us.
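A minimal sketch of that idea, assuming you keep your own mapping from user to thread: `thread_for_user` and the `sessions` dict are hypothetical names of mine, and `client` is assumed to be an `openai.OpenAI()` instance. Only `client.beta.threads.create()` is an SDK call.

```python
def thread_for_user(client, sessions, user_id):
    """Return the thread id for this user, creating a thread on first contact.

    `sessions` is any dict-like store (in memory here; a database in practice).
    """
    if user_id not in sessions:
        # One thread per chat session; the server keeps its message history.
        sessions[user_id] = client.beta.threads.create().id
    return sessions[user_id]
```

Later turns for the same user then reuse the stored id instead of resending the whole conversation.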

I think OpenAI might provide better syntactic sugar in the future, such as thread.run_with_new_input_and_wait_complete_and_return_response or thread.run_with_new_input_and_return_streamed_object, or you can find some tools that do it for you.
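Until something like that exists, a small wrapper of your own can hide the steps. This is only a sketch against the v1 beta Python SDK as of November 2023: `ask_and_wait`, `poll_seconds`, and `timeout` are names I made up, `client` is assumed to be an `openai.OpenAI()` instance, and runs that need tool outputs (`requires_action`) are not handled.

```python
import time

def ask_and_wait(client, thread_id, assistant_id, question,
                 poll_seconds=1.0, timeout=120.0):
    """Add a user message, run the assistant, poll until done, return the reply text."""
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=question
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )

    # Poll the run until it leaves the waiting states or we give up.
    deadline = time.monotonic() + timeout
    while run.status in ("queued", "in_progress"):
        if time.monotonic() > deadline:
            raise TimeoutError("run did not finish in time")
        time.sleep(poll_seconds)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)

    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status!r}")

    # Messages are listed newest first by default, so index 0 is the reply.
    newest = client.beta.threads.messages.list(thread_id=thread_id).data[0]
    return "".join(part.text.value for part in newest.content if part.type == "text")
```

Because the assistant id is an argument, the same thread can be answered by different assistants on different turns.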

If you don’t need the reusable nature of the Assistants API and the automatically managed chat session context of threads, you can use the old completions API.

Of course, the new Assistants API is far from mature and has some bugs right now. For example, CSV and DOCX files are not supported for retrieval, and when I uploaded a somewhat larger JSON file, retrieval failed to answer in time; it worked well after I significantly reduced the amount of content in the file.
But anyway, the new Assistants API can do what the Playground does, and for those who haven’t invested much in integration tools, the new API is more convenient and simple.

That’s my experience, FYI.

dillonSP · December 8, 2023, 4:10am

Hey Fyodor, I share your belief in RAG! Since you’ve so clearly described what we’ve built, I’d like to point you towards Superpowered AI, an API for retrieval augmented generation (using OpenAI models).
