Playground doesn't work properly?!

Hi all, I have an Assistant in the OpenAI Playground with the Retrieval tool and 5 files linked as a knowledge base.
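For context, the setup is roughly equivalent to this via the beta Assistants API as it looks right now (the file names and instructions below are just placeholders; I actually configured everything through the Playground UI):

```python
from openai import OpenAI

client = OpenAI()

# Upload the knowledge-base files (5 in my case; names here are placeholders)
file_ids = []
for path in ["faq.md", "pricing.md", "policies.md", "setup_guide.md", "troubleshooting.md"]:
    f = client.files.create(file=open(path, "rb"), purpose="assistants")
    file_ids.append(f.id)

# Create the Assistant with the Retrieval tool and the files attached
assistant = client.beta.assistants.create(
    name="Support Assistant",
    instructions="Answer only from the attached knowledge base.",  # placeholder
    model="gpt-4-turbo-preview",
    tools=[{"type": "retrieval"}],
    file_ids=file_ids,
)
print(assistant.id)
```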

Over the last few days I tested it and the answers were always correct. It never hallucinated or gave wrong answers, and I'd consider that the norm.

Today I'm noticing very low-quality outputs; sometimes it doesn't find the info in the knowledge base, which is well structured and not large.
I tried GPT-4 Turbo and got the same output for the same questions.

So at the moment I'd say it's impossible to trust an Assistant if the quality of the outputs differs day by day.

If I build the Assistant's code myself, it's obvious that the output quality depends on me. But if I'm testing the Assistant in the Playground and it doesn't retrieve answers that are contained in the KB, I start to think it's not very effective at the moment.

What do you think?

I had a similar situation yesterday when using gpt-3.5-turbo-1106 through the Playground. My prompt instructions were followed much worse than directly through the API with the same parameters (at the same time).
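In case it helps to reproduce, the direct call was roughly like this (the prompt and sampling parameters here are just placeholders; the point is they matched what I had set in the Playground):

```python
from openai import OpenAI

client = OpenAI()

# Same model and sampling parameters as the Playground session
response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[
        {"role": "system", "content": "Follow the formatting rules below exactly."},  # placeholder
        {"role": "user", "content": "Same test prompt as used in the Playground."},   # placeholder
    ],
    temperature=1.0,
    top_p=1.0,
    max_tokens=512,
)
print(response.choices[0].message.content)
```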

It sounds like this may just be the model used. You can try non-preview models for a bit more reliability.

Assistants are your friend. We love assistants.

For me the problem in the Playground is with Chat, not Assistants.

And though I agree there are no production-ready agent frameworks yet, that's off topic :slight_smile:

Aha, but the topic was about assistants, so Uno reverse card on you!

:arrows_clockwise:

The preview models are in flux, but there’s no good explanation I can come up with for that difference unless they are deploying new revisions of the model slowly or experimentally.


The Assistants are in beta, so we could say that every bug is possible and expected.

@_j Thank you for your answer. I'd been thinking about that for days, and reading it confirms for me that, at the moment, it's impossible to build a business around the Assistants API. I also had the thought that people spending on the API essentially to test the Assistants feature is something OpenAI might appreciate.

Maybe it will be possible in a few weeks, who knows, but at the moment it's not feasible.

What do you suggest at the moment for a customer support bot?

I've already built chatbots integrated with LangChain with nice results (better than the Assistant at the moment), but that's not a suitable production solution in my case, because there building a good knowledge base is crucial, and I only obtained one by writing an endless series of FAQs to cover any potential question or wording from the user.

If you, as an expert user, have any tips to suggest, they would be highly appreciated :slight_smile:

For customer support, there are a couple of different categories an AI might answer from, which draw on your company knowledge. Segmenting those into "sales" or "tech support" domains could give you better expert AIs.

The Assistant might have the ability to work with large uploads, where you don't pay for embeddings up front, but it doesn't know your data and can't be optimized for the quality you can achieve with your own data pre-processing.

I’ll let you ponder another technique: indexed data, table of contents, pages, traditional search. Via functions.

Imagine a first-tier function that only provides a list of your knowledge categories. Very precise and has all sections an AI might explore. A site map.

Then another that lists the documents in that category. Like "Browse with Bing"'s first search step, but your documents and their descriptions, with nothing omitted.

Then finally, one to retrieve documents, which can accept an array of documents to fetch at the same time.

If you're familiar with 30-year-old tech (like text-mode chat…), Gopher (vs. the unstructured hypertext of the web) gave this kind of hierarchy.

This also gives the AI's chat history high-quality nesting, informing the AI of exactly what it is looking at. It seems much better than a smattering of context-less chunks injected into the AI as "relevant information" when they might not be that relevant at all.
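A rough sketch of those three tiers as tool definitions for the chat completions API (the names and schemas are just one way to cut it; you'd dispatch the calls against your own index):

```python
# Three-tier "site map" browsing exposed as tools/functions (illustrative names)
tools = [
    {
        "type": "function",
        "function": {
            "name": "list_knowledge_categories",
            "description": "Return the full list of knowledge-base categories (the site map). Takes no parameters.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "list_documents",
            "description": "Return every document in one category, each with a one-line description.",
            "parameters": {
                "type": "object",
                "properties": {
                    "category": {
                        "type": "string",
                        "description": "A category name from list_knowledge_categories",
                    },
                },
                "required": ["category"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_documents",
            "description": "Retrieve the full text of one or more documents by ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "document_ids": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["document_ids"],
            },
        },
    },
]

# Pass tools=tools to client.chat.completions.create(...) and handle the
# tool calls yourself, returning categories, document lists, or document text.
```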


Hey @_j and @TonyAIChamp, hope you're doing well, guys!

Did you notice any improvement in the performance of the OpenAI Assistant?
Since my last message, I ended up building a custom RAG solution with LangChain and Qdrant, and it works well enough, but naturally the concept of the OpenAI Assistant, with the instructions included, could be a game changer if it's efficient.
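In case it's useful to anyone, the core of that RAG setup looks roughly like this (the collection name, chunk texts, and k are placeholders, and LangChain's import paths tend to move between versions):

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Qdrant

# Build an in-memory Qdrant collection from pre-chunked KB texts (placeholders)
texts = [
    "FAQ: How do refunds work? ...",
    "Guide: Resetting your password ...",
]
vectorstore = Qdrant.from_texts(
    texts,
    OpenAIEmbeddings(),
    location=":memory:",          # swap for a real Qdrant URL in production
    collection_name="support_kb",
)

# Retrieve the top-k chunks for a user question, then pass them to the model as context
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
docs = retriever.get_relevant_documents("How do I reset my password?")
context = "\n\n".join(d.page_content for d in docs)
```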

I was expecting this thread to be about the recent UI changes to Playground, which kinda break mobile… and now the Presets don’t even show up :frowning:

Has anyone else noticed this?
I sent this email to support@OpenAI.com today (16 Mar 2024):

The OpenAI Playground’s UI is now broken in mobile.

Playground is awesome, but RECENT CHANGES to USER INTERFACE are terrible in mobile! I now can’t see my Presets anymore (unless I pivot my phone to Landscape), and also a lot of the time scrolling vertically doesn’t work like it used to (depending on where the focus is – it now feels unpredictable, unlike before when it was perfectly intuitive). I love the Playground, and since Spring of 2023 I have been using it hundreds of times per week; please consider reverting some of the UI changes until it works in mobile the way it used to (aka PERFECTLY).

PS: hopefully it is just some CSS that can be tweaked so Portrait and Landscape work the same re. visibility of the Presets dropdown (because I select inside that dropdown A LOT, but rarely select inside the Chat dropdown!)

Thanks.

Playground has always been a bit of a bonus in my eyes (not officially!)

The way I see it, personally, I'd rather they work on the models and the API and keep the network up than flesh out tools like the Playground. They likely spend a LOT more time on ChatGPT…

So, while we might see Playground changes in the future, I’m not sure it’s a top priority.

Have you thought about building your own version of the playground?

Is it a CSS issue on Mobile or just bad web design?

You’ve reached out to support, so that’s about the best you can do.