Biggest pains with LLM agents (Assistants API, Autogen, etc)

Hey Champs

I am curious: what do you consider the biggest pain when using LLM agent frameworks like the Assistants API, Autogen, LangChain, SuperAGI, etc.?

1 Like

For me it's understanding how to implement it. I'm not from a developer background and I do my best, but this stuff isn't exactly user-friendly.

Tools like GPT-4 can help me get things working, but doing anything advanced requires a bit more coding knowledge and experience.

1 Like

Interesting. So you'd like some kind of visual interface to create agents, right?

1 Like

That would certainly make things easier, specifically the RAG side of things. I can call the model no problem, but I haven't even attempted any RAG or vector-database work because of the perceived difficulty.

1 Like

Going a bit off-topic, but you are correct to consider RAG difficult. The funny thing is that one can come away with a feeling of its simplicity after watching some popular YouTube videos, but they only show the very basics. Real applications are much more complex, because retrieval in general is still not a fully solved problem.
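
Just to illustrate the gap: the level those videos usually reach is roughly the sketch below (embed the question, grab the nearest chunks by cosine similarity, paste them into the prompt). The model names are placeholders, and a real pipeline adds chunking strategy, re-ranking, metadata filtering, evaluation, and so on.

```python
# Minimal "tutorial-level" RAG, assuming chunk embeddings were precomputed.
# Model names are assumptions; this is an illustration, not production code.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(question: str, chunks: list[str], chunk_vectors: np.ndarray) -> str:
    resp = client.embeddings.create(model="text-embedding-3-small", input=question)
    q_vec = np.array(resp.data[0].embedding)
    # Cosine similarity of the question against every stored chunk.
    sims = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n\n".join(chunks[i] for i in np.argsort(sims)[::-1][:3])
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```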

3 Likes

@TonyAIChamp

Speaking from a no-code/low-code background:

  • Every time I see the terms low/no code I cringe because what that means to a dev and what it means to an actual low/no code person is very different

  • Assistants in the Playground is easy enough to use. Putting it into production on a website is the hard part. Charging $40/month for 5 bots is taking the mickey

  • LLMs don't seem to do any real reasoning; they just remix training data. Most of the code-related material prior to the training cut-off dates was written by devs, so when a low-code/no-code person decides to play, it's often a painful experience, as the training data includes little to nothing from low-code/no-code types on how to get stuff done.

  • LangChain… painful, mainly because the last time I used it was Oct/Nov, and back then Bard/GPT-4 hadn't been trained on it, so they didn't offer much in the way of working solutions. Rather than figure out what a blob is, my solution was to just copy/paste the underlying libraries into a Python script, and eventually things worked. This was for basic stuff: the course shows how to transcribe a YouTube video, so hey, let me transcribe a local Teams recording. At one point uploading the recording to YouTube and transcribing it there was the less painful option, but I think I eventually figured out how to get it to do local files as well (see the transcription sketch after this list).

  • You mention visual interfaces. I first used a webpage builder circa 2005, and I quickly learnt that Microsoft's UI/UX is good enough that most people can figure out a lot of Word/Excel/PowerPoint just by clicking, seeing what happens, and learning by doing. Many UX/UI experiences are like the webpage builders of old

  • Even "no code" sources like the Marketplace in Google Cloud can be problematic. Take Stable Diffusion. The Marketplace had 2 main "repos". One was for Automatic1111, and I forget the name of the other. Deployment is easy enough… click… click… click… hold on… there aren't any low-end graphics cards available, and the one-click install is written for only one card type. Eventually I worked out that if I downloaded the "repo" I could probably edit it to include a list of cards rather than just the one type (T4s, I think).
    The other repo deploys via a VM, and the environment was slow and clunky enough that I moved on to something else. In the end it was easier to run SD locally and wait 10-30 mins for my images, lol.
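
On the transcription point above: for anyone stuck on the same thing, something along these lines handles a local recording (a rough sketch, assuming the open-source Whisper package with ffmpeg on the PATH; the file name is made up):

```python
# pip install openai-whisper  (also requires ffmpeg installed on the system)
import whisper

model = whisper.load_model("base")                 # small model, runs on CPU
result = model.transcribe("teams_recording.mp4")   # hypothetical local file
print(result["text"])                              # full transcript as plain text
```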

1 Like

Thank you for the detailed answer, Tim, and welcome to the forum!

Could you elaborate?

I agree, but that's the thing. It is not yet meant for production. It is still in beta.

To my understanding, one of the main things agents were built around is the fact that an LLM generally gives better answers after delving deeper into the question. That is why we've seen chain-of-thought work so well, and agents were basically automation of things like chain of thought. Now we can see the limitations of this approach, but at the time they showed a great improvement in communication with the models (though highly inefficient in many cases).
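
To make that concrete, the core of most agent frameworks boils down to a loop like the sketch below: call the model, feed its own reasoning back in, and repeat until it commits to an answer. This is not any particular framework's code; the model name, prompts, and stop marker are assumptions.

```python
from openai import OpenAI

client = OpenAI()

def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content": "Think step by step. When you have the final answer, start the line with ANSWER:"},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = client.chat.completions.create(model="gpt-4", messages=messages)
        text = reply.choices[0].message.content
        if "ANSWER:" in text:
            return text.split("ANSWER:", 1)[1].strip()
        # Feed the model's own reasoning back and ask it to keep going;
        # this is where the token cost (and the inefficiency) comes from.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue reasoning toward the answer."})
    return text  # give up after max_steps
```

Every iteration re-sends the growing message list, which is exactly why these loops get expensive.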

We are steering a bit away from the topic, but to your point about no-code/low-code: I personally don't believe in such solutions early on in any field. I mean, they may be created, but they won't be as popular and widespread as, say, no-code website builders are now, simply because there is not yet enough expertise on the market to move to that stage.

@curt.kennedy @elmstedt @bruce.dambrosio @SomebodySysop @cass what are your thoughts?

I would love to chime in on the conversation, but unfortunately I don't use any of those tools. I'm sticking with the Chat Completions API in a RAG architecture and developing my own solutions in PHP. I am creating knowledge-base applications comprising thousands of documents in a variety of sizes, hierarchical structures, and semantic variances. In my humble opinion, none of the tools mentioned can address the myriad issues which arise in trying to get coherent, consistent, comprehensive, and in particular inexpensive responses from LLMs.
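
To give a flavor of what I mean by hierarchical structures: every chunk has to know where it sits in the document tree, so retrieval can be filtered or grouped by section. Below is a rough Python sketch of the general idea (my actual implementation is in PHP, and these field names are made up):

```python
# Illustration only: chunks carry their position in the document hierarchy,
# so a retriever can restrict itself to a sub-tree before scoring.
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    section_path: tuple[str, ...]  # e.g. ("Chapter 2", "2.3 Definitions")
    text: str

def chunks_under(chunks: list[Chunk], prefix: tuple[str, ...]) -> list[Chunk]:
    """All chunks whose section path starts with the given prefix."""
    return [c for c in chunks if c.section_path[: len(prefix)] == prefix]
```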

But I'm posting here because I too would like to know what the experience of others has been so far.

2 Likes

Prediction is hard. Let a thousand flowers bloom.
Personally, I view LLMs as interesting language processors to build language-based Turing machines around, and I build everything else up around them from scratch in Python. But I'm a researcher. If you are trying to build an application, and an existing framework at whatever level of abstraction fits it, why not? E.g., LangChain is 'less code' for many straightforward document-processing apps.

In general, IMHO, religious wars are silly. If you like a tool, use it. If you think your community 'owns' a term (e.g. 'no code'), well, ok, sure, I'll happily admit my context for usage may be different than yours. …

Perhaps more interesting is a core claim made earlier:
If one is doing bleeding-edge anything, an LLM won't know about it and so will be useless to help.
YUP! Bingo.
But IMHO the next gen will include real-time 24/7 ingest of new information (if not via incremental training, then via real-time updating of next-gen RAG knowledge bases; see my project Owl). So maybe not today, but soon, bots will know more, and be more up to date, than you/me.
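
A sketch of what I mean by real-time updating of a RAG knowledge base: new material is embedded and appended to the index the moment it arrives, so retrieval reflects it immediately. This is an illustration only, not Owl's code; the model name is an assumption, and a real system would use a proper vector store rather than in-memory lists.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
index_vectors: list[np.ndarray] = []
index_texts: list[str] = []

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def ingest(doc: str) -> None:
    """Call this whenever new information arrives (feed item, upload, crawl)."""
    index_vectors.append(embed(doc))
    index_texts.append(doc)

def retrieve(query: str, k: int = 3) -> list[str]:
    q = embed(query)
    sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v))) for v in index_vectors]
    return [index_texts[i] for i in np.argsort(sims)[::-1][:k]]
```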

Now what?

2 Likes

You are correct. That is what I personally see with current agent frameworks - they are good for some tasks, but not for anything that requires meticulous control.

1 Like

I donā€™t use agents or frameworks either.

The last one I peeked at was BabyAGI. I just took that code, simplified it, and called it CurtGPT :rofl:

The big reason for not using agents is that I don't think I need them. I can already do whatever I want just by coding things myself. If they have a feature I want and I have no idea how to do it, I look at their code and make my own version.

So for me, agents are for understanding implementations, or for seeing what tools might be useful to build for my own use case.
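
Stripped down, the BabyAGI pattern is just a task queue plus one completion call per step, roughly as below. This is not CurtGPT's actual code; the model name and prompts are placeholders.

```python
from collections import deque
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    r = client.chat.completions.create(model="gpt-4", messages=[{"role": "user", "content": prompt}])
    return r.choices[0].message.content

def run(objective: str, max_tasks: int = 5) -> None:
    tasks = deque([f"Make a first step toward: {objective}"])
    for _ in range(max_tasks):
        if not tasks:
            break
        task = tasks.popleft()
        result = ask(f"Objective: {objective}\nTask: {task}\nDo the task and report the result.")
        print(f"- {task}\n{result}\n")
        # Ask the model to propose follow-up work, BabyAGI style.
        new = ask(f"Objective: {objective}\nLast result: {result}\nList up to 2 new tasks, one per line, or reply DONE.")
        if "DONE" not in new:
            tasks.extend(line.strip("- ").strip() for line in new.splitlines() if line.strip())
```

Once you see it's this small, it's easy to rewrite for your own use case instead of pulling in a framework.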

2 Likes

The Assistants API is crazy token-expensive. I made my own framework on Wix instead (YES, REALLY), where I use the Wix DB and an embedding model to assign vectors for my own vector DB, plus a session-memory array for conversational memory. CHEAPER FASTER SCOOTER :slight_smile:
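
Roughly the session-memory idea, sketched in Python rather than my actual Wix/Velo code (the budget number is made up): keep the whole conversation in an array, but only send the most recent turns that fit a size budget, so every call stays cheap.

```python
# Crude character budget as a stand-in for real token counting.
MAX_CHARS = 8000

session_memory: list[dict] = []  # [{"role": "user" | "assistant", "content": "..."}]

def remember(role: str, content: str) -> None:
    session_memory.append({"role": role, "content": content})

def context_window() -> list[dict]:
    """Most recent turns that fit the budget; oldest are dropped first."""
    window, used = [], 0
    for msg in reversed(session_memory):
        used += len(msg["content"])
        if used > MAX_CHARS:
            break
        window.append(msg)
    return list(reversed(window))
```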

1 Like

So… are you looking to build your own, I guess?

The main problem with the agents is that they just don't work. The hype is way over the top compared to what these things actually achieve.

Try it, and you will see for yourself.

1 Like

Hi, and welcome to the forum, Ales! Agreed on the unpredictability of cost with the agents. But I guess with the current approach used in the frameworks it is inevitable.

Hey George

Welcome to the forum :slight_smile:

I've used various agent frameworks quite a lot. I'm kinda developing something in that realm, but I'm actually exploring an anti-agent approach. Anyway, that's not the point of the topic.

As I have faced quite a few significant issues with the existing agent frameworks myself, I was curious what others are thinking.

I wouldn't say they don't work, as such a claim wouldn't mean much to me. They work like magic for some things (personal assistants for some tasks), but are definitely poor at many others.

I think it is inevitable that OpenAI will start losing clients on the developer side, since the token cost of all your models is so high that it prevents them from being used in scaled-up projects… My plan, like a lot of others', is to just build frameworks using your models and then train our own on LLaMA.
BAD BAD HUNGRY OPENAI

I have an opinion on that, but don't want to go off-topic :wink:

I'll say it again… it's very expensive for scaled-up projects, and it gives no control over vectorizing content and feeding it back to the AI… My assistant framework is much, much cheaper in tokens, with flawless results, even though I'm more a graphic designer than a programmer. So I guess something stinks here, considering that a billion-dollar company with those brains releases something like the Assistants API after so long and made it look so WOW on Dev Day, while it's not. :money_mouth_face:

Can your assistant fully substitute for a role that costs 40k a year? I very much doubt it at this moment. In the near future, yeah. But not at the moment. Though I may be wrong, not knowing some specific roles that could be substituted :slight_smile: