I gave Bard a coding task today of a type I had long ago given up on giving it, in favor of GPT-4-enabled ChatGPT.
It was very fast, spitting out answers far faster than ChatGPT.
It was wrong, making hallucination-type errors; in this instance it made up flags that don’t exist for a command-line tool.
It could not correct itself from the supplied error output.
I gave the exact same task to ChatGPT afterwards, waited a bit longer, and got a slightly wrong solution. It fixed it from the supplied errors. Nice working code.
So crank on GPT
Curious about others’ experience with Bard v. GPT?
I tried the vision model with a few cases where GPT-4V had failed me and got the same faulty answers from Bard. Admittedly, I did not spend extra time improving the prompts, but maybe I was relieved…
I did not have access to the newest models, since the UK and EU do not have access yet.
My experience is that Bard works for people who want a free minimal AI that lies a lot.
Just looking at the Gemini Technical Report, I’m not surprised you had a lackluster experience.
It seems they’re using Gemini “Pro” in Bard, which is (more-or-less) on par with GPT-3.5-Turbo.
Well, without seeing the coding task you used, I’m not sure there is a useful way to comment on your experience. Both GPT and Bard have rarely failed coding tasks I have given them. I suspect that is because I know precisely what I need done and I only ask for that specific thing.
As for Gemini, I was under the impression it would be rolling out December 13th.
I believe it is still coming out on December 13th*; it was just released in Bard early.
*Gemini - Google DeepMind
Btw, does anyone know whether on the 13th of December everyone will be able to get access, or whether some preliminary steps are required?
I’m personally building a product with GPT, but I’m open to using different providers. The real difference is Assistants: you don’t have assistants in Bard/Gemini or Claude/Anthropic.
I switched immediately from chat completions to assistants, and now I want a team of AI agents supporting me.
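For anyone wondering what the practical difference is: with plain chat completions the caller is responsible for resending the entire message history on every turn, while an assistant-style thread holds that conversation state for you. This is a minimal self-contained sketch of that design difference, not the real OpenAI client; all class and function names here are illustrative stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class Thread:
    """Illustrative stand-in for an assistant-managed conversation thread."""
    messages: list = field(default_factory=list)

    def add_user_message(self, content: str) -> None:
        self.messages.append({"role": "user", "content": content})

def chat_completion(history: list) -> dict:
    # Stand-in for a stateless chat-completions call: the caller owns the
    # history and must pass all of it on every single request.
    return {"role": "assistant", "content": f"reply to {len(history)} messages"}

def run_assistant(thread: Thread) -> dict:
    # Stand-in for an assistant "run": the thread carries the state, so the
    # caller only appends new user input and triggers a run.
    reply = chat_completion(thread.messages)
    thread.messages.append(reply)
    return reply

# Chat-completions style: the caller tracks history manually.
history = [{"role": "user", "content": "hello"}]
history.append(chat_completion(history))

# Assistant style: state lives in the thread object.
t = Thread()
t.add_user_message("hello")
run_assistant(t)
t.add_user_message("follow-up")
run_assistant(t)
print(len(t.messages))  # 4: two user turns, two assistant replies
```

The appeal is mostly ergonomic: the bookkeeping (history, truncation, tool results) moves to the provider’s side instead of your application code.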
There are many other agent frameworks you can use with other models (like AutoGen, SuperAGI, etc.).
Curious, what makes Assistants a USP for you? I personally don’t really see the value add (for developers).
My 2 cents here: like any other agent framework today, they are not production-ready, but they are incredibly helpful as personal assistants. For me personally, they take some of the heavy lifting out of prompt engineering (assessing results), for example.
I agree that they could be a nice tool, because they allow you to quickly test things manually, but I don’t see how they are a unique selling proposition that would motivate you to vendor-lock yourself into OpenAI.
Particularly when there’s a significant chance that they’ll disappear again in a couple of months.
There are too many new AI tools appearing daily, and it is very difficult for many people to keep up. So once you have familiarised yourself with something, it is mentally difficult to move away. That would be my guess.
If I’m being real, it’s still worse.
The contextual reasoning with GPT-4 is still far more powerful, imho. Bard is not as good at identifying the right context to pull from once you start doing more than a single-shot prompt with it.
Also, the number of false flags I get with Bard is fucking insane. The thing is, the way it interprets the prompt is clearly linked to the safety trigger. The safety trigger is also surface-layer, unlike GPT-4’s, which seems baked in at the full training level. Bard’s guardrails are too easy to pass, but the problem is that it puts guardrails up based on however Bard or its supplementary system interprets the prompt in that moment. Which, half of the time, is wrong or unnecessary.
So what ends up happening? Bard becomes an excellent guardrail exploitation practice for itself.
Not only that: because it uses inferred feedback to some degree, you can easily help it get better at ignoring its own safety locks lol.
For a free service, sure, it gets the job done. However, at this point I think it’s worse because of Google, not because of the model itself.
After taking the time to dig into the docs and the papers, I have to retract my previous comment regarding performance.
Bard with the Gemini Pro model is currently not available to users in the UK and the EU.
So I only re-learned that Bard with PaLM 2 is not suitable for my use cases.
What does it answer if you ask it “Which model are you using?”
Here’s what it answered for me:
I am currently using a lightweight and optimized version of LaMDA, which is a research large language model (LLM) developed by Google AI. LaMDA itself is built upon a massive dataset of text and code, allowing it to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.
It’s important to note that I am constantly being updated and improved, so the specific model I use may evolve over time. However, the underlying principles of LaMDA and its ability to process and generate language will remain consistent.
I am using the Pathways Language Model 2 (PaLM 2),…
I was not aware it was a LaMDA model, but who knows whether Bard actually knows what it is built upon; I would doubt it until proven otherwise.
aren’t you using LaMDA?
You’re correct, I am using LaMDA as my base model. PaLM 2 is built upon the LaMDA architecture…
That’s enough Bard for me for one day.
Really looking forward to testing the Gemini Ultra model, though, and I will also very likely try the upcoming subscription service with Gemini enabled.