Learning/Education (note that I include search under this category)
Glue code generation (this is actually huge, though not everyone can leverage it)
Doc/Code Review (good, not great, but good is enough for review)
Encode/Decode, especially one-off / ad hoc (edited to add)
Translation, natural language or code (edited to add)
Data annotation for training cheaper systems
Some folks say summarization, but it never works for me. If something is worth my consideration, it's at least worth skimming, which is always better than the summaries (GPT4 doesn't know what I don't know or need to know), if not reading outright.
I only include glue code generation because the cutoff date makes other types of code generation not very competitive for me.
I've seen some other use cases being marketed, but tbh, I believe there are dedicated SaaS products which do a better and more reliable job than anything they can do.
Edge cases will arise in all of those apps I've seen, and I am skeptical GPT4 can reliably handle them.
Maybe there is a case to be made for having less functionality but a simpler interface, but I remain unconvinced so far. That just seems like a recipe for enfeeblement.
After the last week I'd add porting.
I just finished an initial port of @stevenic's Alphawave from TypeScript to Python. Given how little I know of TypeScript and how many lines were ported, I estimate it would have taken 3-5x as long, at least, had I attempted to do it by hand, even writing a bunch of emacs macros. Maybe there are pre-built packages out there that could have done it even faster, but the ability to interact with GPT, which is knowledgeable on both sides of the port, was pretty handy.
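For what it's worth, the mechanical part of that workflow can be sketched in a few lines. Everything here (the chunk size, the prompt wording, and the function names) is hypothetical, just illustrating the idea of splitting a source file so each piece fits in the model's context window:

```python
# Hypothetical sketch of LLM-assisted porting: split the TypeScript file
# into context-window-sized chunks, then build one porting prompt per chunk.
# The chunk size and prompt wording are illustrative assumptions, not from
# any actual tool.

def chunk_source(source: str, max_lines: int = 60) -> list[str]:
    """Split a source file into chunks of at most max_lines lines."""
    lines = source.splitlines()
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]

def build_port_prompt(ts_chunk: str) -> str:
    """Build a prompt asking the model to port one TypeScript chunk to Python."""
    return (
        "Port the following TypeScript to idiomatic Python, preserving "
        "names and behavior. Reply with code only.\n\n" + ts_chunk
    )
```

The interactive part (pasting the model's replies back in, asking follow-up questions about idioms on either side) is where the real value was, and that doesn't reduce to code.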
Yes, translation is interesting for sure. And, yes, SOTA is not always strictly required.
That said, I am beginning to wonder if companies which lean on GPT4 toolsets are at risk of undermining the skills of their employees, resulting in the enfeeblement issue above.
One might argue that it leaves us more time to do "higher level" work, but I find the lower-level tasks are like whetstones: they help grind and sharpen one's expertise. In programming there actually aren't that many opportunities to do lower-level tasks if architecture is done correctly.
It'll be very interesting to see whether the scaling laws hold or whether we are at asymptotic capability. Even odds, I would say, that scaling laws will not hold and we are bumping up against the limits, absent a breakthrough.
If the intuition is correct that GPT4 is largely a powerful stochastic parrot, I don't think it'll get much smarter than the average of its training data.
I built an app for querying numerical databases, football data in this case.
But I see LLMs as drivers for tools.
For example, in my field of sports analytics: a manager can start searching for players without his data analyst. He can just ask who the best player in the league is, taking these stats into consideration.
I also plan to generate maps and graphs from these data in future upgrades.
I think there are MANY things LLMs can do reliably, if we treat them not as the whole system but as a part of the system, or at least break the work down into different components/agents, with differing temperatures and all.
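As an illustration of that decomposition, each component might run at its own sampling temperature. The component names and values below are made up for the sketch, not taken from any real system:

```python
# Illustrative decomposition of an LLM app into components with different
# sampling temperatures; the names and values are hypothetical.

AGENT_CONFIG = {
    "router":       {"temperature": 0.0},  # deterministic intent classification
    "sql_writer":   {"temperature": 0.0},  # precise, repeatable code generation
    "brainstormer": {"temperature": 0.9},  # looser sampling for creative output
}

def pick_temperature(component: str) -> float:
    """Look up the sampling temperature for one component of the system."""
    return AGENT_CONFIG[component]["temperature"]
```

The point is simply that "reliable" components pin the temperature near zero, while open-ended ones can afford variance.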
Imagine a CRM / lead management SaaS app that gives you deep visual dashboards, tracks tonnes of information about your customers, and really allows you to do sophisticated querying and lead curation.
Now, say someone wants to replace that with a prompt: "Give me the top 10 best leads."
The argument that someone made was that LLMs are smarter than most people, and while they can't do sophisticated, expert-level work, they are "good enough" and at least provide a reliable baseline of competence.
Yes, but what about the many people who don't have access to, or more likely have no idea how to exploit, the more advanced capabilities of the CRM tool you describe?
Of course, the same could be said about LLMs as tools. How many people can write a decent prompt?
Imagine that same CRM, with all the dashboards and everything, but instead of clicking and choosing which team and what sections you need highlighted, you just ask for it and all the selection is done for you. That's a superior experience to a CRM interface, IMHO. @bruce.dambrosio I think "a simple prompt by the end user" is the aim: to make it as simple as possible for the end user.
BUT in the backend, the user's prompt is appended to other parts of a far more sophisticated prompt.
For example:
"Give me the top 10 best leads."
would be converted into something like
"Using the columns past revenue, total number of sales, and proximity to x (you decide these factors earlier), which are the top 10 best leads?"
and Python code would be generated to run the CRM, heck, via Selenium if needed.
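A minimal sketch of that backend expansion step, assuming the ranking factors were chosen earlier by the app builder. The criteria string and the function are illustrative, not from an actual product:

```python
# Illustrative backend "prompt expansion" layer: the app builder decides the
# ranking criteria ahead of time, and the end user's short prompt is rewritten
# against them before it ever reaches the model. Names are hypothetical.

RANKING_CRITERIA = "past revenue, total number of sales, and proximity to x"

def expand_prompt(user_prompt: str) -> str:
    """Rewrite a short end-user prompt into an explicit, criteria-bound one."""
    return (
        f"Using the columns {RANKING_CRITERIA}, answer the following request "
        f"against the CRM lead table: {user_prompt}"
    )
```

The end user still only types "Give me the top 10 best leads."; the sophistication lives behind the interface.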
Yeah, I dunno. When I first got access to GPT4 I was really excited about how I could use it to expand my mind and do things I had not done before.
A good example is spreadsheets and the first computers.
I find too much is being done that just seems to replace things we can already do. I just get depressed when I see that. Probably just me.
There are some applications, like medical, where replacement is uplifting because equitable access is so important. But for other things, I guess a more productive economy will benefit everyone.
GPT4 is what you make of it tbh.
There are many exciting use cases; look around at projects coming out on YouTube, Reddit, etc. Generating code alone is groundbreaking (did Excel do that?), but we are in the TikTok era, where we need daily stimulation of new things.
The AlphaDev paper that just came out covers something I know folks have already been working on (at least I have) and is interesting, but in general I find code generation capabilities to be mediocre at best and not particularly interesting. The cutoff date is really bad, and even without it, it's very hard to generate anything novel. Worse, GPT4 can barely recognize whether something is novel, never mind generate it.
Mediocre code, believe it or not, can have negative value. Security is one of the primary reasons for this, but lack of maintainability is another. All too often, projects crash and burn because they are poorly architected.
I will agree there is some value in POC generation, however. But you learn a lot doing POCs, and auto-generating them might just be more enfeeblement.
Future AI/AGI might be a different animal for sure, but I have two reservations. One, we may be closer to an asymptote than folks realize. Two, such a world will look entirely different from the one we are in right now; so different that all prior discussion will be barely relevant.
For that latter reason I donāt really bother thinking about it.
Boilerplate text generation - I know everyone talks about this from 3.5 onwards, but it's a lifesaver for things that are text heavy, like legal contracts, text-based adventure games, essays, etc., especially knowing that it can come out grammatically correct and error-free in many languages.
Search replacement (i.e., ranked lists) - again, quite well known, but it is basically displacing Google, since you can ask it for the "best 5 movies" on a particular topic or the "10 most knowledgeable people ranked by likelihood to respond." GPT's ranking isn't perfect, but it is vastly superior to any other product.
Glue code generation - This is huge, especially in obscure programming languages or other areas where you can go 0 to 60 in one prompt. I think the stitching between snippets will get better with the growth of LangChain and other tools.
Brainstorming - Yes, largely because of the ranking function and access to disparate data sources.
Learning/Education - probably more pronounced once you have more interactive modalities. I'm still not sure what learning is best performed via a text prompt.
Doc/Code Review - yup, including rating of plans or asking it to "consider ways this letter could be better."
Search replacement I sort of conflate with learning. Itās great for finding things I am unfamiliar with and just getting into, but for stuff that matters and I need precision, I still rely on keywords and skimming extensively.
Encoding/decoding is another one I missed for sure. I use that a lot, e.g., "format this slightly jumbled info into a markdown table." That's pretty awesome.
I think the big story that's going to come out is the danger of enfeeblement. You learn a lot of things when you do knowledge work that sharpen your skills. Tools that empower knowledge work, like calculators, spreadsheets, and databases, are good, but anything that replaces reasoning, connecting ideas together, is going to make folks rusty and less valuable to their employers.
Boilerplate text generation is fine, as long as you know exactly what's supposed to be in the boilerplate and it's just auto-completion.
The ultimate result, which ChatGPT produced when asked to rewrite it at a fifth-grade reading level, began with a reassuring introduction:
If you think you drink too much alcohol, youāre not alone. Many people have this problem, but there are medicines that can help you feel better and have a healthier, happier life.
Doctors are emotionally traumatized by a lot of the things they have to communicate to patients. It could be very compelling to have an AI that doesn't get traumatized.
Basically this sort of goes to the other proven use case of LLMs, which is companionship, and brings it around to a more utility-oriented use case.