Custom GPTs cannot even retrieve information from their own custom knowledge?

I have been building custom GPTs since November, and given that my dissertation is about designing educational GPTs, I have taken it upon myself to learn everything there is to know about them, and I take that mission quite seriously. I have also spent countless hours learning and tinkering on my own to perfect my GPTs' responses. Most AI gurus depict ChatGPT as the best invention since sliced bread; however, at times I feel they are more concerned with appeasing their sponsors than with depicting reality.

Anyone who has built a custom GPT knows that, irrespective of the format or clarity of one's instructions, custom GPTs have a tendency to revert to their default behavior. If you doubt me, ask one to generate all its responses exclusively in British English, or to provide full, unredacted URLs that include the protocol, domain name, and so on. It might comply once or twice, but after that it's back to hallucinating links. I ask, what sense is there in having the facility to provide instructions when the default weights render one's custom instructions useless?
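For example, instructions along these lines (a simplified paraphrase of what I actually put in the builder, not the exact wording) stop being followed after a reply or two:

```
- Write every response exclusively in British English.
- When citing a source, give the full, unredacted URL, including the
  protocol and domain name. Never shorten, redact, or invent links.
```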

Another irritating aspect of custom GPTs, one that makes them practically useless in an academic setting, is their tendency to randomly generate lazy (partial) responses. Imagine prompting the GPT for a full and comprehensive list of learning outcomes for grade 11 science, and it generates only a fraction of them even though its knowledge file contains every one of them. Why does it do this? My guess is that there is a default setting meant to conserve energy while still generating an "acceptable" response.

Another flaw I find is its inability to retrieve the URLs included in its instructions and use them in its output. Either it hallucinates totally random URLs, or it grabs an unrelated URL from its knowledge base. Finally, there is what I call "digital dementia": the quality of responses deteriorates over time, something mentioned in various posts and something I brought to the attention of OpenAI, though to date I have received no response.

On a hopeful note, custom GPTs hold a lot of promise in the educational sphere. Never before have educators had a tool for designing programs and apps that make sense in their own context and through their own lens. It is truly a shame that custom GPTs do not live up to that promise.


Yes, it's true, it's a pity that GPTs don't live up to expectations. I also haven't had the experience I hoped for when using a GPT with the latest version of a Python library 😔

Yes, but I have also noticed absolute confidence combined with flat-out ignoring the key points of a prompt: explicitly telling it not to do something, over and over again, and having it acknowledge the constraint before responding and then ignore it anyway. My guess is that when you are using a GPT heavily, or when there are cache issues, it falls back on some technique that does not allocate resources properly and ends up with what is best described as "wires crossed."

Thanks for your reply. I understand that explicitly stating what not to do is far less effective than telling it what to do and how to do it. I have tried multiple iterations of instructions and prompts in different GPTs, and even when one starts off following the instructions, over time it still reverts to its default behavior; there is no consistency. I don't think it is a question of caching, as this performance degradation is a recurring issue for multiple users.

It can be frustrating, I know. ChatGPT and LLM-based systems can be flaky, surprisingly stupid at times, and genuinely seem to get lost in long sessions, becoming dumber and making more mistakes as things go off the rails.

That being said, ChatGPT is already a useful tool in its current state and is only destined to improve as issues are found, diagnosed, and fixed. I use it daily now, and it is a hugely useful assistant, dumb as it is, for coding. It does the grunt-work typing, has knowledge of APIs that exceeds my own, and despite the frustration and missteps, it can produce working "hacks" that I find useful.

Getting good output from ChatGPT is a skill that must be developed. It is possible to get something useful from it, but right now you have to be sufficiently expert in the area to recognize when it has gotten it wrong and correct it.

In the areas where I have tried ChatGPT and other AI systems, one of the most frustrating things is that the human data it trained on is poor. The majority of human-written source code is simply dreadful: full of very poor habits and mistakes, and often it simply does not work. The wide net that the training casts gives it many advantages and genuinely useful emergent properties. However, it is also indiscriminate when it comes to my area of expertise. Particularly vexing is the parroting of arguments in favor of things like global variables, multiple points of exit, sloppy constructs, etc.

I recently wrote this on Reddit:

“As with other languages, lots of C source, perhaps most, is badly written. Old school rules of thumb are still ignored by most programmers partially because they don’t know them, but also crazily because they disagree with them. Global variables should never be used. Functions should do one thing well. Code blocks (like functions) should only have a single point of entry and a single point of exit. Resource allocations (memory is just another resource) should be explicit and deallocated essentially as a single allocate/use/deallocate construct. If you ‘get’ something, you should also ‘unget’ it. I never use assert() because it violates ‘single point’ and possibly deallocation and other cleanup. It interferes with graceful error recovery.”
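To make the allocate/use/deallocate point concrete, here is a minimal sketch of the shape I mean (the function, file name, and buffer size are placeholders, not from any real project):

```c
#include <stdio.h>
#include <stdlib.h>

/* Single entry, single exit, explicit acquire/use/release in one block.
   read_first_line() is a hypothetical example for illustration only. */
static int read_first_line(const char *path, char *out, size_t out_size)
{
    int status = -1;              /* pessimistic default, set on success */
    FILE *fp = fopen(path, "r"); /* acquire the resource...             */

    if (fp != NULL) {
        if (fgets(out, (int)out_size, fp) != NULL) {
            status = 0;           /* ...use it...                        */
        }
        fclose(fp);               /* ...and release it in the same block */
    }

    return status;                /* the one and only exit point         */
}

int main(void)
{
    char line[256];

    if (read_first_line("example.txt", line, sizeof line) == 0) {
        printf("%s", line);
    } else {
        fprintf(stderr, "could not read example.txt\n");
    }
    return 0;
}
```

No globals, no assert(), and the error path falls through the same exit as the success path.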

There ensued, as I predicted, some arguments for when to use ‘assert()’ by people who I am pretty sure have less than my four decades of programming behind them.

For people training models: I don’t know if they express it this way, but I think some have recognized that Pareto distributions mean that about 4/5 of the training data is poor and should be stripped out. If it is not being done already, I would suggest using things like Google’s algorithms to assess the quality of data by the nature of references made to it.
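To illustrate the reference-based idea very crudely, here is a toy sketch, nothing like what a real training pipeline (or Google's actual algorithms) does, that scores documents by inbound references and keeps only the best-referenced fifth:

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy illustration only: score each document by how many other
   documents reference it, then keep the top 20 percent. */
typedef struct {
    int id;
    int inbound_refs;   /* how many other documents point at this one */
} doc_t;

static int by_refs_desc(const void *a, const void *b)
{
    const doc_t *da = a, *db = b;
    return db->inbound_refs - da->inbound_refs;   /* descending order */
}

int main(void)
{
    doc_t corpus[] = {
        {1, 40}, {2, 2}, {3, 0}, {4, 17}, {5, 1},
        {6, 0}, {7, 3}, {8, 55}, {9, 0}, {10, 1},
    };
    size_t n = sizeof corpus / sizeof corpus[0];
    size_t keep = n / 5;            /* the Pareto-style top fifth */

    qsort(corpus, n, sizeof corpus[0], by_refs_desc);

    for (size_t i = 0; i < keep; i++) {
        printf("keep doc %d (%d inbound references)\n",
               corpus[i].id, corpus[i].inbound_refs);
    }
    return 0;
}
```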

Anyway, warts and all, ChatGPT is a useful tool, and it gets more useful as you gain skill with prompting and as your knowledge and skill in your craft grow.

Wow, as if someone took a baseball bat, went into my GPT and thrashed it!

I had a custom GPT that used to work fairly well, searching multiple PDF files in its knowledge. I hadn't used it since 4o and 4o mini took over ChatGPT, and now it is close to useless. I tried reducing the knowledge files to a minimum, but it struggles with even the most basic test queries I have.

Another backward step for Custom GPTs. I won't be surprised if OpenAI sunsets them soon… There has been absolutely no improvement for months, only degradation of capabilities.

I fully understand the feeling! I cannot understand the business model OpenAI is adopting. It’s not as if it doesn’t have any competition.

I feel that mine is operating pretty smoothly. I do notice that the longer the conversation goes on, the more it can revert to its old ways, but I don't seem to have these issues.

I have noticed the same. I built a knowledge base that is currently only 7 pages long, because I'm still building it. While testing, I explicitly gave the instruction that under no circumstances should it use information that is not from the knowledge base without notifying the user that it is not from the knowledge base. It's okay if something is made up and not from the doc, but then it should clearly say so (I put this prompt in the instructions): "Unfortunately I can't find any information about this in the knowledge base, but here is an answer that might help you:"

Still, it spits out random made-up replies that are somewhat true, but not completely, because they are not from the doc. How can I give anybody access to this GPT if I can't trust it not to make up random stuff on the fly? That makes the whole point of it pretty useless.


Have you ever noticed that in a long thread the responses can be fantastic, and then, if you throw a "what if" at any core concept or idea you've been discussing, the responses seem to forget all the document materials and instructions you've provided? I think this is what is happening with every GPT "upgrade" to the entire program… it changes things unintentionally… That would make sense. Hopefully, they get it worked out.