Factual Information via various models within an API Assistant

I opened a new InPrivate browser window with no login or session and asked ChatGPT the following question:

List the regulations that a marine vessel would be in violation of within Title 33 and Title 46 of the Code of Federal Regulations (CFR) if the survey inspection report contained the observation that “No throwable PFDs were present onboard”. Do not provide any extraneous text. Do not provide the full text of the regulation. Only respond with the Code and Title through the first period, (i.e., “33 § 159.83 Level indicator.”)

It responded:

33 § 175.15 Throwable PFDs
46 § 28.215 Personal flotation devices.

46 § 28.215 is actually “Guards for exposed hazards.” and does not pertain to my question.

I then created FOUR assistants in the API, all with a temperature of 0.1 and a top-p of 0.9. I asked the same question on new threads with four different models. Here are the responses:

CFR Locator Assistant [gpt-4o-mini]

33 § 175.15 Personal flotation devices.
46 § 25.25 Personal flotation devices.
46 § 28.110 Personal flotation devices.
46 § 31.10-13 Personal flotation devices.
46 § 46.10-5 Personal flotation devices.

Other than the first result, the rest are incorrect. A garbage response. These codes don’t even exist; it simply fabricated them.

Next, I asked the gpt-4 assistant on a new thread. It responded:

CFR Locator Assistant [gpt-4]

33 CFR § 175.17 Equipment requirements.
46 CFR § 25.30-20 Life preservers or other lifesaving equipment.

46 § 25.30-20 is actually “Fire extinguishing equipment required.”
When I pointed that out, it replied:

CFR Locator Assistant [gpt-4]
Apologies for the confusion. The correct regulation is:
46 CFR § 28.110 Personal flotation devices.

Are you serious? This implies it knowingly responded INCORRECTLY the first time. But it was still wrong:

User

That is still incorrect, as that pertains to PERSONAL flotation devices not THROWABLE.

CFR Locator Assistant [gpt-4]

I apologize for the confusion. The correct regulation is:

33 CFR § 175.15 - Throwable personal flotation devices.

User

33 § 175.15 is “Personal flotation devices required.”

CFR Locator Assistant [gpt-4]

I apologize for the confusion. The correct regulation is:

33 CFR § 175.16 - Throwable personal flotation devices.

Still wrong. It’s completely useless.

I then asked the gpt-4-turbo assistant. It responded with only one result:

CFR Locator Assistant [gpt-4-turbo]

33 CFR § 175.17 Throwable personal flotation devices.

33 § 175.17 is actually “Exemptions,” which explains that canoes are exempt from the throwable PFD requirement.

Finally, I asked the gpt-4o assistant which responded:

CFR Locator Assistant [gpt-4o]

33 CFR § 175.15 Personal flotation devices required.
46 CFR § 25.25-5 Life preservers and other lifesaving equipment required.

Both are incorrect. The correct answer is:

46 CFR § 28.115 Ring life buoys.

(a) Except as provided in paragraph (b) of this section and § 28.305, each vessel must be equipped with a throwable flotation device or a ring life buoy as specified…

I don’t understand. What is the point of this product if it doesn’t have access to public government documentation and regulations, can’t think, can’t comprehend, can’t remember, is incorrect more than 50% of the time, and then lies and fabricates answers in its eagerness to please?

The API is a platform for direct access to AI models, for use in a product that you must develop yourself.

ChatGPT has web search enabled.

On the API, web search is a specific pay-per-use internal tool that must be enabled and employed via the Responses endpoint, or a special “always searching” Chat Completions gpt-4o model.
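For illustration, here is a minimal sketch of enabling that tool on the Responses endpoint, assuming the current openai-python SDK (the model name, tool type, and question are placeholders, not something tested in this thread):

```python
# Minimal sketch: Responses endpoint with the built-in web search tool enabled.
# The tool is billed per use; model and tool names are assumptions that may
# change between SDK/API versions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # opt-in, pay-per-use internal tool
    input="Which sections of 33 CFR and 46 CFR require throwable PFDs or ring life buoys?",
)

print(response.output_text)
```

Without a tool like this (or retrieval against the actual CFR text), the model answers from its training data alone.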

An application programming interface (API) is an interface for programming applications. It requires an understanding of the facilities offered, and of those you will need to build yourself, to suit the particular end-user product you intend.

An assistant is a layer on top of the underlying model endpoints: it simply stores instructions, and then creates threads and sends messages to a specific model.
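As a rough sketch of what the setup described at the top of the thread boils down to, assuming the current openai-python beta Assistants endpoints (the instructions and model name here are placeholders; temperature 0.1 and top-p 0.9 are taken from the post):

```python
# Sketch of the Assistants flow: the assistant stores instructions and sampling
# settings; each question goes to a fresh thread. Method names follow the
# current beta namespace and may differ between SDK versions.
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="CFR Locator Assistant",
    model="gpt-4o-mini",  # swapped per test: gpt-4, gpt-4-turbo, gpt-4o
    instructions="List the regulations within 33 CFR and 46 CFR that the inputted observation would violate.",
    temperature=0.1,
    top_p=0.9,
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content='Observation: "No throwable PFDs were present onboard"',
)

run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)
if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    print(messages.data[0].content[0].text.value)  # newest message first by default
```

Nothing in this flow fetches the CFR; the assistant only passes the stored instructions and the thread messages to the chosen model.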
The issue at hand is the models and the responses they provide.
I’m already a subscriber. I already pay for the service. My API libraries query my assistant and rewrite content beautifully. But my assistant also attempts to provide recommendations based on observed deficiencies on marine vessels.
If a model does not have access to certain data, it should not respond with fabricated and erroneous data. WTF is that all about? How is that a good business decision? It’s like Oracle selecting random data from random rows in random tables and saying: I don’t care what you asked for, this is what you’re getting.
The PLAYGROUND is a test bed for the Assistants using the underlying API. These queries charge me the same as if I were requesting from another machine. There is no difference. The fact that every model doesn’t have a snapshot of public-facing government regulations that are 30 years old is pathetic. But more concerning is that every model fabricates erroneous information in response to the EXACT SAME question.
That can’t be explained away with a bot-generated word salad response.

You really should look into how large language models work.

They are token predictors. Training on a large corpus of works during pretraining influences the prediction of words and word parts.

They are not a database to be searched for instant reproduction, encompassing 30 years of regulations from every world government, or the entire world’s knowledge, or how many houses in Amsterdam are blue (add them up), all somehow contained in the memory of GPUs in a server.
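To make “token predictor” concrete, here is a toy sketch with an invented vocabulary and invented scores (nothing from a real model) showing how one next token is sampled under the temperature 0.1 and top-p 0.9 settings used in the experiment:

```python
# Toy next-token sampling with temperature and top-p (nucleus) filtering.
# The "vocabulary" and scores below are invented purely for illustration.
import math
import random

vocab_logits = {"175.15": 4.2, "175.16": 3.9, "175.17": 3.8, "28.115": 2.1, "28.110": 2.0}

def sample_next_token(logits, temperature=0.1, top_p=0.9):
    # Softmax with temperature: lower temperature sharpens the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    probs = {tok: v / total for tok, v in exps.items()}

    # Top-p: keep the smallest set of tokens whose cumulative probability >= top_p.
    kept, cumulative = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    tokens, weights = zip(*kept)
    return random.choices(tokens, weights=weights)[0]

# The highest-scoring continuation wins, whether or not it is the correct citation.
print(sample_next_token(vocab_logits))  # almost certainly "175.15"
```

The output is whatever continuation scores highest, not a record looked up from the CFR.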


Fun with impossible facts.


I don’t see how that’s relevant.
The OFFICIAL number of people living in New York and the actual number are both factual.
According to chatgpt-4o-latest, 33 CFR § 159.123 is:

33 CFR § 159.123 - Sewage tank level indicator
Each sewage tank on a vessel must be equipped with a means to indicate visually the amount of sewage in the tank unless the tank is emptied by a pump that is equipped with a device that prevents overfilling.

When the actual code is:

§ 159.123 Coliform test: Type I devices.

(a) The arithmetic mean of the fecal coliform bacteria in 38 of 40 samples of effluent discharged from a Type I device during the test described in § 159.121 must be less than 1000 per 100 milliliters when tested in accordance with 40 CFR Part 136.

(b) The 40 samples must be taken from the device as follows: During each of the 10-test days, one sample must be taken at the beginning, middle, and end of an 8-consecutive hour period with one additional sample taken immediately following the peak capacity processing period.

In my instructions for my Playground Prompt, I provided:

List the regulations that cover the inputted text within Title 33 and Title 46 of the Code of Federal Regulations (CFR).
The response should only contain exact, precise, verbatim, authoritative Code and Heading followed by the full regulation text in the following format:

CFR § . -

Only provide exact, precise, accurate data from the authoritative CFR, not synthesized summaries or outlines. Respond with "No violations found" if no CFR pertains to the inputted text.

When I type in:

sewage tank level

It responded with the completely manufactured nonsense “33 CFR § 159.123 - Sewage tank level indicator”.

When I tell it that it was wrong and to only provide EXACT, PRECISE text, it apologizes and says that its response was unacceptable and not appropriate for enterprise environments. FINALLY… it provided something accurate!!!
Other than as a virtual girlfriend, this technology is unusable in real-world situations unless you upload megabytes or gigabytes of data with each request. And even then, it is inaccurate. All in a ploy to make you run dozens of requests and be charged for dozens of times the tokens.
Sad. If government agencies and large corporations are relying on AI to perform mission-critical functions, it’s no wonder they’re so inefficient.
OpenAI needs to come clean and warn people ahead of time that these services are not suitable for anything other than entertainment.

“Warning: these solutions are to be implemented by those with an understanding of the technology and of the need to ground AI models when seeking concrete yet efficient recitations of documents, as if from a document retrieval system. Contact a solutions provider if unfamiliar with a path forward.”
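In practical terms, grounding here means handing the model the authoritative text to quote from rather than relying on its training data, for example by attaching the relevant CFR parts to an assistant through the file_search tool. A minimal sketch, assuming the current beta Assistants endpoints (file names and instructions are placeholders, and method names may differ between SDK versions):

```python
# Sketch of one grounding path: upload the relevant CFR parts once, attach them
# to the assistant with the file_search tool, and instruct the model to quote
# only from the retrieved documents. File names are placeholders.
from openai import OpenAI

client = OpenAI()

# One-time setup: a vector store holding the authoritative documents.
vector_store = client.beta.vector_stores.create(name="CFR Titles 33 and 46")
with open("cfr-title-46-subchapter-C.pdf", "rb") as f:  # placeholder document
    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id, files=[f]
    )

assistant = client.beta.assistants.create(
    name="CFR Locator Assistant",
    model="gpt-4o",
    instructions=(
        "Answer only from the attached CFR documents. Cite the section number and "
        "heading verbatim. If the attached documents do not cover the input, reply "
        "'No violations found.'"
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```

With the documents attached, the model can quote retrieved passages instead of predicting plausible-looking section numbers.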
