ChatGPT Providing Broken or Outdated Source Links

Hi @mpmpfeffer

To add to the excellent reply by @raymonddavey, let me elaborate.

The underlying large language models do not store links as “references”, because they are not “reference models”; they are “language models”. This means the many billions of pieces of data used to pre-train these models serve only to predict the next sequence of text based on a prior sequence of text.

You, @mpmpfeffer, are making the common mistake of assuming that a language model is a reference model or an expert system.

This is also why @raymonddavey correctly points out that these models “just make things up”, which is a way of saying that they can create URLs and links (or any other text) out of “thin air”. The model predicts what a URL reference should look like, as a language model, rather than looking one up as an accurate reference model would.
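To make the “thin air” point concrete, here is a toy sketch. It is not how GPT works internally (GPT is a large neural network, not a lookup table of word pairs), and the training URLs, `tokenize`, `train`, and `generate_all` are all made up for illustration. The sketch only learns which URL piece tends to follow which, and that alone is enough to produce well-formed links that never appeared in its data.

```python
from collections import defaultdict

# Hypothetical training data: made-up URLs, not real pages.
TRAINING_URLS = [
    "https://example.com/docs/python/intro",
    "https://example.com/blog/python/tips",
    "https://example.org/docs/rust/intro",
]

START, END = "<s>", "</s>"


def tokenize(url):
    """Split a URL into scheme, host, and path segments, with start/end markers."""
    scheme, rest = url.split("://", 1)
    return [START, scheme + "://", *rest.split("/"), END]


def train(urls):
    """Record, for each token, the set of tokens seen directly after it."""
    followers = defaultdict(set)
    for url in urls:
        tokens = tokenize(url)
        for prev, nxt in zip(tokens, tokens[1:]):
            followers[prev].add(nxt)
    return followers


def generate_all(followers, prefix=(START,), max_len=8):
    """Yield every URL the model can produce by chaining seen token pairs."""
    last = prefix[-1]
    if last == END:
        # prefix[1] is the scheme; the remaining tokens are host and path.
        yield prefix[1] + "/".join(prefix[2:-1])
        return
    if len(prefix) >= max_len:
        return
    for nxt in sorted(followers[last]):
        yield from generate_all(followers, prefix + (nxt,), max_len)


model = train(TRAINING_URLS)
generated = list(generate_all(model))
invented = [url for url in generated if url not in TRAINING_URLS]

# These look like valid links, but the model never saw any of them.
for url in invented:
    print(url)
```

With these three training URLs the sketch can produce eight URLs, five of which (for example `https://example.org/docs/python/tips`) appear nowhere in its data. Nothing in the model records whether a page exists; it only records which piece of text tends to follow another.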

This is not accurate, @krisu.virtanen. ChatGPT has no “knowledge”. It is a language model designed and trained to predict sequences of text, so it has no “knowledge” of anything beyond how to predict some text based on some prior text.

This is not “knowledge”; it is only “data” used to predict natural language sequences of text. There is no actual “knowledge”, and ChatGPT “knows nothing”. ChatGPT is a fancy text auto-completion engine that generates text based on user prompts.

There is a very large and significant technical difference between “data” and “knowledge”. It is important not to use technical terms like “knowledge” in place of “data”. GPT models are pre-trained language prediction models based on a body of data from around the “April 2021” time frame (I need to confirm it was April). Fine-tuning these models can bring in new data, but that data will be used for predicting natural language text, and is not “knowledge” per se.

Hope this helps.
