“mini” models, meaning any AI model with a lower parameter count, simply don’t have as much embedding space to encode layers of pretrained knowledge. GPT-4 is going to do a better job reciting accurate statistics about the 1922 World Series team lineups, listing the Amiga game developers who worked at a particular company, or even writing natively in ᐃᓄᒃᑎᑐᑦ (Inuktitut) than a more compressed model will.
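
To make that comparison concrete, here is a minimal sketch of how you might probe long-tail recall side by side, assuming the official `openai` Python SDK; the model names and prompt are illustrative choices, not a claim about any specific pairing:

```python
# Hedged sketch: compare long-tail factual recall across a large and a "mini" model.
# Assumes the `openai` SDK is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

PROMPT = "List the starting lineups of the 1922 World Series."

for model in ("gpt-4", "gpt-4o-mini"):  # illustrative model names
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,  # reduce sampling noise so differences reflect the model
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)
```

Both models will typically answer confidently either way, so you still have to check the output against a reference source; the point is that the compressed model fails that check more often on long-tail facts like these.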