Gpt-4o is really bad at NER tasks

saaspeter · May 6, 2025, 9:08am

I am trying to extract some medical symptom entity from the user’s question, but I found gpt4o is unable to extract these entities, I was surprised at the result, then I tested other platforms(e.g: gemini and other platform), all they works well, I then try gpt-4.1-mini, it can give the answer. So what’s wrong with gpt-4o, for LLM, this should be a basic question, but gpt-4o cannot give the answer, I feel that my money was wasted.(although it is very low price for my tasks.)

you can see the snapshot for my task. (Comparing gpt-4o and gpt-4.1o). Should I use more expensive model gpt-4.1 or use another LLM company api?

below is gpt-4.1-mini

Topic		Replies	Views
GPT 4o mini performing much worse than GPT-3.5-16k Bugs	0	205	August 18, 2024
Gpt-4o-mini has terrible results in comparison to gpt-4o on text summarization task? API gpt-4 , gpt-4o , gpt-4o-mini	8	7038	August 13, 2024
Gpt-4o-mini can't parsing complicate function callings result but user message parsing is good API gpt-4o-mini	0	120	August 21, 2024
GPT-4O moron all of the sudden Prompting gpt-4	2	397	October 28, 2024
GPT4o doesn't output anything API gpt-4	2	137	April 30, 2025

Gpt-4o is really bad at NER tasks

Related topics