Wanted to know if LLMs have blind spots while doing model-graded evaluation

For model-graded evaluation, can I use the same model to evaluate the response it generated? For example, if GPT-4 generated a response, can I use that same model to evaluate it?

Does the model have blind spots?


Yes!

Some might refer to this as a component of the “Reflection” pattern, where the model considers its own hypothetical output before coming to a conclusion.

The models do have blind spots. Here's what I've observed, off the top of my head:

  1. The models seem to have been trained not to contradict themselves. Mitigation: present the conversation or the model's output as user input, or as third-party text (see the sketch after this list).

  2. The models sometimes don’t see their own last outputs. Mitigation: same as 1.

  3. The models have limited attention. This is a more complex topic, but it can be mitigated almost entirely by keeping your context and responses short.
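
To make mitigation 1 concrete, here's a minimal sketch of how grading can be framed so the candidate answer is presented as third-party text rather than as the model's own prior turn. It assumes the current OpenAI Python client; the model name, rubric wording, and the `grade_response` helper are illustrative, not a prescribed setup.

```python
# Minimal sketch: model-graded evaluation where the output being judged is
# framed as text written by a third party, so the grader isn't being asked
# to contradict "itself".
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def grade_response(question: str, candidate_answer: str) -> str:
    """Grade an answer that is presented as third-party text."""
    grader_prompt = (
        "You are grading an answer written by a third party.\n\n"
        f"Question:\n{question}\n\n"
        f"Candidate answer:\n{candidate_answer}\n\n"
        "Rate the answer's factual accuracy and completeness on a 1-5 scale, "
        "then give a one-sentence justification."
    )
    result = client.chat.completions.create(
        model="gpt-4o",  # assumption: any capable grader model
        messages=[{"role": "user", "content": grader_prompt}],
        temperature=0,  # keep the grading as repeatable as possible
    )
    return result.choices[0].message.content

# Usage: generate the answer with one call, then grade that output here
# as if it came from someone else, even if the same model produced it.
# print(grade_response("What causes tides?", answer))
```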


But yeah, it's definitely possible, and most advanced systems that I know of use some form of this.
