Thanks for the amazing work!
The inconsistency is worrying. For Dutch, a speed factor of 3.1x gives a WER of 98.83%, but 3.3x is 20%. So I suppose the output is basically nonsense at 3.1x, but kind of usable at 3.3x.
I wonder if there would be ways to detect the 80+% WER results, perhaps with an LLM?
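Even without an LLM, a cheap reference-free sanity check might catch some of the worst cases. This is only a rough sketch under my own assumptions: that failed transcripts tend to be either highly repetitive (so they compress unusually well) or have an implausible word rate for the original audio length. The function names and every threshold below are hypothetical and would need tuning against real results.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size; repetitive text scores high."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def looks_suspicious(
    transcript: str,
    original_seconds: float,
    max_ratio: float = 2.4,  # illustrative threshold, not a measured value
    min_wps: float = 0.5,    # illustrative lower bound on words per second
    max_wps: float = 6.0,    # illustrative upper bound on words per second
) -> bool:
    """Flag a transcript that is empty, very repetitive, or oddly paced.

    original_seconds is the duration of the audio before any speed-up.
    """
    if original_seconds <= 0:
        raise ValueError("original_seconds must be positive")
    words = transcript.split()
    if not words:
        return True
    words_per_second = len(words) / original_seconds
    too_repetitive = compression_ratio(transcript) > max_ratio
    odd_pace = not (min_wps <= words_per_second <= max_wps)
    return too_repetitive or odd_pace


if __name__ == "__main__":
    looping = "de " * 200
    normal = "Dit is een gewone zin met genoeg verschillende woorden om normaal te lijken."
    print(looks_suspicious(looping, original_seconds=60.0))
    print(looks_suspicious(normal, original_seconds=5.0))
```

This would not catch fluent but wrong output, which is where an LLM-based plausibility check could still help.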