Do you actually have the failing audio files logged for analysis, and are they understandable to a listener but still can’t be transcribed?
Here I describe a re-encoding you could try. It also has the effect of recoding at voice-over-IP audio bandwidth, so if there were something like noise shaping in the high-definition audio, it would be stripped.
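As a rough illustration of that kind of re-encode, here is a minimal sketch that shells out to ffmpeg. It assumes ffmpeg is installed and on your PATH. The target format (8 kHz, mono, 16-bit PCM WAV) is my assumption for "VoIP bandwidth" in the narrowband telephony sense; pass 16000 if you want wideband instead. The function names are hypothetical, not part of any library.

```python
import subprocess


def build_voip_reencode_cmd(src, dst, sample_rate=8000):
    """Build an ffmpeg command that downmixes to mono and resamples.

    Resampling to 8 kHz discards everything above 4 kHz (the Nyquist
    limit), so any high-frequency shaped noise in the original is removed.
    The 8000 default is an assumption, not a requirement of any service.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file, any format ffmpeg can read
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,                      # output path, e.g. something.wav
    ]


def reencode_for_transcription(src, dst, sample_rate=8000):
    """Run the re-encode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_voip_reencode_cmd(src, dst, sample_rate), check=True)
```

You would call something like `reencode_for_transcription("failing.mp3", "failing_8k.wav")` on one of the logged files and then submit the result for transcription, to see whether the narrowband copy behaves differently from the original.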