Hello everyone,
I want to raise an issue I’ve run into with the GPT-3.5-turbo models: GPT-3.5-turbo-0125 and GPT-3.5-turbo-1106 behave quite differently on Chinese language tasks. Running both models on the same prompts, I noticed a significant gap in output quality, particularly in keyword extraction from Chinese content.
In my tests on around 100 text samples, GPT-3.5-turbo-0125 consistently performed worse than GPT-3.5-turbo-1106 at extracting keywords from Chinese content. This raises concerns about the reliability and suitability of GPT-3.5-turbo-0125 for Chinese language processing tasks.
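One way to put a number on this kind of comparison is to score each model's extracted keywords against a hand-labelled reference list and average over the samples. The sketch below assumes a set-based F1 metric with exact string matching; the function names and the tiny placeholder data are hypothetical and only illustrate the calculation. They are not the procedure or the results from the tests described above.

```python
from typing import Iterable, Sequence


def normalize(keywords: Iterable[str]) -> set[str]:
    """Strip surrounding whitespace and drop empty strings before comparing."""
    return {k.strip() for k in keywords if k.strip()}


def f1_score(predicted: Iterable[str], reference: Iterable[str]) -> float:
    """Set-based F1 between predicted and reference keywords for one sample."""
    pred, ref = normalize(predicted), normalize(reference)
    overlap = len(pred & ref)
    if overlap == 0:
        # Covers empty predictions, empty references, and no shared keywords.
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_f1(
    predictions: Sequence[Iterable[str]],
    references: Sequence[Iterable[str]],
) -> float:
    """Macro average: mean of the per-sample F1 scores."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    scores = [f1_score(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


# Placeholder data for illustration only (hypothetical, not real model output).
references = [["人工智能", "机器学习"], ["气候变化", "碳排放"]]
outputs_a = [["人工智能", "机器学习"], ["气候变化", "碳排放"]]
outputs_b = [["人工智能", "技术"], ["气候变化"]]

print("model A mean F1:", round(mean_f1(outputs_a, references), 3))
print("model B mean F1:", round(mean_f1(outputs_b, references), 3))
```

Exact matching is strict for Chinese, where a model may return a longer or shorter phrase for the same concept, so a real comparison might want a looser match (for example, substring overlap) alongside it.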
I’m curious to know if anyone else has encountered similar issues or if there are any insights or suggestions on how to address this discrepancy. Your input would be greatly appreciated.
Looking forward to hearing from you all.