How good is ChatGPT3.5 / GPT4 translations?!

Guys if you’ve tried extensively, how good is GPT models for translation?!

I’m building an app where I need to implement translation feature.

I’m in between of using Google Translate - DeepL - OpenAI.

I’m inclined towards DeepL but since I’m already using OpenAI for text generation I’m wondering if it’s better to translate directly within OpenAI call.

What do you think?

1 Like

GPT for translation is the best option short of actual bilingual human translators and can often perform better if the translator is not fluent in technical terms used in the text.

Absolutely worth doing all 4 options (your suggested 3 + human) on a test work and evaluate the performance for accuracy of content, intent and fluency.


i’d do that but testing so many languages would be a little pain in the back.

I’ve worked on translation apps using gpt, it’s phenomenal. Best quality for sure. But you will have to deal with the latency and performance. Also, depends on the scenario, DeepL is also good enough for most cases.

Why ‘good enough’? what’s wrong with DeepL?

Test one you are fluent in and perhaps another for someone you trust, get a baseline then you can roughly approximate performance from those and the approximate size of that languages contribution to the internet… not a perfect relationship, but useful.

1 Like

contextual awareness, simple as that, deepl just has not got the inference ability of an LLM, not the same beast, so you get mistakes where context is nuanced, complex or just plain whacky as some languages can be.

1 Like

Our scenario is translating text from OCR, which has some typos and broken words, DeepL can’t help much in this situation and the LLM nails it.


Thanks for suggestion but that’s not the best solution. Generally, Google Translate is good with major languages. But when it comes to smaller ones it sucks beyond comprehension :sweat_smile:

ohhh, nice use case! Will keep this in mind✌️

1 Like

That really is a great use-case, the models have read a looooot of OCR text and have modelled the common failure modes so well, it’s spooky.

1 Like

If you do two iterations for each translation (using a GPT), you’ll get nearly perfect results.

  • First interaction with the model: translate this text
  • Second interaction: refine the text to achieve the natural flow of a native speaker’s writing

Spooky? “AI task: Improve quality”

Indeed, that’s an excellent use-case. The models have been trained on a vast amount of OCR text and have effectively learned to understand and correct common errors. It’s impressive how well they can handle such tasks.

Hi, A lot depends on which language pairs are involved. For high-resource languages you can get translations on a par with Google Translate & DeepL. For low-resource languages some results are very poor. You can check all this out with our ChatGPT Translator Plus app from Shop - MyDutchPal.

Hi, We’re running GPT4 translations on english, german, french, spanish, italian, portugese, on a daily basis, and I can assure you it’s the best available option hands down. Also we tried it successfully on russian, japanese, and chinese. GPT4 is our choice to answer our guests from all around the world. And if you work on your prompts thoroughly, there will be no need for second iterations at all.


It also depends on what you are translating for example a newspaper article vs a government document. I live in Montreal, Quebec, Canada and I use the translation frequently from French to English. Local language laws have recently made English no longer legal in the workplace, therefore as an anglophone I am struggling to do my job. I mostly use DeepL for documents and Google Translate on web pages. Neither is perfect and I have to review and make corrections. But still exponentially faster and more accurate than me doing it 100% myself.

Sometimes I find that due to perhaps local dialect, that very peculiar words in English pop up in the translation from French. Words that are so seldom used, you need to look them up in a dictionary because nobody has used that word in English since couple of hundred years ago.

Also, if you translate a newspaper article, you sometimes get local expressions in French translated directly. They have a different way of allegorical thinking from us and we just would not say it in the same way. You really need to understand the abstract concept of what the original author was trying to say and fix it. also has a blog post about that. They tested gpt-4 against google translator. They found that Gpt-4 is better in translating websites because it understands the context of the translations.

Hey, thanks for your comment.
I am now doing a PoC to pick a language provider. I am looking at AWS Translate, DeepL and OpenAI is on the way too.
What I have learnt is that we first need to understand the use case, the type of contents to translate from and to and also the languages involved.
I particulary need it for translating legal information (legislation, jurisprudence, administrative doctrine, etc.) and also non-legal content as websites, manuals, marketing material, etc.
Price is also important
PDF and other file types handling is important for me.

How do you use ChatGPT for translating? have you built an API to invoke it?
How much content do you translate a month? (we have loads…)
Is it expensive?
Do you have to translate it twice?

Sorry about the bombing!


I have been using DeepL and GPT-4 for translation. In terms of accuracy, DeepL is better than GPT-4, but GPT-4 is sometimes better from the readability point of view.
Having a two-step process with DeepL (pure translation) + GPT-4 (improving fluency) is working well. However, there are some problems regarding word selection, and it does not use the most natural ones, probably because it is very focused on literal translation. You can give more freedom to the model to explore more, but then the accuracy is going to be lower, and in my case, accuracy is key.

I was wondering if someone has tried to fine-tune or use some examples (context learning) to improve the quality of the translations.

When translating websites, we figured out that GPT-4 is superior to Google translator or DeepL. GPT-4 is better because we translate a lot of small snippets. And we need to pass the context of the text, which only works with gpt-4. Here is our full analysis: