I was testing the translation capability of the gpt-4 model, and its translation of a simple children’s story from English to Sanskrit was inaccurate. The model appears to fail in the following ways:
- Punctuation: The model adds commas where full stops are required in Sanskrit.
- Words: The model sometimes uses the right word for “he” (सः) and at other times uses a wrong term, स (notice the missing :). At a broader level, it appears to be confusing Hindi with Sanskrit and hallucinating words that may not exist in either language.
For example, I translated this sentence (from one of the stories):
"At once he went back to his village and returned with a glass full of milk."
GPT-4 gave me:
सः सद्यः स्वग्राममपक्रम्य पूर्णदूधग्रहणं प्रत्यादायत्।
Google Translate gave me:
सद्यः स्वग्रामं गत्वा क्षीरपूर्णं काचम् आदाय प्रत्यागतवान् ।
Google’s translation is much closer to what I wanted in this case.
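To check the punctuation issue across a whole story rather than by eye, a rough sketch like the one below might help. It only counts marks; it says nothing about whether the words themselves are correct. The function name `punctuation_report` is hypothetical, and the assumption is that sentences should end with the Devanagari danda (।, U+0964).

```python
DANDA = "\u0964"  # Devanagari danda (।), assumed here to be the expected sentence terminator


def punctuation_report(text: str) -> dict:
    """Count sentence-ending marks and commas in a translated passage."""
    stripped = text.strip()
    return {
        "dandas": stripped.count(DANDA),
        "commas": stripped.count(","),
        "latin_full_stops": stripped.count("."),
        "ends_with_danda": stripped.endswith(DANDA),
    }


if __name__ == "__main__":
    # The two outputs quoted above, used purely as sample input.
    gpt4_output = "सः सद्यः स्वग्राममपक्रम्य पूर्णदूधग्रहणं प्रत्यादायत्" + DANDA
    google_output = "सद्यः स्वग्रामं गत्वा क्षीरपूर्णं काचम् आदाय प्रत्यागतवान् " + DANDA
    for name, output in [("gpt-4", gpt4_output), ("google", google_output)]:
        print(name, punctuation_report(output))
```

A high comma count with few dandas over a multi-sentence story would flag the kind of output described in the first bullet.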
Here’s the prompt that I had used:
from openai import OpenAI

client = OpenAI()  # assumed setup: reads OPENAI_API_KEY from the environment

# `story` holds the English text of the story to translate
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are an expert in Sansktit language. You specialize in wirting with simple Sanskrit that is suitable for young children."},
        {"role": "user", "content": f"Can you translate the following story to Sanskrit?\n{story}\n\nReturn just the translated story and no other additional text."},
    ],
)
Could I have prompted differently to make this work better?
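One variation that might be worth trying is sketched below. It is not a known fix, only an illustration under these assumptions: the spelling in the system prompt is corrected, the two failure modes are spelled out as explicit instructions, the Google Translate output quoted above is treated as an acceptable reference and reused as a single worked example, and `temperature=0` is set to reduce run-to-run variation. The helper names `build_messages` and `translate` are hypothetical.

```python
def build_messages(story: str) -> list:
    """Build a chat prompt with explicit style rules and one worked example."""
    system = (
        "You are an expert in the Sanskrit language. You write simple Sanskrit "
        "that is suitable for young children. Use only Sanskrit vocabulary, "
        "never Hindi words. End every sentence with a danda (\u0964), not a comma."
    )
    # Worked example: the sentence from the post, paired with the translation
    # the author preferred (assumed to be an acceptable reference).
    example_en = (
        "At once he went back to his village and returned with a glass full of milk."
    )
    example_sa = "सद्यः स्वग्रामं गत्वा क्षीरपूर्णं काचम् आदाय प्रत्यागतवान् \u0964"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Translate to Sanskrit:\n{example_en}"},
        {"role": "assistant", "content": example_sa},
        {
            "role": "user",
            "content": (
                f"Translate to Sanskrit:\n{story}\n\n"
                "Return just the translated story and no other additional text."
            ),
        },
    ]


def translate(story: str, model: str = "gpt-4") -> str:
    """Send the prompt to the chat completions API and return the translation."""
    from openai import OpenAI  # imported here so build_messages works without the package

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(story),
        temperature=0,  # assumption: less sampling randomness gives steadier output
    )
    return response.choices[0].message.content
```

Calling `translate(story)` with the full story text would return the model's translation as a string; whether the output actually improves would still need to be checked by someone who reads Sanskrit.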
Regards,
Akshay