Fine-tuning GPT for Direct Translations of an Old Indigenous Language - Seeking Advice

theguaz · May 5, 2023, 9:54am

Hello fellow language enthusiasts and AI experts,

I am currently working on a project to fine-tune a GPT model for direct translations of an old indigenous language, specifically Mapudungun, which is spoken by the Mapuche people in Chile and Argentina. I am seeking advice on the best ways to create a dataset for fine-tuning and whether the available resources are sufficient for training the model.

I came across a dictionary available at the following URL, which provides translations between Mapudungun and Spanish:

My questions are:

Is this dictionary sufficient to create a dataset for fine-tuning the GPT model, or do I need additional resources?
What is the best approach to creating a high-quality dataset for training the model, considering the limited resources available for this language?
Has anyone here worked on a similar project, or seen examples of fine-tuning GPT for direct translations of lesser-known or old indigenous languages? If so, please share your experiences and insights.

I appreciate any suggestions or guidance on this topic, as I believe that preserving and promoting the use of indigenous languages is of great cultural importance. I look forward to hearing from you and learning from your expertise.

Thank you in advance!

neil.johnson · December 2, 2023, 1:44am

Did you get anywhere with this? I’d like to do similar with some languages that have several grammars and dictionaries available.

MachiRuelvillu · May 19, 2024, 5:06am

Mari mari peni, i have been also working on a model of mapuzugun! Chamultay peni!

Topic		Replies	Views
Best practices for a unique translation task: Old Hawaiian Text to Modern Hawaiian Text Community api	3	1062	July 12, 2023
Translating with GPT3 Community	1	3024	February 10, 2023
Teaching Gpt4 a new language (Yiddish) GPT builders chatgpt	5	588	November 18, 2024
Fine-tuning GPT to learn a new coding language Prompting codex , chatgpt , plugin-development , fine-tuning , api	3	3543	December 24, 2023
Creating a Dataset for Translating for Indigenous Languages in Latin America Community	6	737	July 9, 2025

Fine-tuning GPT for Direct Translations of an Old Indigenous Language - Seeking Advice

Related topics