Fine-tuning model on new coding language

vp12 · June 22, 2023, 8:10pm

I am working on a project where I am attempting to fine-tune an OpenAI model on a language that is not widely known, though it is similar to SQL. The goal is to go from a natural-language query to accurate code. I have trained many different models on samples, docs, and code blocks, but with no success. If anyone has any suggestions, please let me know.

Foxalabs · June 22, 2023, 9:08pm

Fine-tuning shows the model new ways of working, new ways to “think”. It is not well suited to adding new data. You can try embedding the language documentation and then running an embedding retrieval on your prompt to pull in the context needed to solve the prompt's request.
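
As a rough sketch of that retrieval idea: the snippet below ranks documentation chunks by similarity to the user's request and pastes the best matches into the prompt. Everything in it is an assumption made for illustration. The `FETCH`/`TALLY` syntax is invented, the function names are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model, so treat it as the shape of the approach rather than a working setup.

```python
import math
import re
from collections import Counter

# Hypothetical documentation snippets for the niche language (invented for this sketch).
DOC_CHUNKS = [
    "FETCH <columns> FROM <table> returns the listed columns from a table.",
    "WHERE <condition> filters rows; combine conditions with AND / OR.",
    "TALLY(<column>) counts the non-empty values in a column.",
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Return the k documentation chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, chunks, k=2):
    """Put the retrieved documentation in front of the user's request."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Language reference:\n{context}\n\nWrite a query for: {question}"

print(build_prompt("count the values in the price column", DOC_CHUNKS))
```

With a real embedding model in place of `embed`, the resulting prompt would then be sent to the chat or completion model, so the model sees the relevant syntax at request time instead of having to memorize it through fine-tuning.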

Adding new code-related information to an LLM is still a technical challenge and one being actively worked on.

I hope this gives you at least a few directions to look into.

michael.simpson555 · June 23, 2023, 4:10am

Here's a fun script to do the same thing, no AI required. You just have to write the syntax and guidelines for the code. It's basic and still needs work.

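# Letter-to-glyph table. The glyphs are taken from the Unicode Runic block.
# Note that 'C' and 'K' map to the same glyph.
theban_alphabet = {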
    'A': 'ᚨ', 'B': 'ᛒ', 'C': 'ᚲ', 'D': 'ᛞ', 'E': 'ᛖ', 'F': 'ᚠ', 'G': 'ᚷ', 'H': 'ᚺ',
    'I': 'ᛇ', 'J': 'ᛃ', 'K': 'ᚲ', 'L': 'ᛚ', 'M': 'ᛗ', 'N': 'ᚾ', 'O': 'ᚩ', 'P': 'ᛈ',
    'Q': 'ᛩ', 'R': 'ᚱ', 'S': 'ᛋ', 'T': 'ᛏ', 'U': 'ᚢ', 'V': 'ᚡ', 'W': 'ᚹ', 'X': 'ᛪ',
    'Y': 'ᛦ', 'Z': 'ᛎ', '.': '.'
}

# Reverse table, derived from theban_alphabet so the two cannot drift apart.
# 'C' and 'K' share the glyph 'ᚲ', so decoding is ambiguous: the later entry ('K') wins.
theban_to_english = {glyph: letter for letter, glyph in theban_alphabet.items()}

# Identity mapping for plain English; not used by the code below.
english_alphabet = {
    'A': 'A', 'B': 'B', 'C': 'C', 'D': 'D', 'E': 'E', 'F': 'F', 'G': 'G', 'H': 'H',
    'I': 'I', 'J': 'J', 'K': 'K', 'L': 'L', 'M': 'M', 'N': 'N', 'O': 'O', 'P': 'P',
    'Q': 'Q', 'R': 'R', 'S': 'S', 'T': 'T', 'U': 'U', 'V': 'V', 'W': 'W', 'X': 'X',
    'Y': 'Y', 'Z': 'Z', '.': '.'
}

def theban_translate(text, to_theban=True):
    """Translate text character by character; unmapped characters pass through unchanged."""
    translation_dict = theban_alphabet if to_theban else theban_to_english
    # Input is upper-cased first, so the result is always upper case.
    return ''.join(translation_dict.get(char, char) for char in text.upper())

while True:
    print("Translation Options:")
    print("1. Translate English to Theban")
    print("2. Translate Theban to English")
    print("3. End the program")
    choice = input("Enter your choice (1, 2, or 3): ")

    if choice == "1":
        text_to_translate = input("Enter the text to translate from English to Theban: ")
        translated_text = theban_translate(text_to_translate, to_theban=True)
        print("English Text:", text_to_translate)
        print("Theban Translation:", translated_text)
        print()
    elif choice == "2":
        text_to_translate = input("Enter the text to translate from Theban to English: ")
        translated_text = theban_translate(text_to_translate, to_theban=False)
        print("Theban Text:", text_to_translate)
        print("English Translation:", translated_text)
        print()
    elif choice == "3":
        print("Ending the program...")
        break
    else:
        print("Invalid choice! Please try again.")
        print()

print("Program ended.")
