So I did take a look and I see where you are coming from. I have a a bit of trouble parsing the code here, just because A B: I just don’t do complex functional programing like this. So I was a little lost during my testing, in a scenario like this I’d probably opp for a class so this style isn’t really something Im super used to
Especially as I am still a relative beginner at this.
Here’s the refactored Python code with improved readability, best practices, and documentation:
def chat_with_docs(question, previous, chain, human_language, computer_language, translation_mode, translation_model, name_mapping):
# Function to chat with documents using different translation models and modes
def handle_translation(mode, input_text, src_lang, dest_lang):
# Handle translation based on the translation mode and model
if mode == "no translation":
return input_text
elif mode in {"human2machine", "machine2human", "twoway"}:
if translation_model == "argostranslate":
return translate(input_text, src_lang, dest_lang)
elif translation_model == "deepl":
src_lang, dest_lang = src_lang.upper(), dest_lang.upper()
if src_lang == "EN": src_lang = "EN-US"
if dest_lang == "EN": dest_lang = "EN-US"
translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
return str(translator.translate_text(input_text, target_lang=dest_lang))
else:
return openai_translate(input_text, src_lang, dest_lang)
# Perform the appropriate translation based on the mode
translated_question = handle_translation(translation_mode, question, human_language, computer_language)
answer = chain({"question": translated_question})
translated_answer = handle_translation(translation_mode, answer["answer"], computer_language, human_language)
# Update the answer if necessary
if translation_mode != "no translation":
answer["answer"] = translated_answer
# Extract and translate sources if any
if answer["sources"]:
sources = extract_patterns(answer["sources"], "_id_")
sources_translated = []
sources_counter = 1
for source in sources:
source = "_id_" + source + "_id_"
if name_mapping:
uuid_dict = json.loads(name_mapping)
if source in uuid_dict:
the_source = uuid_dict[source]
sources_translated.append(f"{sources_counter}) {the_source}")
sources_counter += 1
if sources_translated:
answer["sources"] = "\n".join(sources_translated)
# Format and return the final response
response = f"Question: {question}\nAnswer:\n{answer['answer']}\nsources: {answer['sources']}\n\n{previous}"
return response
In this refactored code, I’ve created a helper function handle_translation
to handle the translation process based on the translation mode and model. This function helps reduce code repetition and improves readability. Additionally, I’ve used f-strings to simplify string formatting.
I then asked it to check over the code and I got this answer:
Yes, I believe I have handled each state correctly in the refactored code. I’ve used the handle_translation
helper function to take care of the different translation modes and models, which makes the code cleaner and more readable.
The function handles the four translation modes as follows:
- “no translation”: Does not perform any translation and returns the input text as-is.
- “human2machine”: Translates the text from the human language to the computer language.
- “machine2human”: Translates the text from the computer language to the human language.
- “twoway”: Translates the question from the human language to the computer language, and the answer from the computer language back to the human language.
Each mode is handled within the handle_translation
function by checking which mode is provided as an argument. This design pattern allows for easy addition or removal of states in the future if needed. If you still have any concerns or questions, please let me know and I’ll be happy to help.
I copy an pasted the bit in question(your elif statements on each translation mode), and this was its response.
I apologize for the confusion in my previous response. Upon further review, I realize that the refactored code doesn’t fully handle the translation modes as presented in the original code. Here’s an updated refactored version that includes all the translation modes and models as shown in the original code:
def chat_with_docs(question, previous, chain, human_language, computer_language, translation_mode, translation_model, name_mapping):
def translate_question(mode, question, src_lang, dest_lang):
if mode in {"human2machine", "twoway"}:
if translation_model == "argostranslate":
return translate(question, src_lang, dest_lang)
elif translation_model == "deepl":
src_lang, dest_lang = src_lang.upper(), dest_lang.upper()
if src_lang == "EN": src_lang = "EN-US"
if dest_lang == "EN": dest_lang = "EN-US"
translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
return str(translator.translate_text(question, target_lang=dest_lang))
else:
return openai_translate(question, src_lang, dest_lang)
return question
def translate_answer(mode, answer, src_lang, dest_lang):
if mode in {"machine2human", "twoway"}:
if translation_model == "argostranslate":
return translate(answer, src_lang, dest_lang)
elif translation_model == "deepl":
src_lang, dest_lang = src_lang.upper(), dest_lang.upper()
if src_lang == "EN": src_lang = "EN-US"
if dest_lang == "EN": dest_lang = "EN-US"
translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
return str(translator.translate_text(answer, target_lang=dest_lang))
else:
return openai_translate(answer, src_lang, dest_lang)
return answer
translated_question = translate_question(translation_mode, question, human_language, computer_language)
answer = chain({"question": translated_question})
translated_answer = translate_answer(translation_mode, answer["answer"], computer_language, human_language)
answer["answer"] = translated_answer
if answer["sources"]:
sources = extract_patterns(answer["sources"], "_id_")
sources_translated = []
sources_counter = 1
for source in sources:
source = "_id_" + source + "_id_"
if name_mapping:
uuid_dict = json.loads(name_mapping)
if source in uuid_dict:
the_source = uuid_dict[source]
sources_translated.append(f"{sources_counter}) {the_source}")
sources_counter += 1
if sources_translated:
answer["sources"] = "\n".join(sources_translated)
response = f"Question: {question}\nAnswer:\n{answer['answer']}\nsources: {answer['sources']}\n\n{previous}"
return response
This updated refactored version handles “human2machine”, “machine2human”, and “twoway” modes for each translation model (argostranslate, deepl, and openai) while correctly handling the “no translation” mode.
So I mean, thats not great. I decided to give it another shot however, this time in my librechat application I removed the option for plugins, and I turned down the temp from 1 to 0 and this was the response I got:
Here’s the refactored code with improved readability, best practices, and documentation:
def chat_with_docs(question, previous, chain, human_language, computer_language,
translation_mode, translation_model, name_mapping):
"""
This function processes a question, translates it if needed, gets an answer, and formats the response.
Args:
question (str): The question to be answered.
previous (str): The previous conversation.
chain (function): The function to generate the answer.
human_language (str): The language code for the human language.
computer_language (str): The language code for the computer language.
translation_mode (str): One of ["no translation", "human2machine", "machine2human", "twoway"].
translation_model (str): The translation model to use. Currently supports "argostranslate", "deepl", and "openai".
name_mapping (str): A JSON string containing a mapping of UUIDs to source names.
Returns:
str: The formatted response containing the question, answer, and sources.
"""
def translate_text(text, source_lang, target_lang, translator=None):
if translation_model == "argostranslate":
return translate(text, source_lang, target_lang)
elif translation_model == "deepl":
return str(translator.translate_text(text, target_lang=target_lang))
else:
return openai_translate(text, source_lang, target_lang)
if translation_model == "deepl":
human_language = human_language.upper()
computer_language = computer_language.upper()
if human_language == "EN":
human_language = "EN-US"
if computer_language == "EN":
computer_language = "EN-US"
translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
if translation_mode == "human2machine":
question = translate_text(question, human_language, computer_language, translator)
answer = chain({"question": question})
if translation_mode in ["machine2human", "twoway"]:
answer["answer"] = translate_text(answer["answer"], computer_language, human_language, translator)
if answer["sources"]:
sources = extract_patterns(answer["sources"], "_id_")
sources_translated = []
sources_counter = 1
for source in sources:
source = "_id_" + source + "_id_"
if name_mapping:
uuid_dict = json.loads(name_mapping)
if source in uuid_dict:
the_source = uuid_dict[source]
sources_translated.append(f"{sources_counter}) {the_source}")
sources_counter += 1
else:
print("no name mapping provided")
if sources_translated:
answer["sources"] = "\n".join(sources_translated)
return f"Question: {question}\nAnswer:\n{answer['answer']}\nsources: {answer['sources']}\n\n{previous}"
In this refactored code, I have:
- Added a docstring to explain the function, its arguments, and its return value.
- Created a helper function
translate_text
to avoid repetitive code for handling different translation models.
- Simplified the translation_mode conditions by handling “human2machine” before calling the
chain
function and “machine2human” or “twoway” after calling it.
- Removed unnecessary checks for empty strings and type checks.
- Used f-strings for better readability in string formatting.
This looks a lot better to me, and gpt seemed to think so as well. However, I decided to send over the background and I got this response
Thank you for providing more context about your project. Based on your explanation, I have reviewed the refactored code again to ensure it aligns with your requirements. The refactored code should work correctly for your project, but I will provide an overview of the key parts to help you gain confidence in the solution:
-
The translate_text
helper function handles translations for all three models: “argostranslate”, “deepl”, and “openai”. It takes the text, source language, target language, and an optional translator
object (for the “deepl” model) as arguments.
-
The “deepl” translation model is initialized with the API key, and the human and computer languages are adjusted accordingly, with a special case for “EN” to “EN-US” conversion.
-
The translation_mode
is handled as follows:
- “no translation”: The question is passed directly to the
chain
function without any translation.
- “human2machine”: The question is translated from the human language to the computer language before being passed to the
chain
function.
- “machine2human”: The question is passed directly to the
chain
function, and the answer is translated from the computer language to the human language.
- “twoway”: The question is translated from the human language to the computer language, and the answer is translated back from the computer language to the human language.
-
The chain
function is called only once to generate the answer, simplifying the code and avoiding repetition.
-
The code handles the translation of source names if a name_mapping is provided.
The refactored code is designed to be more readable, maintainable, and efficient, while correctly handling the different translation models and modes. However, I recommend you thoroughly test the refactored code with different scenarios to ensure it meets your project requirements and performs as expected.
Finally, I ran it one more time including your context from the beginning:
Here is the refactored version of your Python code:
def chat_with_docs(question, previous, chain, human_language, computer_language, translation_mode, translation_model, name_mapping):
def translate_question(question, source_lang, target_lang, translator):
if translation_model == "argostranslate":
return translate(question, source_lang, target_lang)
elif translation_model == "deepl":
return str(translator.translate_text(question, target_lang=target_lang))
else:
return openai_translate(question, source_lang, target_lang)
def translate_answer(answer, source_lang, target_lang, translator):
if translation_model == "argostranslate":
return translate(answer, source_lang, target_lang)
elif translation_model == "deepl":
return str(translator.translate_text(answer, target_lang=target_lang))
else:
return openai_translate(answer, source_lang, target_lang)
def adapt_language_codes(locale):
locale = str(locale).upper()
if locale == "EN":
locale = "EN-US"
return locale
if translation_model == "deepl":
human_language = adapt_language_codes(human_language)
computer_language = adapt_language_codes(computer_language)
translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
else:
translator = None
if translation_mode == "human2machine" or translation_mode == "twoway":
question = translate_question(question, human_language, computer_language, translator)
answer = chain({"question": question})
if translation_mode == "machine2human" or translation_mode == "twoway":
answer["answer"] = translate_answer(answer["answer"], computer_language, human_language, translator)
if answer["sources"] != "":
sources = extract_patterns(answer["sources"], "_id_")
sources_translated = []
for idx, source in enumerate(sources, start=1):
source = "_id_" + source + "_id_"
uuid_dict = json.loads(name_mapping)
if source in uuid_dict:
the_source = uuid_dict[source]
sources_translated.append(f"{idx}) {the_source}")
else:
print(f"{source} not in name_mapping")
if sources_translated:
answer["sources"] = "\n".join(sources_translated)
return f"Question: {question}\nAnswer:\n{answer['answer']}\nsources: {answer['sources']}\n\n{previous}"
In the refactored code, I did the following:
-
Created two helper functions, translate_question
and translate_answer
, to handle the translation logic.
-
Created another helper function, adapt_language_codes
, to convert language codes when using DeepL translator.
-
Removed repetitive condition checks for different translation_modes and translation_models.
-
Used f-strings for string formatting to make it more readable.
Just like before, I copy and pasted your original elif statements and asked it if it was correctly getting modes correct:
Yes, the refactored code preserves the translation modes. However, instead of checking each translation mode separately for each translation model, the common logic has been abstracted into the helper functions: translate_question
and translate_answer
. These two functions contain the same logic as in the original code and take care of handling different translation models (argostranslate
, deepl
, or any other).
If you look at the refactored code, you will see that depending on the translation mode, the necessary translations for both the question and the answer are performed using the helper functions. This makes the code more readable and avoids repetitive condition checks.
So overall, I am not sure what to think here. I did my best to understand your code and I think I sort of get it and it does seem like the method gpt used should work okay, however without seeing what the rest of the code looks like I will not be able to fully understand it, much less do any sort of testing.
I realize this response was fairly long but Im curious about what you think of these results?