Chatty translated the Ruby code above to Python, as follows. Seems plausible, but I did not test it.
Chatty (ChatGPT) Translation
- Check Code Completely and Use with Caution - Not Human Validated
import re

def validate_jsonl(fine_tune_data, validate_api=True, prompt_separator="PROMPT_SEPARATOR", completion_stop="STOPSTOP"):
    if not fine_tune_data:
        return False
    output = []
    if validate_api:
        # Concatenate instead of using str.format(): format() would treat the
        # literal braces in the regex as replacement fields. re.escape() keeps
        # any special characters in the separator/stop strings literal.
        regex_to_validate = re.compile(
            r'^\{"prompt":\s*"([^"]+)\s*' + re.escape(prompt_separator)
            + r'",\s*"completion":\s*"\s([^"]+)\s*' + re.escape(completion_stop)
            + r'"\s*\}$')
    else:
        regex_to_validate = re.compile(r'^\{"prompt":\s*"([^"]+)",\s*"completion":\s*"([^"]+)"\s*\}$')
    # splitlines() handles both "\n" and "\r\n" line endings
    for count, line in enumerate(fine_tune_data.splitlines(), start=1):
        validated_line = bool(regex_to_validate.match(line))
        output.append({'count': count, 'valid': validated_line, 'line': line})
    return output
Chatty said:
Note: Python uses `not` instead of `!` and `True` instead of `true`. Also, in Python the `each` method in Ruby is replaced by a `for` loop. In this example, the `each` method is used to iterate over the lines in the `fine_tune_data` variable, and is replaced by a `for` loop in Python. And in Python, use `re.compile()` instead of `Regexp.new()` to create a regular expression.
Hope this helps Python users.
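One caveat with the regex approach: a pattern like `[^"]+` rejects prompts or completions that contain escaped quotes, even though they are valid JSON. A minimal sketch of an alternative is below. It swaps the regex for Python's standard `json` module and then checks the same conventions: the prompt ends with the separator, and the completion starts with whitespace and ends with the stop sequence. The function name `check_jsonl_line` is hypothetical, it is not part of the translation above, and unlike the regex it does not reject extra keys.

```python
import json

def check_jsonl_line(line, prompt_separator="PROMPT_SEPARATOR", completion_stop="STOPSTOP"):
    """Return True if line is a JSON object with the expected prompt/completion shape."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    prompt = record.get("prompt")
    completion = record.get("completion")
    if not isinstance(prompt, str) or not isinstance(completion, str):
        return False
    # Assumed conventions, mirroring the regex: prompt ends with the separator,
    # completion starts with whitespace and ends with the stop sequence.
    return (
        prompt.endswith(prompt_separator)
        and completion[:1].isspace()
        and completion.endswith(completion_stop)
    )

sample = (
    '{"prompt": "Say hi PROMPT_SEPARATOR", "completion": " Hello STOPSTOP"}\n'
    '{"prompt": "No separator", "completion": " Hello STOPSTOP"}\n'
    'not json at all'
)
results = [check_jsonl_line(line) for line in sample.splitlines()]
print(results)
```

Only the first sample line satisfies all three checks; the second lacks the separator and the third is not JSON.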