Hello everyone!
I was playing around with fine-tuning and had a task where I was teaching a model, using the now-deprecated ADA, to return an API address when given an EAN-12/13 barcode. Here are a few example training lines (JSONL):
{"prompt": "8713747518917", "completion": "/api/art?bar=8713747518917&"}
{"prompt": "311601174863", "completion": "/api/art?bar=311601174863&"}
{"prompt": "8896890217601", "completion": "/api/art?bar=8896890217601&"}
{"prompt": "855395958448", "completion": "/api/art?bar=855395958448&"}
With a simple dataset of 100 training examples and 20 validation examples, ADA learned the pattern perfectly across multiple fine-tunings. However, I've been testing the new babbage-002 and the results are not favorable; it has a hard time picking up the structure, and it specifically struggles with reproducing the correct EAN number in the completion.
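For reference, this is roughly how I'm kicking off the babbage-002 job. This is only a sketch using the current OpenAI Python SDK; the file names are placeholders and hyperparameters are left at their defaults:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training and validation JSONL files (placeholder names).
train_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
val_file = client.files.create(file=open("validation.jsonl", "rb"), purpose="fine-tune")

# Create the fine-tuning job on babbage-002 with default hyperparameters.
job = client.fine_tuning.jobs.create(
    model="babbage-002",
    training_file=train_file.id,
    validation_file=val_file.id,
)
print(job.id, job.status)
```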
Has anyone else experienced this? In particular, has anyone run into similar issues with number generation?
Just wanted to share my thoughts, and I’ll keep trying other approaches.