Awful results with fine-tuning (legal docs)

I definitely didn’t consider translating Spanish to English, mostly because my texts are in Portuguese :sweat_smile:
On a more serious note, considering these are legal docs with lots of important and strict vocabulary, I wouldn’t risk messing it up with translations.

True, I put the examples into Google Translate and it detected Spanish, lol. Strict vocabulary should actually make translation easier, not harder, as there is less ambiguity. You can easily experiment and compare performance; how much it matters depends on the kind of inference you’re performing.

Hi @nunodonato, did you manage to solve this issue with fine-tuning on Portuguese IRS legal texts?

I started a topic that might interest you: “Is fine-tuning the way to go to generate legal opinions (law technical reports)?”

If fine-tuning didn’t work well, one possible workaround, which would also avoid paying a lot for loading chat history (your legal dataset) through the OpenAI API, would be to drive the ChatGPT web app through Selenium, for instance.

Fine-tuning is not the solution for this. It is better to use embeddings search: embed your documents, find the chunks most similar to the question, and add the best results to the prompt.
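
To make the idea concrete, here is a rough sketch of the retrieve-then-prompt flow. Everything in it is an assumption for illustration: `toy_embed` is a hypothetical stand-in (plain word counts) for a real embedding model, the excerpts are made-up placeholders rather than real legal text, and the prompt wording is just one possible choice. In practice you would swap `toy_embed` for calls to whatever embedding model you use and store the vectors instead of recomputing them.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    # Hypothetical stand-in for a real embedding model: a bag-of-words
    # count vector. Good enough to show the flow, not for production.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, chunks, k=2, embed=toy_embed):
    # Rank document chunks by similarity to the question, keep the best k.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2, embed=toy_embed):
    # Put only the most relevant excerpts into the prompt.
    context = "\n\n".join(top_k(question, chunks, k, embed))
    return (
        "Answer using only the excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up placeholder excerpts, not real legal text.
chunks = [
    "Placeholder excerpt A: deadlines for filing the annual income return.",
    "Placeholder excerpt B: how rental income is taxed and which deductions apply.",
    "Placeholder excerpt C: parking fines handled by the municipality.",
]
question = "How is rental income taxed?"
prompt = build_prompt(question, chunks, k=1)
print(prompt)
```

The point of the design is that only the few excerpts relevant to each question go into the prompt, so you pay for a short context per request rather than for the whole dataset, and the model sees the original wording of the legal text instead of having to memorize it.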