Why does fine-tuning not work but Assistants do?

Hi and welcome!

I would think that the GPT-3.5 Turbo base model already has the knowledge to do that spelling translation, so fine-tuning it may not have much of an effect.

Fine-tuning is most useful when the model's outputs are inconsistent: by providing fine-tuning examples of the exact behavior you want, you can make its responses for your cases more consistent.
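For illustration, a fine-tuning dataset is a JSONL file where each line demonstrates the exact output you expect for a given input. A made-up spelling-translation sketch (the system prompt and word pairs are just placeholders for your own data) might look like:

```jsonl
{"messages": [{"role": "system", "content": "Convert British spellings to American spellings."}, {"role": "user", "content": "colour"}, {"role": "assistant", "content": "color"}]}
{"messages": [{"role": "system", "content": "Convert British spellings to American spellings."}, {"role": "user", "content": "organise"}, {"role": "assistant", "content": "organize"}]}
```

Each line reinforces the same format, which is what nudges the model toward consistent outputs.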

I am no expert myself, but my guess is that the base model's knowledge is good enough for your use case.