RAG with embedded retrieval + fine-tuning, or a retrieval Assistant + fine-tuning: which would be better for building a GPT to help lawyers draft arguments?

Generally, fine-tuning teaches the model to reply in a specific style or format. It won’t be sufficient to teach the model genuinely new content — that’s what retrieval is for.

That aside, what you’ve described for the fine-tuning examples and the retrieval approach both sound sensible.
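To make the retrieval side concrete, here is a minimal sketch of the core step in embedding-based retrieval: rank documents by cosine similarity between their vectors and a query vector. The tiny 4-dimensional vectors and document names are invented stand-ins; in practice you’d get ~1500-dimensional vectors from an embedding model and store them in a vector database.

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" standing in for vectors produced by an
# embedding model over your legal documents (hypothetical names/values).
docs = {
    "breach of contract precedent": np.array([0.9, 0.1, 0.0, 0.1]),
    "negligence case summary":      np.array([0.1, 0.8, 0.2, 0.0]),
    "statute of limitations note":  np.array([0.0, 0.2, 0.9, 0.1]),
}

def retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(docs.items(),
                    key=lambda kv: cosine_sim(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Pretend embedding of a query like "contract dispute".
query = np.array([0.85, 0.15, 0.05, 0.1])
print(retrieve(query, k=1))  # → ['breach of contract precedent']
```

The retrieved passages would then be placed into the prompt of your (optionally fine-tuned) model, so retrieval supplies the legal content while fine-tuning shapes the drafting style.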

You could prototype with the Assistants API first, though note that you won’t be able to use your fine-tuned model with it.