Segmenting training data for fine-tuning

The old fine-tuning method took prompt/completion pairs:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

The new format instead takes a list of messages:

{"messages": [{"role": "system", "content": "Marv is a factual..."}, ...]}

Maybe this is self-evident, but in this latest version, are we supposed to segment the messages ourselves?

What I mean: a completed conversation is one list. In the new format, do I upload that list a single time?

In the old version, I would segment the conversation so that each assistant turn became a completion, with the prompt being everything before that turn.
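To make the question concrete, here is a rough sketch of the segmentation I mean, written against the new message format. The helper name and the prompt serialization are my own invention, not anything from the docs:

```python
import json

def segment_conversation(messages):
    """Old-style segmentation: every assistant turn becomes a completion,
    and the prompt is everything before that turn, flattened to text."""
    pairs = []
    for i, msg in enumerate(messages):
        if msg["role"] == "assistant":
            # Serialize all prior turns into a single prompt string.
            prompt = "\n".join(
                f'{m["role"]}: {m["content"]}' for m in messages[:i]
            )
            pairs.append({"prompt": prompt, "completion": msg["content"]})
    return pairs

conversation = [
    {"role": "system", "content": "Marv is a factual chatbot."},
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Spain?"},
    {"role": "assistant", "content": "Madrid."},
]

# One JSONL line per assistant turn, old-format style.
for pair in segment_conversation(conversation):
    print(json.dumps(pair))
```

A two-assistant-turn conversation like this one would produce two training examples under the old approach.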

In this new version, it's not explicitly clear what is expected. My interpretation is that I should not segment a conversation, but that's me reading between the lines.
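Under that reading, the same multi-turn conversation would be uploaded exactly once, as a single JSONL line (illustrative example, not taken from any docs):

```json
{"messages": [{"role": "system", "content": "Marv is a factual chatbot."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris."}, {"role": "user", "content": "And of Spain?"}, {"role": "assistant", "content": "Madrid."}]}
```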

Does anyone have opinions, docs, or data on this?