When we build a multi-round conversation dataset, we need to paste in the Q&A prompts by hand, which is time-consuming. Are there any tools that make this easier?
What do you mean by a multi-round conversation dataset?
A dataset for fine-tuning, like:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
Got it, thanks. I wonder whether you could use Excel (or LibreOffice): just enter the conversation and have it build the correct data format?
(It would make sense for that tool to already exist.)
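For what it's worth, here is a minimal sketch of the spreadsheet idea in Python, assuming the sheet is exported as CSV. The column names `conversation_id`, `role`, and `content` and the function name `rows_to_jsonl` are made up for illustration, not part of any existing tool. It groups adjacent rows that share a conversation ID and writes one JSON object per conversation, in the `messages` format shown above.

```python
import csv
import io
import json
from itertools import groupby


def rows_to_jsonl(csv_text):
    """Convert CSV rows (conversation_id, role, content) into JSONL chat records.

    The column names are an assumption for this sketch. Rows belonging to the
    same conversation must be adjacent, because groupby only merges neighbours.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for _, rows in groupby(reader, key=lambda r: r["conversation_id"]):
        messages = [{"role": r["role"], "content": r["content"]} for r in rows]
        lines.append(json.dumps({"messages": messages}, ensure_ascii=False))
    return "\n".join(lines)


# Hypothetical spreadsheet export: one row per message.
sample_csv = """conversation_id,role,content
1,system,Marv is a factual chatbot that is also sarcastic.
1,user,What's the capital of France?
1,assistant,"Paris, as if everyone doesn't know that already."
"""

jsonl = rows_to_jsonl(sample_csv)
print(jsonl)
```

Each extra turn in a conversation is just another row with the same `conversation_id`, so multi-round dialogues need no special handling. For a real file you would read the text from disk and write `jsonl` back out to a `.jsonl` file.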
I had the same idea as you, but there isn't one. Maybe everyone thinks that way.
There are various labelling tools. I'm not sure how suitable they are for this particular use case, but I know that Label Studio has support for creating Q&A datasets (disclaimer: I haven't used it myself for this).
I use datasetai.com.
You can test fine-tunes right there.