Feature Request: Export current chat messages from Playground to fine-tuning .jsonl file

Hi guys,

It would drastically simplify the fine-tuning workflow if we could do the following in the Playground:

  1. Export all current chat messages as a single line ready for a .jsonl fine-tuning file in the following format:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}

With the ability to exclude the system message (checkbox, system message included by default):

{"messages": [{"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
  2. Export the current chat messages as single user/assistant pairs in the following format:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

With the additional option to exclude system messages from pairs:

{"messages": [{"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}
  3. Export the last user/assistant pair of messages with the option to exclude the system message:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}

And respectively:

{"messages": [{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}

Not sure if that’s hard to implement, but it would definitely help save time when generating fine-tuning samples.
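For illustration, the three export modes listed above can be sketched in a few lines of Python. This is only a rough sketch, not anything the Playground provides: the function names (`export_full`, `export_pairs`, `export_last_pair`) and the `include_system` flag are made up here, and it assumes the chat is already available as a list of role/content dicts.

```python
import json

def split_system(messages):
    """Separate system messages from the user/assistant turns."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    return system, turns

def to_line(messages):
    """Serialize one training example as a single JSONL line."""
    return json.dumps({"messages": messages}, ensure_ascii=False)

def export_full(messages, include_system=True):
    """Mode 1: the whole chat as one line."""
    system, turns = split_system(messages)
    return [to_line((system if include_system else []) + turns)]

def export_pairs(messages, include_system=True):
    """Mode 2: one line per user/assistant pair."""
    system, turns = split_system(messages)
    prefix = system if include_system else []
    lines = []
    # Emit a line whenever a user message is directly followed by an assistant reply.
    for first, second in zip(turns, turns[1:]):
        if first["role"] == "user" and second["role"] == "assistant":
            lines.append(to_line(prefix + [first, second]))
    return lines

def export_last_pair(messages, include_system=True):
    """Mode 3: only the last user/assistant pair (empty list if there is none)."""
    return export_pairs(messages, include_system)[-1:]
```

Each function returns a list of strings, so the result can be written to a file with one string per line. Passing `include_system=False` corresponds to the unchecked checkbox.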

PS: specifying a target file from files UI would be awesome so that you can preconfigure your export options, select a file, or create a new one. Then, once the chat is set up - click a button to add the last pair of messages (or other items from the list above) to your training file. Go through your messages, edit assistant responses as needed, and add data to the file. Once finished, go to fine-tunes and use the file via the native UI…
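The "click a button to add the last pair to your training file" step boils down to appending one line to a .jsonl file. A minimal sketch, assuming a local file rather than the files UI; `append_example` is a hypothetical helper name, not an existing API:

```python
import json
from pathlib import Path

def append_example(path, messages):
    """Append one example as a JSONL line, creating the file if needed.

    Returns the number of lines in the file after the append.
    """
    path = Path(path)
    line = json.dumps({"messages": messages}, ensure_ascii=False)
    # Append mode keeps the earlier examples untouched.
    with path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
    with path.open(encoding="utf-8") as f:
        return sum(1 for _ in f)
```

Calling it once per curated pair builds the training file up incrementally, which matches the "edit the assistant response, then add" loop described above.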


@logankilpatrick Hey Logan, maybe you could help me get some idea of whether this is something potentially “implementable”?

Heh, I got gpt4 to do it for me…

I used code interpreter, built the scripts up slowly testing and making plans along the way, took about 20 minutes from it dawning on me to a working version.


You mean you have working code exporting the chat messages from playground?

The thing is that there are some tools that aim to let you generate training data from chat interactions. In the OpenAI playground the whole base is ready for it, it just needs a minor tweak in the UI, and I don’t get why it’s not done yet because the ROI of this feature is enormous.

That’s not really how the playground is set up as it doesn’t have any state management for conversations. It wouldn’t be too hard, though, off the top of my head you’d have it write a small script to store the messages to mimic that functionality, but you’d have to export after each session and manage them all.

I use my regular chat history, chatgpt and assistants, then put it into a gui tool for curation. The bulk of my conversations that are useful for training are in a gpt, which exports. For me it’s: Export history, load into tool with conversion scripts and gui, curate, export, train. Couldn’t be easier, you could go direct to train without curation if you wanted and that’s just 2 file copy operations. For a raw dump it would be: Download history, run script, upload to training. Hotfolders if it needs to be automated.

@moonlockwood Oh, I see. For me personally, it’s no issue for bigger projects, because I use my own PHP client with full log and debug, so I have the whole history parsable.

I’m talking about a quick and easy way to just jump into a chat playground (not assistants, though that might be worth a try too) and manually create a decent “seed” dataset in 10 min, then feed it into API calls to generate a large amount of training data. Literally, this feature would stop me from looking into Google’s AI Studio and LangSmith. All I want is 20 great samples + gpt4 + API to analyse another 2k samples of data and generate the first cool fine-tune for pretty much any task. If what I asked above is doable, this can be accomplished in under 10-20 min…
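To make the "seed dataset into API calls" idea concrete, here is a rough sketch of turning exported seed lines into a few-shot `messages` list for a new input. The helper names `load_seed` and `build_few_shot` are invented for this example, and the actual API request is deliberately left out; the sketch only builds the list you would send.

```python
import json

def load_seed(lines):
    """Parse JSONL lines into lists of messages, skipping blank lines."""
    return [json.loads(line)["messages"] for line in lines if line.strip()]

def build_few_shot(seed_examples, new_input, system_prompt=None):
    """Build one messages list: optional system prompt, seed turns, new input."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for example in seed_examples:
        # Drop each seed's own system message so the prompt carries only one.
        messages.extend(m for m in example if m["role"] != "system")
    # The item to analyse goes last, as the user turn awaiting a reply.
    messages.append({"role": "user", "content": new_input})
    return messages
```

Looping this over the 2k items and saving each input together with the model's reply as a new JSONL line would produce the larger training set, under the assumption that the seed examples are good enough to steer the output.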

Got you, very similar to what I’m doing. The way I’m using gpts these days, that would be a perfect application. Just literally tell it what you just said, I would not be surprised at all if it just worked.

One thing that really does surprise me about the machine learning revolution is that I sit around thinking “hmm, what about this? Would that work? Or that… what about that?” And if you just type it into gpt4 it not only answers the question, it often goes ahead and does it. Like my mind can’t imagine that it could do it so I don’t bother asking. Slowly breaking out of that and remembering to just ignore that it seems absurd and type it in. Regularly amazed at what gpt4 can do.

(Its fine-tuning collaboration has been insane. I explained what I wanted to do and we built a huge, perfectly tailored synthetic dataset all in my voice. Told it the setup and then we had ‘dummy’ chats where it responded in special ways, it then saved everything to files and used that as templates to generate reams of awesome data)

Ahhh I see the confusion, I read that as conversations… Just the current chat messages to jsonl is straightforward.

“Please write a script to export every message in the current chat to a jsonl file that will be used for fine-tuning an LLM. Do you need to see a sample of the format?”

That’s where I would start. The more you make it write files, the more it gets its act together. You could raw dump the conversation then convert, and it would probably add a layer of error handling and cleanup.

Recently I had to perform an 8-step linguistic analysis on 3k posts, and 5 out of the 8 steps needed pretty complex samples submitted along with the data to analyse to get results of the quality needed for the task. Literally, copy-pasting after editing the bot’s replies was the longest part of the manual work on this. That’s what I’m talking about: an export of chat messages would have saved my day. BTW fine-tuning + initial sample generation was close to 600 USD. And as a customer I have a somewhat “bitter” taste from those copy-pastes. That’s why tools like LangSmith or Google AI Studio suddenly become less repulsive.

How is that applicable here: https://platform.openai.com/playground

Yeah, that’s no good, there are much better ways. I have a background in automation, and that approach is like nails on a chalkboard to me.

Seriously, type all of this into gpt4 and it will tell you everything I would, including implementation, plus it’ll probably be easier to understand and better informed. There is a clean and simple solution to your problem with a gpt and a couple of scripts in the right places.

Interesting, things have been changing a lot in the past 4-5 days. I have never seen it complain about reciting messages. Could have something to do with all of the data extraction techniques people are using recently and openai trying to prevent it from looking inwards or backwards.


This got me concerned that I had misunderstood the state of play and what is currently possible.

And then it does this… As perfect as it gets.

(I swear that’s not staged!)

I see, thank you. That explains a lot. But the whole feature request has nothing to do with that.

Maybe some analytics showed the feature was not used enough and thus presumed “not popular”… Hope the same analytics tool also had some data about the number of page views of the feature announcement. Because one without the other does not make sense.

Analytics would not show how much someone enjoyed getting the system and user messages.

It was working for messages under 48 hours ago when I wrote this.

Perhaps the code copy button overflowing with messages was seen as a distraction from the purpose of getting the basic form and the parameters employed.

And now the view code has the messages field empty… This could push a lot of users to Google or others just for the sake of being able to quickly edit and export messages.


The reliability and functionality of the api and internal workings seem to be shifting dramatically day to day at the moment. Most of the interesting things I am working on are on pause until things stabilise.

Sorry to confuse the issue. The point is, what you are after has been successfully implemented and can be done very easily. Good luck!

Plus: one could make a nice fine-tune dataset web GUI that interfaces with a database server, lets you paste in your example chat’s Q&A and more turns and attach a system prompt then “next page”, bulk import sets of questions or answers, provide custom instruction prompts to infer synthetic questions, replace separators and stops, do embeddings to assign topics or genres, then specific identities for knowledge domains, export and validate to jsonl with costs estimated, …
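As a small example of the "validate to jsonl" piece, here is a sketch of a structural check over exported lines. The checks are picked for illustration and limited to the three roles used in this thread; they are not the official validation rules, and `validate_jsonl` is a hypothetical name. Cost estimation is left out since it needs a tokenizer.

```python
import json

# Roles seen in the examples in this thread (an assumption, not a full list).
ALLOWED_ROLES = {"system", "user", "assistant"}

def validate_jsonl(lines):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((number, "message with unknown role"))
                break
            if not isinstance(message.get("content"), str):
                problems.append((number, "message content is not a string"))
                break
        else:
            # Only reached when every message passed the checks above.
            if messages[-1]["role"] != "assistant":
                problems.append((number, "last message is not from the assistant"))
    return problems
```

Running it over a file before upload (for example `validate_jsonl(open("train.jsonl", encoding="utf-8"))`, with a file name of your choosing) gives a quick list of lines to fix.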


Definitely would love a reference to an example