Generating Text using a pre-defined project

obardiya · May 21, 2022, 1:51am

Hello,

I’m currently working on a project to generate text by applying a pre-defined heuristic to a set of messages.
The data was initially in an xlsx file. Its most important features are:

MessageType: An original message refers to a message that needs to be heuristicized. A heuristicized message refers to a message that has been changed from original upon applying a particular heuristic to it.

HeuristicName: The name of the heuristic applied to the original message.

ManualMessageScore: A score from 1-3 that describes the quality of the heuristicized message, 1 being highest and 3 being lowest. This is set to NaN for an original message.

After preprocessing, I matched heuristicized messages with their respective original ones and created a dataframe that has 2 columns - a “prompt” column that consists of the original messages, and a “completion” column that consists of their heuristicized counterparts.
I then filtered the dataframe so that every heuristicized message had been altered according to a single specific heuristic.

In this case, there will be repetition since an original message may have several heuristicized messages and vice versa.
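
For concreteness, here is a minimal sketch of that preprocessing on toy data. The `GroupId` and `Text` columns, the heuristic names, and the sample rows are all made up for illustration; the real spreadsheet may link heuristicized messages to their originals in a different way.

```python
import pandas as pd

# Toy stand-in for the spreadsheet. "GroupId" is a hypothetical key that ties
# each heuristicized message to its original; "Text" is a hypothetical column
# holding the message body.
raw = pd.DataFrame({
    "GroupId": [1, 1, 1, 2, 2],
    "MessageType": ["original", "heuristicized", "heuristicized",
                    "original", "heuristicized"],
    "HeuristicName": [None, "shorten", "formalize", None, "shorten"],
    "ManualMessageScore": [float("nan"), 1, 2, float("nan"), 3],
    "Text": ["hi there, how are you?", "hi, how are you?",
             "Hello, how are you?", "see you tmrw", "see u"],
})

originals = raw[raw["MessageType"] == "original"][["GroupId", "Text"]]
changed = raw[raw["MessageType"] == "heuristicized"]

# Pair every heuristicized message with its original via the shared key.
pairs = changed.merge(originals, on="GroupId",
                      suffixes=("_completion", "_prompt"))
pairs = pairs.rename(columns={"Text_prompt": "prompt",
                              "Text_completion": "completion"})

# Keep only the pairs produced by one heuristic (name is illustrative).
single = (pairs[pairs["HeuristicName"] == "shorten"][["prompt", "completion"]]
          .reset_index(drop=True))
print(single)
```

An original with several heuristicized versions shows up once per version in `prompt`, which is the repetition described above.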

I had a couple of questions regarding the problem:

1. Is there a feasible solution to this problem using GPT-3? Which model would be recommended?
2. I understand that the data must be in JSONL format in order to fine-tune a model on it. I’m having trouble doing this, so any help would be appreciated.
3. Is there a way to incorporate the other features from the initial dataset into a fine-tuned model? I’ve only seen examples that use the ‘prompt-completion’ structure and was wondering whether fine-tuning supports anything beyond that.
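
To make the second question concrete, this is roughly the conversion being attempted: one JSON object per line, each with a `prompt` and a `completion` key. It is only a sketch using the standard library; the helper names `write_jsonl` and `read_jsonl`, the file name, and the sample records are placeholders, not part of any official tooling.

```python
import json
import os
import tempfile


def write_jsonl(records, path):
    """Write one JSON object per line, keeping only prompt and completion."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            line = json.dumps({"prompt": rec["prompt"],
                               "completion": rec["completion"]})
            f.write(line + "\n")


def read_jsonl(path):
    """Read the file back to check that every line parses as JSON."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Placeholder rows standing in for the filtered prompt/completion dataframe.
records = [
    {"prompt": "hi there, how are you?", "completion": "hi, how are you?"},
    {"prompt": "see you tmrw", "completion": "see u"},
]

path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
write_jsonl(records, path)
loaded = read_jsonl(path)
```

Starting from a pandas dataframe, `df.to_dict(orient="records")` should give a list in this shape, and `df.to_json(path, orient="records", lines=True)` writes the same layout directly.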

Let me know if I can make the problem statement clearer in any way. Thanks in advance!
