Fine-tuning blocked by moderation system

I literally had to butcher my dataset from 120 lines down to 15 to get it to work. This was for gpt-4o-mini-2024-07-18. Here’s the script I used to filter it (and I still had to manually remove a few lines that mentioned politics):

python

import json
import openai

# OpenAI API key (to be filled in)
openai.api_key = ""

# Paths for input and output files
input_file = "input_dataset.jsonl"
output_file = "approved_output.jsonl"

# Threshold for category confidence
threshold = 0.001

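# Read all lines from the input file and remove duplicates
with open(input_file, 'r', encoding='utf-8') as infile:
    # Normalize line endings and skip blank lines so duplicates compare equal
    lines = [line.strip() + "\n" for line in infile if line.strip()]
    unique_lines = list(dict.fromkeys(lines))  # Remove duplicates, keep original order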

# Process each unique line
with open(output_file, 'w', encoding='utf-8') as outfile:
    for line in unique_lines:
        try:
            data = json.loads(line)
            messages = data.get('messages', [])
            
            all_messages_approved = True  # Flag to track if all messages are approved

            # Submit each individual message content to the Moderation API
            for message in messages:
                content = message.get('content', '')

                if content:  # Ensure there's content to submit
                    # Note: openai.Moderation is the pre-1.0 SDK interface (openai<1.0)
                    response = openai.Moderation.create(input=content)
                    results = response["results"][0]

                    # Check if any category has a score higher than the threshold
                    for category, score in results["category_scores"].items():
                        if score > threshold:
                            all_messages_approved = False
                            break

                if not all_messages_approved:
                    break

            if all_messages_approved:
                # Only write the original line if all messages are approved
                outfile.write(line)
                
        except Exception as e:
            print(f"Error processing line: {e}")

The fact that a threshold of 0.001 is needed is INSANE. Any category score above that caused the moderation error. OpenAI is clearly concerned about people abusing the fine-tuning system. Hopefully this helps you identify the issues in your datasets.
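If you want to see which categories are pushing a message over the limit instead of just dropping the line, something like this might help. It's a rough sketch: the function name `categories_over_threshold` and the sample scores are made up for illustration, and it only assumes the scores come as a plain dict of category to score, like `results["category_scores"]` in the script above.

```python
def categories_over_threshold(category_scores, threshold=0.001):
    """Return {category: score} for every score above the threshold, highest first."""
    over = {cat: score for cat, score in category_scores.items() if score > threshold}
    return dict(sorted(over.items(), key=lambda item: item[1], reverse=True))


# Made-up scores for illustration only; real values would come from
# results["category_scores"] for each message.
sample_scores = {"harassment": 0.0004, "violence": 0.0123, "hate": 0.0031}
print(categories_over_threshold(sample_scores))
# prints {'violence': 0.0123, 'hate': 0.0031}
```

Printing this for each rejected message at least tells you which category to look at when you go to reword or cut a line.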