Prompting GPT3.5 for NER data labeling

Ok. If you don’t have a closed-ended list of terms that is fine, too. The model should still pick up the overall pattern.

You should include examples of NSFW terms in your training set for the model to understand how to treat these.

In terms of JSON, yes you can instruct the model via fine-tuning to respond in a desired JSON format. Again here, tried and tested and works very well. I agree that in a non-finetuned setting, GPT-4 is inherently better at this but you can definitely get consistent JSON results with a finetuned GPT 3.5.

Finally, ensure your system prompt is specific. If you are for instance worried about the volume of words for a given category, then simply include restrictions in your system prompt in this regard (i.e. no more than X).

1 Like