GPT-4o seems to follow instructions defined in a field's description very inconsistently (or, more accurately, rarely). In contrast, if I put the same description directly in the system prompt, it is respected much more reliably. (I use the OpenAI Python client with Pydantic models and the response_format option.)
For example, this works much worse:
system_prompt = "Extract information from the below medical report in JSON format:"

class Summary(BaseModel):
    ...
    score: int = Field(description="The NIH Stroke Scale/Score (NIHSS). Add 5 to the reported score.")
    ...
And this works much better:
system_prompt = """
Extract information from the below medical report in JSON format using these properties:
...
- score: The NIH Stroke Scale/Score (NIHSS). Add 5 to the reported score.
...
"""
class Summary(BaseModel):
    ...
    score: int
    ...
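Since the descriptions then live in the prompt rather than in the model, one way to avoid duplicating them is to generate the prompt from a single source of truth. Here is a minimal sketch of that idea; the function and variable names are illustrative, not from my actual code:

```python
# Keep field descriptions in one place and inject them into the system
# prompt, instead of relying on Field(description=...) being honored.
# All names here are illustrative.

FIELD_DESCRIPTIONS = {
    "score": "The NIH Stroke Scale/Score (NIHSS). Add 5 to the reported score.",
    # ... other fields ...
}

def build_system_prompt(descriptions: dict[str, str]) -> str:
    lines = [
        "Extract information from the below medical report "
        "in JSON format using these properties:"
    ]
    lines += [f"- {name}: {desc}" for name, desc in descriptions.items()]
    return "\n".join(lines)

system_prompt = build_system_prompt(FIELD_DESCRIPTIONS)
print(system_prompt)
```

The Pydantic model then only declares the types (e.g. `score: int`), and the prompt and the schema can't drift apart.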
(The "add 5" instruction is only there to check whether the model follows the instruction; it has no medical meaning.)
Has anyone else had this experience?