To quote from the latest OpenAI fine-tuning guidance:
“In general, if you have to make a trade-off, a smaller amount of high-quality data is generally more effective than a larger amount of low-quality data.”
OpenAI has recently expanded this guidance with more detailed considerations on data quality and quantity, which you may find helpful for your use case.