Help with classification using the ChatGPT API


I am using the ChatGPT API for classification. I have to classify 2000 docs. Right now I am batching 10 docs per prompt and instructing the model to return a list of labels.

The issue I am facing is that in edge cases, about 2% of the batches, the model returns only 9 labels instead of 10, which leads to structural issues when loading the data.

Can someone please let me know if there is a better way to classify? I was thinking of sending only 1 doc per request, but that would add a lot of latency and might hit rate limits, leading to even more latency.

Could you please provide the prompt with the instructions you are using?
It is also worth checking the rate limits on your account, since the time and token limits may interfere with the batching. Let’s see what improvements can be applied to the prompt and the batch process.
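In the meantime, here is a rough sketch of one possible improvement to the batch process. It is not taken from your code, which we have not seen yet. The idea is to number each doc in the prompt, ask for a JSON object keyed by those numbers, and re-send only the docs that come back missing or with an unknown label. All names here (`build_prompt`, `parse_labels`, `classify_batch`) are hypothetical, and `call_model` is a placeholder for whatever function you already use to send a prompt and get the reply text back.

```python
import json


def build_prompt(docs, labels):
    """Number each doc so every returned label can be matched to its input."""
    lines = [
        "Classify each document into one of: " + ", ".join(labels) + ".",
        "Reply with only a JSON object mapping each document id to its label.",
        "",
    ]
    for i, doc in enumerate(docs):
        lines.append(f"[{i}] {doc}")
    return "\n".join(lines)


def parse_labels(reply, n_docs, labels):
    """Return (results, missing): valid labels by index, and indices to retry."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    results = {}
    for i in range(n_docs):
        label = data.get(str(i))
        if isinstance(label, str) and label in labels:
            results[i] = label
    missing = [i for i in range(n_docs) if i not in results]
    return results, missing


def classify_batch(docs, labels, call_model, max_rounds=3):
    """call_model is an assumed callable: prompt string -> reply string."""
    final = {}
    pending = list(range(len(docs)))
    for _ in range(max_rounds):
        if not pending:
            break
        subset = [docs[i] for i in pending]
        reply = call_model(build_prompt(subset, labels))
        results, _ = parse_labels(reply, len(subset), labels)
        # Map positions within the re-sent subset back to the original indices.
        for local_idx, label in results.items():
            final[pending[local_idx]] = label
        pending = [i for i in pending if i not in final]
    # Docs that never received a valid label come back as None.
    return [final.get(i) for i in range(len(docs))]
```

With this shape the output list always has the same length as the input, and a dropped item costs one small follow-up request rather than a broken batch. Any doc still unlabeled after `max_rounds` shows up as `None`, so it can be logged or handled separately.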


I have created some checks that manage this. Thank you for your time anyway!

Can you tell us what approach you took to ensure the output size is the same as the input size in a batch?