So I have been experimenting with the API for a while now and it has been a bit of a pain so far. I wanted to ask whether you have had similar experiences and how you resolved them.
So what am I doing?
- Upload a JSONL completions file
- Classify a text to get the label for it
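For context, the two steps look roughly like this with the openai Python package (a rough sketch; the file name, engines and parameter values are placeholders rather than my exact settings):

```python
import openai

openai.api_key = "sk-..."  # placeholder key

# Step 1: upload the JSONL file of labeled examples
# (each line: {"text": "...", "label": "..."}).
upload = openai.File.create(
    file=open("examples.jsonl", "rb"),
    purpose="classifications",
)

# Step 2: ask the classifications endpoint for a label for a new text.
result = openai.Classification.create(
    file=upload["id"],
    query="Text I want a label for",
    search_model="ada",
    model="curie",
    max_examples=3,
)
print(result["label"])
```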
The first issue I encountered was that the server rejects long request lines.
More specifically, I get a 400 BAD REQUEST with “Request Line is too large (4283 > 4094)”, for example.
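My guess (and it is only a guess) is that the long text ends up in the request line instead of the request body, which is what blows past the 4094-byte limit. Posting the query as a JSON body along these lines keeps the request line short (the endpoint URL is the documented one, the key and file id are placeholders):

```python
import requests

API_KEY = "sk-..."       # placeholder
FILE_ID = "file-abc123"  # placeholder id of the uploaded JSONL file

# The query goes into the JSON body of a POST, not into the URL,
# so the request line stays well under the 4094-byte limit.
resp = requests.post(
    "https://api.openai.com/v1/classifications",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "file": FILE_ID,
        "query": "Text I want a label for",
        "search_model": "ada",
        "model": "curie",
    },
)
print(resp.status_code, resp.json())
```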
When I tried a smaller query to get results, I encountered the next error:
“The document at index 14 is 980 tokens over the length limit of 517. If you would like us to add a feature to auto-truncate server-side, let us know at support@openai.com.”
At this point I'm questioning the purpose of the classifications endpoint, because there are limits upon limits upon limits… How am I supposed to classify a mid-size document when the API is incapable of processing a) my request and b) my training data?
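Since server-side auto-truncation doesn't exist, the workaround I'm falling back on is cutting each document down client-side before uploading. A rough sketch (the GPT-2 tokenizer is only my approximation of how the API counts tokens; the 517 limit is taken straight from the error message):

```python
import json
from transformers import GPT2TokenizerFast

# Approximate the API's token count with the GPT-2 BPE tokenizer (an assumption).
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
MAX_TOKENS = 517  # per-document limit reported in the error message

def truncate(text: str, max_tokens: int = MAX_TOKENS) -> str:
    ids = tokenizer.encode(text)
    return text if len(ids) <= max_tokens else tokenizer.decode(ids[:max_tokens])

# Rewrite the training file with every document cut down to the limit.
with open("examples.jsonl") as src, open("examples_truncated.jsonl", "w") as dst:
    for line in src:
        doc = json.loads(line)
        doc["text"] = truncate(doc["text"])
        dst.write(json.dumps(doc) + "\n")
```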
I also want to have multiple labels, but as far as I know classifications support only single labels. I am now considering just using elasticsearch with a more_like_this query to run my classifications…
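For comparison, the more_like_this fallback would look roughly like this (the index name, field names and the majority vote over the top hits are all my own sketch, not an established recipe):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder host

def classify(text: str, index: str = "labeled-docs", k: int = 10) -> str:
    # Find the k documents most similar to the text, then take the most
    # common label among them. "text" and "label" are placeholder field names.
    resp = es.search(
        index=index,
        body={
            "query": {
                "more_like_this": {
                    "fields": ["text"],
                    "like": text,
                    "min_term_freq": 1,
                    "min_doc_freq": 1,
                }
            },
            "size": k,
        },
    )
    labels = [hit["_source"]["label"] for hit in resp["hits"]["hits"]]
    return max(set(labels), key=labels.count) if labels else "unknown"
```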
How much “training data” are you including? Have you attempted to use zero-shot or few-shot inference?
Anyway, I just break things down into smaller requests. Very often all you need is a good prompt, and Curie is good enough; plus, it's faster and cheaper!
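For instance, a few-shot prompt against Curie can stand in for the classifications endpoint; something along these lines (the labels and example texts are obviously made up):

```python
import openai

openai.api_key = "sk-..."  # placeholder

# A few labeled examples in the prompt, then the text to classify.
prompt = """Classify the text as Positive or Negative.

Text: The plot dragged and the ending made no sense.
Label: Negative

Text: A beautifully shot film with a stellar cast.
Label: Positive

Text: I couldn't put the book down.
Label:"""

resp = openai.Completion.create(
    engine="curie",
    prompt=prompt,
    max_tokens=1,
    temperature=0,
)
print(resp["choices"][0]["text"].strip())  # e.g. "Positive"
```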
Thank you!
In my approach I followed these steps:
- I trained GPT-2 on a dataset I had previously prepared (in my case: a variety of historical literary magazines from various Avant-Garde scenes, publicly available on Monoscope)
- (still using GPT-2) I generated new texts and selected the most interesting of them
- I used the results as the prompt, or in some cases as part of my prompt, for GPT-3 (which in this scenario played the role of a literary critic) to analyze them (rough sketch of the handoff below).
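The handoff between the two models is nothing fancy; stripped to its essentials it looks something like this (the model path, prompt wording and parameters are placeholders, and in practice the selection step in the middle is manual):

```python
import openai
from transformers import pipeline

openai.api_key = "sk-..."  # placeholder

# Step 1: generate candidate texts with the fine-tuned GPT-2 checkpoint
# ("./gpt2-avantgarde" is a placeholder path).
generator = pipeline("text-generation", model="./gpt2-avantgarde")
candidates = generator(
    "The manifesto begins:",
    max_length=200,
    do_sample=True,
    num_return_sequences=5,
)

# Step 2: hand a selected text to GPT-3, prompted to act as a literary critic.
selected = candidates[0]["generated_text"]  # in reality I pick this by hand
critique = openai.Completion.create(
    engine="davinci",
    prompt=f"You are a literary critic. Analyze the following text:\n\n{selected}\n\nCritique:",
    max_tokens=250,
    temperature=0.7,
)
print(critique["choices"][0]["text"])
```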
I will write an essay about this later, so I don't want to spoil anything, but I can say the results are very interesting.
In short: my combination of GPT-2 + GPT-3 is a rather manual approach and isn't solved technically yet (but it can surely be solved with some enhancements to the Colab notebook).
EDIT. Regarding GPT-3: it can freely replace GPT-2, you just have to choose an engine and appropriate settings for your approach. Say, Ada is creatively close to GPT-2 (but still has incomparably more knowledge than GPT-2), while DaVinci is indeed the most costly and should be used for elaborate completions. But working with GPT-2 I really do miss the creative power and text coherence of even Ada.