Limits and limits and limits

So I have been experimenting with the API for a while now, and it has been a bit of a pain so far. I wanted to ask whether you have had similar experiences and how you resolved them.

So what am I doing?

  • Upload a JSONL completions file
  • Classify a text to get the label for it
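
For reference, those two calls look roughly like this (a minimal sketch with the legacy `openai` Python package and its Classifications endpoint; the file name, field names, engines and labels are just placeholders, not my real setup):

```python
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# 1. Upload the JSONL training data (one {"text": ..., "label": ...} object per line)
upload = openai.File.create(
    file=open("training_examples.jsonl", "rb"),
    purpose="classifications",
)

# 2. Ask the Classifications endpoint to label a new text using that file
result = openai.Classification.create(
    file=upload["id"],      # the uploaded file of labeled examples
    query="Text I want to get a label for",
    search_model="ada",     # engine that searches/ranks the examples
    model="curie",          # engine that produces the final label
    max_examples=3,         # how many examples end up in the prompt
)
print(result["label"])
```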

The first issue I encountered was that the server rejects long lines.
More specifically, I get a 400 BAD REQUEST, for example “Request Line is too large (4283 > 4094)”.

When I tried a smaller query to get results, I encountered the next error:
“The document at index 14 is 980 tokens over the length limit of 517. If you would like us to add a feature to auto-truncate server-side, let us know at support@openai.com.”
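
For now, the only workaround I can see is to pre-truncate each document client-side before uploading. A rough sketch (it assumes each JSONL line has a "text" field and uses the GPT-2 tokenizer from transformers as an approximation of how the API counts tokens):

```python
import json
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")  # roughly the same BPE the API uses
MAX_DOC_TOKENS = 517  # the per-document limit from the error message

with open("training_examples.jsonl") as src, open("training_truncated.jsonl", "w") as dst:
    for line in src:
        record = json.loads(line)
        tokens = tokenizer.encode(record["text"])
        if len(tokens) > MAX_DOC_TOKENS:
            record["text"] = tokenizer.decode(tokens[:MAX_DOC_TOKENS])
        dst.write(json.dumps(record) + "\n")
```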

At this point I'm questioning the purpose of the Classifications endpoint, because there are limits and limits and limits… How am I supposed to classify a mid-size document when the API is incapable of processing a) my request and b) my training data?

I also want to assign multiple labels, but as far as I know classifications only support a single label per document. I am now considering just using Elasticsearch with a more_like_this query to run my classifications…
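
Something along these lines is what I have in mind (a rough sketch with the Python elasticsearch client; the index and field names are hypothetical, and the “classification” is just a majority vote over the most similar labeled documents):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Retrieve the labeled documents most similar to the new text
response = es.search(
    index="labeled_docs",
    body={
        "size": 5,
        "query": {
            "more_like_this": {
                "fields": ["text"],
                "like": "Text I want to get a label for",
                "min_term_freq": 1,
                "max_query_terms": 25,
            }
        },
    },
)

# Majority vote over the labels of the top hits
labels = [hit["_source"]["label"] for hit in response["hits"]["hits"]]
predicted = max(set(labels), key=labels.count)
```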

3 Likes

How much “training data” are you including? Have you attempted to use zero-shot or few-shot inference?

Anyway, I just break things down into smaller requests. Very often, all you need is a good prompt, and Curie is good enough; it's also faster and cheaper!
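
For a mid-size document, for example, I would split it into chunks, classify each chunk separately, and take the majority label. A minimal sketch against the Classifications endpoint (the chunk size and the voting rule are arbitrary choices, not a recommendation):

```python
import openai

def classify_long_document(text, file_id, chunk_words=300):
    """Split a long document into word chunks, classify each chunk,
    and return the most common label across chunks."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    labels = []
    for chunk in chunks:
        result = openai.Classification.create(
            file=file_id,
            query=chunk,
            search_model="ada",
            model="curie",
            max_examples=3,
        )
        labels.append(result["label"])
    return max(set(labels), key=labels.count)
```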

Thank you!
In my approach I took the following steps:

  1. I trained GPT-2 on a dataset I had previously prepared (in my case: a variety of historical literary magazines from various Avant-Garde scenes, publicly available on Monoscope)
  2. (still using GPT-2) I generated new texts and selected the most interesting ones
  3. I used the results as a prompt, or in some cases as part of my prompt, for GPT-3 (which in this scenario played the role of a literary critic) to analyze them (roughly as sketched below)
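
Roughly, steps 2 and 3 look like this (a minimal sketch with the transformers text-generation pipeline and the legacy openai Completion API; the prompts, engine and sampling settings are placeholders, not my exact setup):

```python
import openai
from transformers import pipeline

# Step 2: generate candidate texts with GPT-2 (swap in your fine-tuned checkpoint)
generator = pipeline("text-generation", model="gpt2")
candidates = generator(
    "The avant-garde manifesto begins:",
    max_length=200,
    num_return_sequences=3,
    do_sample=True,
)

# Step 3: hand a selected GPT-2 output to GPT-3 playing the literary critic
selected = candidates[0]["generated_text"]
critique = openai.Completion.create(
    engine="davinci",
    prompt=f"Here is a short literary text:\n\n{selected}\n\nWrite a short critical analysis of it:",
    max_tokens=250,
    temperature=0.7,
)
print(critique["choices"][0]["text"])
```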

I will write an essay about this later, so I don't want to spoil anything, but I can say the results are very interesting.

In short: my GPT-2 + GPT-3 combination is a rather manual approach and isn't technically automated yet (but it could surely be automated with some enhancements to the Colab notebook).

EDIT: Regarding GPT-3: it can freely replace GPT-2, you just have to choose an engine and appropriate settings for your approach. Say, Ada is creatively close to GPT-2 (but still has incomparably more knowledge than GPT-2), while DaVinci is the most expensive and should be reserved for elaborate completions. Working with GPT-2, I really miss the creative power and text coherence of even Ada.

1 Like