Optimum model for tagging articles?

Which model and method offer the best cost-efficiency and quality for tagging articles as follows?

  • Article: varying length, average 600 words
  • Tags: the model should choose from a pre-defined list of around 300 entities (roughly 550 words / 6,000 characters in total). Multiple tags can be returned.

Using the Tokenizer tool, it seems the list of pre-defined terms alone comes to around 2,300 tokens, before the article is even included. So I guess the combined input could potentially exceed 4,000 tokens.
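For what it's worth, here is a back-of-envelope budget for the scenario above, assuming the common rule of thumb of ~0.75 words per token for English prose and a 4,097-token context window for text-davinci-003 (exact counts require running the actual tokenizer, e.g. tiktoken):

```python
# Rough token budget for one tagging request.
# Assumptions (not measured): ~0.75 words/token for article prose;
# the 2,300-token figure for the tag list comes from the Tokenizer tool.

TAG_LIST_TOKENS = 2_300       # measured via the Tokenizer tool
ARTICLE_WORDS = 600           # average article length
WORDS_PER_TOKEN = 0.75        # common rule of thumb for English
CONTEXT_WINDOW = 4_097        # text-davinci-003 context length

article_tokens = round(ARTICLE_WORDS / WORDS_PER_TOKEN)   # ~800
prompt_tokens = TAG_LIST_TOKENS + article_tokens          # ~3,100
output_budget = CONTEXT_WINDOW - prompt_tokens            # what's left for the reply

print(article_tokens, prompt_tokens, output_budget)
```

So an average article leaves roughly 1,000 tokens of headroom, but a longer-than-average article (or extra prompt instructions) eats into that quickly.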

I assume the returned tags themselves would account for very few tokens.
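Right, the output should be tiny. One thing worth adding regardless of model choice is a post-processing step that keeps only tags actually on the pre-defined list, since models sometimes invent near-miss tags. A minimal sketch (the tag names here are made-up placeholders, and I'm assuming the model is asked to reply with a comma-separated list):

```python
# Keep only tags from the allowed list, case-insensitively, preserving
# the model's order and dropping duplicates and hallucinated tags.
ALLOWED_TAGS = {"Climate", "Elections", "Technology"}  # placeholder names

def parse_tags(completion: str) -> list[str]:
    """Parse a comma-separated completion into validated tags."""
    by_lower = {t.lower(): t for t in ALLOWED_TAGS}
    seen: set[str] = set()
    result: list[str] = []
    for raw in completion.split(","):
        tag = by_lower.get(raw.strip().lower())
        if tag and tag not in seen:
            seen.add(tag)
            result.append(tag)
    return result

print(parse_tags("Technology, climate, Sports"))  # → ['Technology', 'Climate']
```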

I’m struggling to understand whether the token limits quoted in the Playground cover combined input and output, or just output.

Does anyone think text-davinci-003 can handle this task with good quality, or is it going to require GPT-4?