Exceeding the token limit even though I should be well below it?

I'm trying to pick out 10 keywords for each job application in a CSV file, but the API says I'm exceeding the token limit and that the output would take 2000 tokens for completion, even though it's just a 10-word list. Any help with this would be appreciated! My code is below.

import openai
from nltk.corpus import stopwords
import pandas as pd
import nltk
import time

# Load the CSV file into a Pandas dataframe
df = pd.read_csv('job_applications.csv')
openai.api_key = "■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■PjunbXWe"
#job_records = df['Description']

stop_words = set(stopwords.words('swedish'))
keywords = []

for index, row in df.iterrows():
    # Strip Swedish stop words out of the job description
    words = nltk.word_tokenize(row['Description'])
    filtered_words = [word for word in words if word.lower() not in stop_words]
    filtered_text = ' '.join(filtered_words)

    # Prompt (Swedish): "Compile the 10 most important personal qualities
    # requested in the following job ad, with 1 word each, in a list"
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=f"Sammanställ de 10 viktigaste personliga egenskaper som efterfrågas i följande jobbannons med 1 ord var i en lista: {filtered_text}",
        temperature=0.5,
        max_tokens=2000,
        n=1,
        stop="\n",  # custom stop sequence
    )

    keywords.append(response.choices[0].text.strip())

    # Pause briefly between requests to stay under the rate limit
    time.sleep(3)

df['Keywords'] = keywords
df.to_csv('job_applications.csv', index=False)

Hi,

What is the length of this string? Try printing it first and getting the token count for it. It's possible that this string is what's pushing the token count over the limit.

You've also already set max_tokens to 2000 for a 10-word list, which isn't necessary; try lowering it to around 100.
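As a minimal sketch of that check using tiktoken, OpenAI's tokenizer package (pip install tiktoken; the sample string here is just a stand-in for your filtered_text):

import tiktoken

# text-davinci-003 uses the p50k_base encoding; encoding_for_model looks it up
enc = tiktoken.encoding_for_model("text-davinci-003")

sample = "Vi söker en noggrann och självgående utvecklare"  # stand-in for filtered_text
print(len(sample))              # character length
print(len(enc.encode(sample)))  # token count, which is the unit the limit measures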


I think what @sps is implying, though it may not be clear to the poster, is that the limit might not be exceeded by the 10-word list being "completed" but rather by the total token count, as described in the API Reference:

max_tokens
The maximum number of tokens to generate in the completion.
The token count of your prompt plus max_tokens cannot exceed the model’s context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
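To make that concrete with made-up numbers:

# Illustration only; these token counts are invented.
prompt_tokens = 2230   # a long, stop-word-filtered job ad plus the instruction
max_tokens = 2000      # what the original code requests

# The API checks prompt tokens + max_tokens against the context length
# before generating anything, so this request is rejected even though
# the actual answer would only be ~10 words:
print(prompt_tokens + max_tokens)  # 4230, which exceeds a 4096-token context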


Hi Shield,

even if it seems unintuitive, reduce max_tokens to a lower value. In my experience, this value only "reserves" tokens for the answer, so with a larger input you reach the limit faster.
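A sketch of what that could look like (ask_for_keywords is just a name I made up, tiktoken is assumed installed, and CONTEXT_LENGTH uses the figure quoted above; check your model's exact limit):

import openai
import tiktoken

CONTEXT_LENGTH = 4096  # context length from the docs quoted above

def ask_for_keywords(prompt: str) -> str:
    # Measure the prompt so the completion budget always fits
    enc = tiktoken.encoding_for_model("text-davinci-003")
    prompt_tokens = len(enc.encode(prompt))
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        temperature=0.5,
        # Reserve only what a 10-word list needs, capped by what the prompt leaves over
        max_tokens=min(100, CONTEXT_LENGTH - prompt_tokens),
        n=1,
    )
    return response.choices[0].text.strip()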

Best regards,
Linus