Help, please! 😭 I searched the entire "GPT API" documentation

I have tried every API from the first one to the last :sob:,
tried every prompt engineering approach I could find :sob: :sob:,
searched the web,
tried fine-tuning the model several times,
and I still get a very bad response.

The main idea is a long response: the minimum is 4,000 tokens and there is no maximum; the more tokens in the response, the better.

I made an API that generates long course content based on the user's answers and the subtopics they provide to the model. The model must return the course content in a specific format, and all of that works fine. When I test the model in English, it works as expected, but when the user chooses the Arabic language, the model responds with only 1,900 or 2,300 tokens. I have tried every approach I can think of
and can't figure out how to solve this issue.
I want the model to respond with at least 3,000 to 4,000 tokens.

The code:

def core_app_fetch_data_from_openai(self, content, response_tokens=None):
    openai.api_key = settings.OPENAI_API_KEY
    # Note: response_tokens is accepted but never used, so no max_tokens is sent with the request
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": content}
        ],
    )
    return response



                    prompt = (
                    f"### Task: Develop content for a course titled '{course_name}', aimed at '{targeted_people}' at a '{targeted_people_level}' level.\n"
                    f"### The content should be in about 4000 words.\n"
                    "### Content Development Instructions:\n"
                    f"- Focus on the following subtopics: {subtopics_text_value}\n\n"
                    
                    f"- For each subtopic, aim to provide the following in {course_language} language:\n"
                    "   - Clear definitions of key terms and concepts, depth prioritized where needed.\n"
                    "   - Detailed explanations to ensure understanding, focusing on the most relevant information within constraints.\n"
                    "   - At least one relevant example for each concept, with priority given to illustrating complex or important topics.\n\n"
                    
                    "Start each topic with 'Slide' followed by a sequential number starting with 1 then a : then followed by the subtopic slide name.\n\n"
                    
                    "Balance coverage to emphasize the most critical aspects for the targeted audience. Conclude the content with a 'References' section Start with 'Slide', listing all sources of information used to develop the content.\n\n"
                    )

course_content = core_app_fetch_data_from_openai(self, prompt)['choices'][0]['message']['content'].strip()


Welcome to the community!

Have you seen this? https://platform.openai.com/docs/models/gpt-3-5-turbo

Most models have a MAX of 4,096 output tokens.

These new models have been engineered to provide short answers. You've been fighting an uphill battle all along, and it turns out there's no winning move for you.

A naive solution/option you have (apart from trying to use older, more expensive models) is to split your task up:

Maybe generate a table of contents in one run, and then use subsequent runs to generate each section independently? :thinking:
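
Something like this rough sketch, for example (the prompt wording and helper name here are placeholders I made up, not tested code):

import openai

def generate_long_course(course_name, subtopics, language):
    # Pass 1: ask only for a table of contents (one short call).
    toc_prompt = (
        f"Create a table of contents for a course titled '{course_name}' "
        f"covering: {', '.join(subtopics)}. Return one section title per line, in {language}."
    )
    toc = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": toc_prompt}],
    )["choices"][0]["message"]["content"]

    # Pass 2: one call per section, so each individual answer stays well below the output cap.
    sections = []
    for title in [line.strip() for line in toc.splitlines() if line.strip()]:
        section_prompt = (
            f"Write the full content for the section '{title}' of the course "
            f"'{course_name}', in {language}, with definitions, explanations and examples."
        )
        section = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": section_prompt}],
        )["choices"][0]["message"]["content"]
        sections.append(section)

    # The combined document can end up far longer than any single completion.
    return toc + "\n\n" + "\n\n".join(sections)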

5 Likes

I would also recommend this:

You can take a look at the following post if you’d like an example of how this can be done :laughing:

2 Likes

Thanks for your tips :blush: :heart_hands:
Yes, I have seen https://platform.openai.com/docs/models/gpt-3-5-turbo

and tried the model with 16K output: gpt-3.5-turbo-16k,
but it didn't work for me :broken_heart:

I also tried to split the content, but I didn't like the response.

I'm searching for a solution that works for me the way it does when I use English.

Maybe I'm coding it wrong? Or are there more steps or instructions I need to add to the prompt?

IDK :thinking: :no_mouth:

Here is the "split the contents" code:

                    for index, group in enumerate(subtopics_groups):
                        # Print statement indicating which prompt is being processed
                        print(f"Processing prompt {index + 1} of {len(subtopics_groups)}...")
                        print(f"group number{index + 1} content: {group}")
                        # Find the largest number that follows the word "Slide" in the content
                        # generated so far (e.g. "Slide 8"), add one to it, and use that as the
                        # starting slide number for this group (see the helper sketch after this block)
                        next_slide_start = self.find_biggest_slide(aggregated_course_content)
                        # Update the prompt with the current group of subtopics
                        
                        create_lab_content = (
                            f"### Task: Develop content for a course titled <{course_name}>, aimed at <{targeted_people}> at a <{targeted_people_level}> level.\n"
                            f"### The content should be in about <{number_of_tokens}> Words.\n"
                            "### Content Development Instructions:\n"
                            f"- Focus on the following subtopics: <{group}>\n\n"
                            
                            f"- For each subtopic, aim to provide the following in <{course_language}> language:\n"
                            "   - Clear definitions of key terms and concepts, depth prioritized where needed.\n"
                            "   - Detailed explanations to ensure understanding, focusing on the most relevant information within constraints.\n"
                            "   - At least one relevant example for each concept, with priority given to illustrating complex or important topics.\n\n"
                            
                            f"Start each topic with 'Slide' followed by a sequential number starting with {next_slide_start}, followed by the subtopic slide name.\n\n"
                            
                            "Balance coverage to emphasize the most critical aspects for the targeted audience. Conclude the content with a 'References' section Start with 'Slide', listing all sources of information used to develop the content.\n\n"
                        )


                        try:
                            # Fetch data for the current group
                            try:
                                course_content = core_app_fetch_data_from_openai(self, create_lab_content)['choices'][0]['message']['content'].strip()
                #                course_content = " "
                            except Exception as e:
                                print(f"Error when fetching course_content: {e}")

                           

                            aggregated_course_content += course_content + "\n\n"  # Aggregate content from each group
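
(For reference, a minimal sketch of what find_biggest_slide could look like, assuming the slide headings always follow the "Slide <number>" pattern from the prompt; the regex and the fallback value of 1 are my own assumptions:)

import re

def find_biggest_slide(self, content):
    # Collect every number that appears right after the word "Slide"
    # (e.g. "Slide 8: ..." -> 8) and return the next slide number to use.
    numbers = [int(n) for n in re.findall(r"Slide\s*(\d+)", content, flags=re.IGNORECASE)]
    return max(numbers) + 1 if numbers else 1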

What do you get if you print out create_lab_content?

Does the constructed prompt still make logical sense?

1 Like

Also, just for your information again: gpt-3.5-turbo-16k does not have 16k output either. Just like the other models, the output would be constrained to 4k tokens at the maximum. The 16k refers to the total context window, which includes both the input and the output tokens. :grimacing:
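
If you want to see how much room is actually left for the completion with your prompt, here is a quick sketch using tiktoken (assuming the 16,385-token window of gpt-3.5-turbo-16k and ignoring the few extra tokens of chat-message formatting):

import tiktoken

def remaining_output_budget(prompt, context_window=16385):
    # Count the prompt tokens; whatever is left of the context window
    # is the upper bound for max_tokens on the completion.
    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
    prompt_tokens = len(encoding.encode(prompt))
    return context_window - prompt_tokens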

1 Like

Another quick separate thought. You said earlier you were happy with the output in English.

If you can’t obtain the desired output in Arabic, then you could just use the model to produce the output in English and then add a step to translate the English result into Arabic using either a GPT model or a translation API. It would obviously be nicer to just get the desired output in one step but that might be another workaround to consider.
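
A rough sketch of that two-step idea, using the same ChatCompletion interface as your snippet (the prompts are placeholders, and for long content you would probably translate slide by slide rather than in one go):

import openai

def generate_then_translate(english_prompt, target_language="Arabic"):
    # Step 1: generate the course content in English, where the length behaves as expected.
    english = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": english_prompt}],
    )["choices"][0]["message"]["content"]

    # Step 2: translate the result while keeping the "Slide N:" structure intact.
    translation_prompt = (
        f"Translate the following course content into {target_language}, "
        f"keeping the slide headings and formatting unchanged:\n\n{english}"
    )
    translated = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": translation_prompt}],
    )["choices"][0]["message"]["content"]
    return translated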

2 Likes

The "split the contents" code works, but not that well.

If I print out create_lab_content, I get an answer of about 2,000 tokens, no more.

Or did you mean something else? Could you explain a bit more what you mean?

So what is this?

Are you sure??? :smiling_face_with_tear: :broken_heart: :broken_heart:

Thanks for your tips :pray: :pray: :heart_hands:

Yes, I have tried this too: I generated the content in English and then translated it to Arabic, but it takes about 8 minutes to finish the whole cycle :broken_heart: so I :smiley: dropped the whole idea.

It seems to be a bug in the way that max length is currently displayed in the playground. We are seeing this for a few other models as well. For details on context window and maximum output tokens it is best to go with the model documentation (same link as above):

2 Likes

Oh okay :no_mouth:, thanks for that, I appreciate it :pray:. I was so confused about it.

1 Like

You might want to try with a translation API instead. They are quicker and also cheaper. You’d have to run some tests to ensure that the quality is still appropriate. The one I have been relying on is this one from Azure:

https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/

I just tested it for Arabic and a 1,800 word translation from English to Arabic took less than 2 seconds.
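
For reference, a minimal sketch of calling it with requests (the key and region come from your own Azure Translator resource, and the error handling here is kept to a bare minimum):

import requests

def translate_to_arabic(text, key, region):
    # Translator v3 "translate" operation: POST a list of {"text": ...} items.
    url = "https://api.cognitive.microsofttranslator.com/translate"
    params = {"api-version": "3.0", "from": "en", "to": "ar"}
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    response = requests.post(url, params=params, headers=headers, json=[{"text": text}])
    response.raise_for_status()
    return response.json()[0]["translations"][0]["text"]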

1 Like

Thanks again for that :pray: :pray: :heart_hands: :blush:

1 Like

The amount of content is probably proportional to the number of slides, so this should be enough.

In my case, by making the following change, I received content that was sufficiently long, without any breaks in the Arabic text, and it also included the 'References' section at the end (the References section is in English, though).

Start each topic with 'Slide' followed by a sequential number starting with 1 then a : then followed by the subtopic slide name.

to

Start each topic with 'Slide:1' followed by the subtopic slide name, and continue as long as possible.

2 Likes

Just to clarify, my above reply was meant as an answer to the question posed by the OP.

1 Like

It appears to be a bug in the information you provided.

gpt-3.5-turbo-16k is the -0613 model, and its output is not artificially limited by a maximum max_tokens. The same goes for gpt-4 (8k) and gpt-4-32k (if one has access). It is only the model's training that will still get you unsatisfactory output instead of your novel.

The quality of the 3.5 model fades when using the maximum context, though. For example, asking for a sentence-by-sentence rewrite, without skipping any sentences, on 8k of input will get you the input repeated back with no improvements.

?

I likely misunderstood the message. I was referring to the playground, where there currently seem to be some issues regarding max length. I don't use the playground frequently, but after a few posts this week I checked and noticed that there are some discrepancies. In some cases it seems to refer to output token length, in other cases to the context window, etc.

For example:

  1. Max length for GPT-4 Turbo shows as 4K to me
  2. Max length for GPT-3.5-Turbo-16K shows as 16k to me
  3. Max length for my fine-tuned models shows as 2k

Isn’t this outdated/wrong:
response = openai.ChatCompletion.create(

And this is correct:
response = client.chat.completions.create(
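
i.e. with the 1.x Python library the same request would look roughly like this (a sketch, assuming the same Django settings object as the snippet above; the prompt content is a placeholder):

from django.conf import settings
from openai import OpenAI

client = OpenAI(api_key=settings.OPENAI_API_KEY)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "..."}],
)
# In 1.x the fields are attributes, not dictionary keys:
course_content = response.choices[0].message.content.strip()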

Maybe not what you are looking for…

1 Like

Thanks for your response, but I think this is just about how the request is made; there is an old syntax and a new one, and there is no difference in the result.