Hi all!
I am running into a problem with structured outputs in my application. I use them in several places, but the very last call, which creates the payload for a WhatsApp API request, sometimes fails because the length limit was reached. The weird thing is that the input is nowhere near the token limit, and sometimes the very same input goes through. The issue is also hard to reproduce because it happens only rarely, but it still blocks us from going to production.
Can someone tell me why this call sometimes fails?
Here is the error and traceback for reference:
LengthFinishReasonError('Could not parse response content as the length limit was reached - CompletionUsage(completion_tokens=16384, prompt_tokens=1310, total_tokens=17694, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))')
Traceback (most recent call last):
File "/app/src/core/openai_chat_completion.py", line 188, in openai_chat_completion_structured
response = openai_client.beta.chat.completions.parse(
File "/usr/local/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 160, in parse
return self._post(
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 960, in request
return self._request(
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1066, in _request
return self._process_response(
File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1165, in _process_response
return api_response.parse()
File "/usr/local/lib/python3.10/site-packages/openai/_response.py", line 325, in parse
parsed = self._options.post_parser(parsed)
File "/usr/local/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 154, in parser
return _parse_chat_completion(
File "/usr/local/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 72, in parse_chat_completion
raise LengthFinishReasonError(completion=chat_completion)
openai.LengthFinishReasonError: Could not parse response content as the length limit was reached - CompletionUsage(completion_tokens=16384, prompt_tokens=1310, total_tokens=17694, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))
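A note on reading the usage numbers in this error: the prompt is only 1310 tokens, while `completion_tokens` is 16384, which appears to match the documented maximum output size of gpt-4o-mini (treat that figure as an assumption and check the model's documentation). So the limit being hit seems to be on the generated output, not on the input. Below is a minimal sketch that pulls the counts out of the error text; `parse_usage` and `OUTPUT_CAP` are hypothetical names, not part of the SDK.

```python
import re

# Assumed output-token cap for gpt-4o-mini; verify against the model docs.
OUTPUT_CAP = 16384


def parse_usage(error_text: str) -> dict:
    """Extract the top-level token counts from a CompletionUsage repr."""
    usage = {}
    for key in ("completion_tokens", "prompt_tokens", "total_tokens"):
        # "key=" does not match "key_details=", so nested fields are skipped.
        match = re.search(rf"\b{key}=(\d+)", error_text)
        if match:
            usage[key] = int(match.group(1))
    return usage


def output_hit_cap(usage: dict, cap: int = OUTPUT_CAP) -> bool:
    """True when the completion, not the prompt, used up the budget."""
    return usage.get("completion_tokens", 0) >= cap
```

If `output_hit_cap` is true for a short input like the one in this post, the model most likely kept generating until it was cut off, rather than the input being too large.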
response_instructions: |
# Goal
Generate an output based on the Pydantic Model to create WhatsApp payloads (including text, images and videos) for the WhatsApp API.
# Rules
- Strictly use the content provided in the input without adding or removing any information.
- Maintain the order of the messages exactly as provided.
- Preserve any emojis/icons that are part of the input.
- Keep the message structure as given (e.g., if there is an introduction and an outro, both should be included in the output).
- Group relevant information in one message where possible. For example, if asking for property requirements to create a filter, put all relevant information in the same message.
- If an image URL and a caption/description are provided, treat it as an "image" type and us the description as the description/caption of the image. **Do not include it as type text**.
- If a video URL and a caption/description are provided, treat it as an "video" type and us the description as the description/caption of the video. **Do not include it as type text**.
**Output Rules for Image:**
- Strictly ensure that if a image_url is provided that you send that content as a message with the type "image"!!!
- Overtake the image_url of the input for the output "image_url"
- Take the description exactly as provided from the input fields, such as "property_description" and use it as the "image_description."
- Strictly ensure that if a image_url is provided that you send that content as a message with the type "image"!!!
**Output Rules for Video:**
- Strictly ensure that if a video_url is provided that you send that content as a message with the type "video"!!!
# Key "type"
- Use the type "video" for the key "type"
# Key "video_url"
- Overtake the "video_url" of the input for the output "video_url"
Never put the URL in the description! Always put it as value of the key "video_url"
# key "description"
- Take the "community_information" and "community_response_message" in the description of the video.
Put a paragraph between (two lines) the content of the "community_information" and the "community_response_message". The seperation should be clear and easy to understand for better readability.
# WhatsApp-specific Rules
- WhatsApp formats bold text with *<word>*. Therefore, if the input has **<word>**, convert it to *<word>* so that it appears correctly as bold in WhatsApp. Avoid leaving both ** and * around the text.
- Do not use # for headlines. Instead, transform them into bold text.
- Do not use a "-" for listing items if an emoji/icon is also provided.
- If "-" is used for listing, replace it with "•".
Example for thw WhatsApp-specific rules:
Input: "### 5. **The Springs**
- 🌊 *Lakes and Parks*: Beautiful lakes and landscaped gardens.
- 🏊 *Amenities*: Community pools and sports facilities.
- 🏫 *Schools*: Close to schools and nurseries."
Output: "*5. The Springs*
🌊 *Lakes and Parks*: Beautiful lakes and landscaped gardens.
🏊 *Amenities*: Community pools and sports facilities.
🏫 *Schools*: Close to schools and nurseries."
# Instructions
- Use the Pydantic Model to structure the output.
- If a weblink/hyperlink is provided in the input also keep the weblink/hyperlink in the output. This is for example important to provide the link to book a meeting with the agent.
# Output
- If **"Fallback mode status" is true**, process the **"Fallback message"** instead of the main input.
user_input: |-
Let's find the perfect project for you! To get started, could you please share a bit more about your preferences? Here are some questions to guide us:
• *Property Type*: Are you interested in a villa, townhouse, or apartment?
• *Budget*: What is your budget range for the property?
• *Bedrooms*: How many bedrooms do you need?
Once I have this information, I can recommend some exciting projects that suit your needs! 😊
PydanticModel: <class 'models.whatsapp_model.WhatsAppPayload'>
temperature: 0
model: gpt-4o-mini
frequency_penalty: 0
presence_penalty: 0
timeout: 180
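The WhatsApp-specific rules in the prompt above (headlines to bold, `**word**` to `*word*`, "-" to "•", no "-" before an emoji) are mechanical, so they can also be sketched as plain string processing. This is only an illustration under assumptions: `to_whatsapp` is a hypothetical helper, and the emoji check is a rough heuristic based on the Unicode "Symbol, other" category, not a complete emoji detector.

```python
import re
import unicodedata


def to_whatsapp(text: str) -> str:
    """Apply the WhatsApp-specific rules from the prompt to Markdown-like text."""
    out = []
    for line in text.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*)$", line)
        if heading:
            # Headline -> one bold span; drop inner markers to avoid "**".
            title = heading.group(1).replace("*", "").strip()
            out.append(f"*{title}*")
            continue
        # Markdown bold (**word**) -> WhatsApp bold (*word*).
        line = re.sub(r"\*\*(.+?)\*\*", r"*\1*", line)
        item = re.match(r"^(\s*)-\s+(.*)$", line)
        if item:
            indent, body = item.groups()
            # Heuristic: a leading "Symbol, other" character counts as an emoji.
            if body and unicodedata.category(body[0]) == "So":
                line = f"{indent}{body}"
            else:
                line = f"{indent}• {body}"
        out.append(line)
    return "\n".join(out)
```

Running the formatting step outside the model would leave the structured-output call with less to do, although whether that affects the intermittent failure is untested.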
@traceable(run_type="llm", name="openai_chat_completion_structured")
def openai_chat_completion_structured(
    response_instructions: str,
    user_input: str,
    PydanticModel: Any,
    temperature: float = 0.0,
    model="gpt-4o-mini",
    frequency_penalty=0.0,
    presence_penalty=0.0,
    timeout=180,
    **kwargs,
) -> Union[Any, None]:
    """
    Function to generate a structured response using GPT-4o-mini and validate it using a Pydantic model.
    Args:
    - response_instructions: The system-level instructions for generating the response.
    - user_input: The user-specific input with all necessary information formatted.
    - PydanticModel: The Pydantic model to be used for response validation.
    - temperature: The temperature for the completion model.
    Returns:
    - The generated and validated message response as a Pydantic model instance.
    - None: If an error occurs or retries are exhausted.
    """
    try:
        # Log detailed information about the chat completion request
        log_chat_completion_details(
            system_prompt=response_instructions,
            user_input=user_input,
            # `model` is a named parameter, so it can never appear in kwargs
            model=model,
            temperature=temperature,
            additional_params={
                "response_model": PydanticModel.__name__,
                **kwargs,
            },
        )
        logger.info("Initiating chat completion request")
        # Input validation
        if not isinstance(response_instructions, str) or not response_instructions:
            logger.error("Invalid or missing response instructions.")
            return None
        if not isinstance(user_input, str) or not user_input:
            logger.error("Invalid or missing user_input.")
            return None
        if not isinstance(PydanticModel, type) or not issubclass(
            PydanticModel, BaseModel
        ):
            logger.error(
                "Invalid Pydantic model provided. It must be a subclass of BaseModel."
            )
            return None
        # Count tokens (but don't enforce limits)
        instruction_tokens, input_tokens = count_token_usage(
            response_instructions, user_input, model
        )
        max_retries = 3
        attempt = 0
        start_time = time.time()
        while attempt < max_retries:
            try:
                # Check for timeout
                if time.time() - start_time > timeout:
                    logger.error("Request timed out")
                    return {"error": "timeout", "message": "Request timed out"}
                # Making the API call to OpenAI
                response = openai_client.beta.chat.completions.parse(
                    model=model,
                    messages=[
                        {
                            "role": "system",
                            "content": f"{response_instructions}\nProvide your response in JSON format.",
                        },
                        {"role": "user", "content": user_input},
                    ],
                    temperature=temperature,
                    response_format=PydanticModel,
                )
                # Log the token usage
                if hasattr(response, "usage"):
                    logger.info(
                        f"Token usage - Instructions: {response.usage.prompt_tokens}, "
                        f"Response: {response.usage.completion_tokens}, "
                        f"Total: {response.usage.total_tokens}"
                    )
                # Process the response...
                response_content = response.choices[0].message.content
                # Log the raw response before parsing
                try:
                    # Convert response_content to dict if it's a string
                    if isinstance(response_content, str):
                        import json
                        response_dict = json.loads(response_content)
                    else:
                        response_dict = response_content
                    # Create a multi-line string representation
                    flat_response = "\n".join([f"{k}: {v}" for k, v in response_dict.items()])
                    console.print(
                        Panel(
                            flat_response,
                            title="[green]Raw Response[/green]",
                            border_style="green",
                            padding=(1, 2),
                        )
                    )
                except Exception as e:
                    logger.warning(f"Failed to format response for logging: {e}")
                return PydanticModel.parse_raw(response_content)
            except RateLimitError as e:
                logger.warning(f"Rate limit exceeded: {str(e)}")
                # Implement exponential backoff
                wait_time = (2**attempt) + random.uniform(0, 1)
                time.sleep(wait_time)
                attempt += 1
                continue
            except APITimeoutError as e:
                logger.error(f"Request timed out: {str(e)}")
                return {
                    "error": "timeout",
                    "message": "The request timed out. Please try again.",
                }
            except APIConnectionError as e:
                logger.error(f"Connection error: {str(e)}")
                return {
                    "error": "connection",
                    "message": "Failed to connect to the API. Please check your network connection.",
                }
            except AuthenticationError as e:
                logger.error(f"Authentication error: {str(e)}")
                return {
                    "error": "auth",
                    "message": "Authentication failed. Please check your API key.",
                }
            except BadRequestError as e:
                logger.error(f"Bad request error: {str(e)}")
                return {
                    "error": "bad_request",
                    "message": "The request was malformed. Please check your inputs.",
                }
            except PermissionDeniedError as e:
                logger.error(f"Permission denied: {str(e)}")
                return {
                    "error": "permission",
                    "message": "You don't have permission to access this resource.",
                }
            except InternalServerError as e:
                logger.error(f"OpenAI server error: {str(e)}")
                attempt += 1
                if attempt < max_retries:
                    time.sleep(2**attempt)  # Exponential backoff
                    continue
                return {
                    "error": "server",
                    "message": "OpenAI servers are experiencing issues. Please try again later.",
                }
            except APIError as e:
                logger.error(f"API error: {str(e)}")
                # NOTE: this branch only runs if the length error arrives as an
                # APIError; in the traceback above it escaped to the outer handler.
                if "length limit was reached" in str(e).lower():
                    return {
                        "error": "token_limit_exceeded",
                        "message": "The input is too large to process. Please break down your request into smaller parts.",
                    }
                attempt += 1
                if attempt >= max_retries:
                    return {"error": "api", "message": str(e)}
        return {
            "error": "max_retries",
            "message": "Maximum retry attempts reached. Please try again later.",
        }
    except Exception as e:
        logger.error(f"Error in chat completion: {str(e)}", exc_info=True)
        raise
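In the traceback above, the error leaves the function through the final `except Exception` block and is re-raised, which suggests it is not matched by `except APIError`. One possible explanation, stated here as an assumption about the installed SDK version, is that `LengthFinishReasonError` derives from `OpenAIError` rather than from `APIError`. Below is a minimal sketch of treating it as its own case with a bounded retry. To stay runnable without the SDK it uses a stand-in exception class; `call_with_length_retry` and `FakeLengthError` are hypothetical names.

```python
from typing import Callable, Optional, Type, TypeVar

T = TypeVar("T")


class FakeLengthError(Exception):
    """Stand-in for openai.LengthFinishReasonError (assumption for this demo)."""


def call_with_length_retry(
    call: Callable[[], T],
    length_error: Type[Exception],
    max_attempts: int = 3,
) -> Optional[T]:
    """Run `call`; retry only when it raises `length_error`."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except length_error as exc:
            # The output ran into the token cap; a fresh attempt may not.
            print(f"attempt {attempt}: output hit length limit: {exc}")
    return None  # every attempt was truncated


# Demo: the call is truncated twice, then succeeds.
attempts = {"n": 0}


def flaky_call() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise FakeLengthError("length limit was reached")
    return "ok"


result = call_with_length_retry(flaky_call, FakeLengthError)
```

In the real function, `call` would wrap `openai_client.beta.chat.completions.parse(...)` and `length_error` would be `openai.LengthFinishReasonError`. Passing a `max_tokens` value well below 16384 on the request may also make a runaway generation fail quickly instead of after a long wait, though the right value depends on the largest payload you expect.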
from pydantic import BaseModel, Field
from typing import List, Optional

"""WhatsApp Front-End Structure"""

#################################################################################
# Helper Models - Text, Image and Video models
class WhatsAppImage(BaseModel):
    image_url: Optional[str] = Field(None, description="URL of the image.")
    description: Optional[str] = Field(
        ..., description="Caption/description of the image."
    )


class WhatsAppVideo(BaseModel):
    video_url: Optional[str] = Field(None, description="URL of the video.")
    description: Optional[str] = Field(
        ..., description="Caption/description of the video."
    )


class WhatsAppMessage(BaseModel):
    type: str = Field(
        ..., description="The type of the message, either 'text', 'image' or 'video'."
    )
    text: Optional[str] = Field(
        ...,
        description="The textual content of the message, applicable if the type is 'text'.",
    )
    image: Optional[WhatsAppImage] = Field(
        ...,
        description="The image object containing image URL and description, applicable if the type is 'image'.",
    )
    video: Optional[WhatsAppVideo] = Field(
        ...,
        description="The video object containing video URL and description, applicable if the type is 'video'.",
    )


#################################################################################
# Main reference model for the WhatsApp payload
class WhatsAppPayload(BaseModel):
    messages: List[WhatsAppMessage] = Field(
        ..., description="The structure containing the list of WhatsApp messages."
    )
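The field descriptions say that `text`, `image`, and `video` each apply only to the matching `type`, but the model itself does not enforce that, since `type` is a free-form string and all three fields are optional. Below is a minimal sketch of a post-parse consistency check on the payload as a plain dict (for example the result of dumping the parsed model). `check_payload` and `ALLOWED_TYPES` are hypothetical names, not part of the original code.

```python
from typing import Any, Dict, List

ALLOWED_TYPES = ("text", "image", "video")


def check_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a WhatsAppPayload-shaped dict."""
    problems = []
    for i, msg in enumerate(payload.get("messages", [])):
        kind = msg.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(f"message {i}: unknown type {kind!r}")
            continue
        # The field named after the type must be populated.
        if not msg.get(kind):
            problems.append(f"message {i}: type {kind!r} but no {kind} content")
        # Image and video messages need the URL in their own key.
        elif kind in ("image", "video") and not msg[kind].get(f"{kind}_url"):
            problems.append(f"message {i}: missing {kind}_url")
    return problems
```

An empty list means every message carries the content its `type` promises. This does not address the length error itself, but it catches payloads that parse successfully and would still be rejected or rendered wrongly downstream.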
And here is the call after I told the AI to retry the query over the WhatsApp chat. This time it worked:
response_instructions: |
# Goal
Generate an output based on the Pydantic Model to create WhatsApp payloads (including text, images and videos) for the WhatsApp API.
# Rules
- Strictly use the content provided in the input without adding or removing any information.
- Maintain the order of the messages exactly as provided.
- Preserve any emojis/icons that are part of the input.
- Keep the message structure as given (e.g., if there is an introduction and an outro, both should be included in the output).
- Group relevant information in one message where possible. For example, if asking for property requirements to create a filter, put all relevant information in the same message.
- If an image URL and a caption/description are provided, treat it as an "image" type and us the description as the description/caption of the image. **Do not include it as type text**.
- If a video URL and a caption/description are provided, treat it as an "video" type and us the description as the description/caption of the video. **Do not include it as type text**.
**Output Rules for Image:**
- Strictly ensure that if a image_url is provided that you send that content as a message with the type "image"!!!
- Overtake the image_url of the input for the output "image_url"
- Take the description exactly as provided from the input fields, such as "property_description" and use it as the "image_description."
- Strictly ensure that if a image_url is provided that you send that content as a message with the type "image"!!!
**Output Rules for Video:**
- Strictly ensure that if a video_url is provided that you send that content as a message with the type "video"!!!
# Key "type"
- Use the type "video" for the key "type"
# Key "video_url"
- Overtake the "video_url" of the input for the output "video_url"
Never put the URL in the description! Always put it as value of the key "video_url"
# key "description"
- Take the "community_information" and "community_response_message" in the description of the video.
Put a paragraph between (two lines) the content of the "community_information" and the "community_response_message". The seperation should be clear and easy to understand for better readability.
# WhatsApp-specific Rules
- WhatsApp formats bold text with *<word>*. Therefore, if the input has **<word>**, convert it to *<word>* so that it appears correctly as bold in WhatsApp. Avoid leaving both ** and * around the text.
- Do not use # for headlines. Instead, transform them into bold text.
- Do not use a "-" for listing items if an emoji/icon is also provided.
- If "-" is used for listing, replace it with "•".
Example for thw WhatsApp-specific rules:
Input: "### 5. **The Springs**
- 🌊 *Lakes and Parks*: Beautiful lakes and landscaped gardens.
- 🏊 *Amenities*: Community pools and sports facilities.
- 🏫 *Schools*: Close to schools and nurseries."
Output: "*5. The Springs*
🌊 *Lakes and Parks*: Beautiful lakes and landscaped gardens.
🏊 *Amenities*: Community pools and sports facilities.
🏫 *Schools*: Close to schools and nurseries."
# Instructions
- Use the Pydantic Model to structure the output.
- If a weblink/hyperlink is provided in the input also keep the weblink/hyperlink in the output. This is for example important to provide the link to book a meeting with the agent.
# Output
- If **"Fallback mode status" is true**, process the **"Fallback message"** instead of the main input.
user_input: |-
Thank you for your patience! Unfortunately, I couldn't find any properties that fully match your criteria for villas at the moment.
Could you please clarify which features are most important for you? Here are some aspects to consider:
• *Location*: Do you have a specific area in mind?
• *Amenities*: Are there any specific amenities you want, like a gym, pool, or park?
• *Budget*: Would you like to adjust your budget range?
Your input will help me refine the search and find the best options for you! 😊
PydanticModel: <class 'models.whatsapp_model.WhatsAppPayload'>
temperature: 0
model: gpt-4o-mini
frequency_penalty: 0
presence_penalty: 0
timeout: 180
Output:
output:
  messages:
    - type: text
      text: Thank you for your patience! Unfortunately, I couldn't find any properties that fully match your criteria for villas at the moment.
    - type: text
      text: "Could you please clarify which features are most important for you? Here are some aspects to consider:"
    - type: text
      text: |-
        • *Location*: Do you have a specific area in mind?
        • *Amenities*: Are there any specific amenities you want, like a gym, pool, or park?
        • *Budget*: Would you like to adjust your budget range?
        Your input will help me refine the search and find the best options for you! 😊