Exceeding Token Limits in Our Menu-Ordering Chat - Seeking Recommendations

Hello everyone,

I hope you’re all doing well. I wanted to share some updates on our restaurant menu-ordering chat company. We’ve been working hard to improve our service and user experience, but we’re facing a challenge we could use your insights on. Restaurant menus vary in size, and as a menu grows, so does the number of tokens required to successfully place an order.
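For a rough sense of how menu size translates into prompt size, this is how we estimate a menu’s token footprint before it ever reaches the model (the sample menu items and model name below are placeholders, not our real data):

```python
import tiktoken

def menu_token_count(menu_text: str, model: str = "gpt-4") -> int:
    """Count how many tokens the menu text alone adds to every prompt."""
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(menu_text))

# Illustrative only: a real menu with descriptions and modifiers runs far longer.
sample_menu = "\n".join([
    "Margherita Pizza - $12.50",
    "Caesar Salad - $9.00",
    "Spaghetti Carbonara - $14.00",
])
print(menu_token_count(sample_menu))  # every extra item pushes this number up
```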

One of the issues we’ve encountered is that our chat conversations frequently exceed the token limit. This causes interruptions in the chat experience and hurts the speed and efficiency of ordering from our menu. We regularly hit around 5,000 tokens per order, which, given GPT-4 pricing, adds up fast at scale. There is also a rate-limit problem: if, say, three orders are happening simultaneously, we essentially hit our cap and users can no longer place orders. We looked into LangChain and caching, but the response to a customer’s order is rarely the same phrase twice, so caching gives us very few hits.
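To make the problem concrete, here is a simplified sketch of how each order turn is assembled today. The function name and prompt wording are illustrative rather than our production code, but the shape is the same: the full menu rides along in the system message on every turn, and the conversation history compounds it, which is where most of the ~5,000 tokens go.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def order_turn(menu_text: str, history: list[dict], user_message: str) -> str:
    """One chat turn in the ordering flow. The entire menu is re-sent in the
    system prompt every time, so prompt size grows with menu size."""
    messages = [
        {
            "role": "system",
            "content": "You take restaurant orders from this menu:\n" + menu_text,
        },
        *history,  # prior turns also ride along, so long orders compound the cost
        {"role": "user", "content": user_message},
    ]
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content
```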

What We’re Looking For:

  1. Any insights or best practices for managing and optimizing token usage in chat systems.
  2. Recommendations for third-party tools or resources that could assist in addressing token limit issues.

Thank you in advance for your time and assistance. Looking forward to your suggestions and insights!

Best regards,

Anthony
