2 very important questions for a prod app

Hello,

I'm releasing an iOS app that uses GPT-4 Turbo with Vision, and here are my remaining issues before release.

Responses are still very long. I have tried everything: limiting tokens, which always results in a hard cutoff mid-sentence, and playing with the prompt using “be concise in your response” or “try to respond within 3 sentences or more” and other prompts, but all of these result in poor responses with a hard cutoff. Is there anything I can do to limit the AI's response length WITHOUT the hard cutoff?

My second question is: how do I protect my systems from bad actors? For example, someone who loops the AI to keep generating long responses and so makes us pay more for the API. More generally, what security measures should I take to protect my infrastructure and my responses/tokens from bad actors?

Thank you!

What some apps do is simply calculate what the cost of any user request is going to be, and then deduct it from the user’s credit. I allow users to purchase more API “credit” any time they want too, so there’s no way anyone can cost me any money because they have to pay into their account before consuming any API usage at all.
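A minimal sketch of that prepaid-credit idea, in Python: reserve the worst-case cost before the request, then refund whatever was not used once the real token count is known. The per-token prices, the `Account` class, and the function names are all hypothetical, made up for illustration; real prices depend on the model. Amounts are kept as integer micro-dollars to avoid floating-point rounding.

```python
from dataclasses import dataclass

# Hypothetical prices in micro-USD per token; check your model's real pricing.
INPUT_PRICE = 10
OUTPUT_PRICE = 30


class InsufficientCredit(Exception):
    """Raised when the account cannot cover the worst-case cost."""


def request_cost(prompt_tokens: int, output_tokens: int) -> int:
    """Cost of one request in micro-USD."""
    return prompt_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE


@dataclass
class Account:
    balance: int  # prepaid credit in micro-USD

    def reserve(self, prompt_tokens: int, max_output_tokens: int) -> int:
        """Deduct the worst-case cost up front; refuse if it is not covered."""
        worst_case = request_cost(prompt_tokens, max_output_tokens)
        if worst_case > self.balance:
            raise InsufficientCredit("top up before sending this request")
        self.balance -= worst_case
        return worst_case

    def settle(self, reserved: int, prompt_tokens: int, actual_output_tokens: int) -> None:
        """Refund the difference between the reservation and the actual cost."""
        actual = request_cost(prompt_tokens, actual_output_tokens)
        self.balance += max(reserved - actual, 0)
```

The server would call `reserve` before calling the API (with the same value it passes as `max_tokens`) and `settle` afterwards using the token counts reported in the response. Because the worst case is deducted first, a user who loops long requests only ever spends their own balance.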

I see, so users fund their account first and use that to buy API credits.

We charge a simple $9.99 a month per user for the app.

How are you dealing with the long responses generated by the model?

Have you tried utilizing models other than the OpenAI models?

I’ve never had any problems with responses getting cut off, but most of the time I’m asking questions that only take 3 to 6 paragraphs as the response, and I’m setting max_tokens to 2000ish I think.

Have you tried using the system prompt to say something like “Make your responses be 3 paragraphs or less”, for example? I noticed you said “3 sentences or more”, and the “or more” part may be confusing it.
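One way to combine both suggestions, sketched in Python under some assumptions: ask for a length limit in the system prompt, keep `max_tokens` as a generous safety cap, and if the cap is still hit, trim the text back to the last complete sentence instead of showing the cut. The Chat Completions API reports `finish_reason` as `"length"` when `max_tokens` stopped the output; the helper names and the prompt wording below are hypothetical, and the sentence detection is a simple heuristic rather than a full sentence splitter.

```python
import re

# Hypothetical system prompt; an upper bound only, with no "or more".
SYSTEM_PROMPT = "Make your responses 3 paragraphs or less."

# A sentence end: ., ! or ?, optional closing quote/bracket, then whitespace or end of text.
_SENTENCE_END = re.compile(r'[.!?]["\')\]]*(?=\s|$)')


def trim_to_last_sentence(text: str) -> str:
    """Drop any trailing partial sentence; return text unchanged if none is complete."""
    last_end = None
    for match in _SENTENCE_END.finditer(text):
        last_end = match.end()
    if last_end is None:
        return text.rstrip()
    return text[:last_end]


def finalize_reply(text: str, finish_reason: str) -> str:
    """Only trim when the model was stopped by the token limit."""
    if finish_reason == "length":
        return trim_to_last_sentence(text)
    return text.strip()
```

For example, `finalize_reply("The cat sat. The dog ra", "length")` returns `"The cat sat."`, while a reply that finished normally with `finish_reason == "stop"` is passed through as-is apart from surrounding whitespace.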