Will I be charged for failed ChatCompletion requests?

I wrote a script to submit a prompt for a set of scientific abstracts. In this script, I handle RateLimitErrors and APIErrors with a try/except statement, resubmitting the request and printing the number of times the same prompt has been tried.

Normally, it only needs to try once or twice when there’s an error on OpenAI’s end, so this method has been working fine.

However, I submitted a job to run overnight, and in the middle of the night I hit my monthly usage limit. Since I didn’t see the email when it happened, the same prompt request was resubmitted over 40,000 times ( :face_with_spiral_eyes: I know, I’m dying a little inside that I didn’t foresee this being a problem) after I hit the limit.

I requested a usage limit increase, but I’m concerned that I’ll be charged for all of these failed requests, and will immediately hit my new cap.
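In hindsight, the fix on my end would have been to cap the retries instead of resubmitting forever. Here’s a minimal, generic sketch of what I mean (the helper name and limits are made up, and `fn` stands in for the actual ChatCompletion call):

```python
import time

def call_with_retries(fn, max_retries=5, base_delay=1.0,
                      retryable=(Exception,)):
    """Call fn(), retrying transient failures with exponential backoff.

    Gives up after max_retries attempts instead of looping forever,
    which is what allowed my 40,000 resubmissions.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except retryable as exc:
            print(f"attempt {attempt} failed: {exc}")
            if attempt == max_retries:
                raise  # stop retrying; surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))
```

In real use I’d pass only the transient error classes (e.g. rate-limit errors) as `retryable`, so a quota/billing error fails immediately instead of being retried.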

Does anyone know if I’ll be charged for those failed requests? I haven’t been able to find a good answer.

Thanks!


Yes, you will be charged for failed fetches. I have been contacting customer support about this for months without a response. Currently I am receiving failed fetches across all my computers via GPT-4, due to Cloudflare aggressively blocking API access, and I am still charged automatically once a month regardless of what my account’s usage page says was used.

Thank you so much for the update. I also tried contacting customer service about an unrelated issue two months ago and still haven’t heard back, even though their stated response time is 7 days… I’m sorry you’re stuck with those charges!! Let me know if you ever hear back from them with a fix.

I still haven’t heard back, and that is via both email support and their chat support, which their email support auto-redirects to. What I have done is start phasing out my use of OpenAI and gradually rely on more open-source alternatives (though nothing yet comes close to GPT-4 in the applications I need it for, i.e. taking large texts and extracting assessments and analytical data from them; even their web version is insufficient for this task).

In the same light, I’d be interested to know if they ever get back to you about this issue. I honestly think that reducing our reliance on an organization which fails to provide adequate customer service to paying customers is the best bet; the alternative is accepting that you could be charged excessive amounts without recourse.

I was going to ask – what open-source tools have you been using? I’ve been having the same thoughts – the fact that my other RateLimitErrors (the ever-mysterious “the model is busy with other requests right now” principal among them) didn’t go down after I started paying was extremely frustrating, on top of the complete lack of customer service. I’d love to know what open-source alternatives you’ve been trying!

I’ve tried almost all of the popular models under 30b (mostly in the 6b–13b range), and am preparing to test another model today or tomorrow called PrivateGPT. I hear 30b is night and day compared to the 13b models. Below is just a small percentage of the ones I tried and took notes on.

Web UI/Interface:

  • Oobabooga (which was difficult to set up) + SillyTavern.ai for my customized “pseudo fine-tuned” AI assistants. KoboldAI and llama.cpp are OK but run on CPU and are slow; getting Oobabooga to work and run on GPU is error after error because of the bitsandbytes library and different models having different formats and run requirements, but it’s worth it once set up.
  • Prior to Oobabooga I had to manually set up and run every new model in CMD - never doing that again!

Current Fave models as of today (Uncensored)

  • thebloke/WizardLM-Vicuna-13B-Uncensored-GPTQ-4bit-128g - GPT 3.5-like quality, but the token size is limited (2k): I can’t give it a page and have it analyze and summarize it, though it analyzes paragraphs well. It also has trouble with follow-ups, and I usually have to ask my question at the beginning or end, or ask it again, for it to understand. Not sure if that is my web UI or the model, though.
  • ?/WizardLM-13B-Uncensored-4bit-128g - I don’t really notice much of a difference from the WizardLM-Vicuna model; I use the WizardLM-Vicuna model daily.
  • mayaeary/Pygmalian-6b-4bit-128g - not great at all for instructions. There is a 7b version, but I haven’t gotten the temperature right with it, and for chatting I find this one is still the best, esp. for brainstorming dialogue ideas.
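For the 2k-token limit mentioned above, my workaround is to feed the model paragraph-sized pieces instead of whole pages. A rough sketch of the kind of splitter I mean (the function name and character budget are arbitrary; real token counting would be more precise):

```python
def chunk_paragraphs(text, max_chars=2000):
    """Greedily pack blank-line-separated paragraphs into chunks
    of at most max_chars characters each.

    A single paragraph longer than max_chars is kept whole rather
    than split mid-paragraph.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be summarized separately and the partial summaries combined in a final pass.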

Excited for Models

  • mosaicm/MPT-7b-Instruct + storywriter ((MPT)) - semi-censored, 65k tokens - it is up and running on their Hugging Face page so you can test it yourself. => Waiting for Oobabooga to support it without having to hack it to run on my computer. Really excited for this one, esp. if they are ever able to merge it with WizardLM-Vicuna and/or Dolly.
  • RedPajama - is working on an open-source reproduction of the LLaMA models, and that will be a game changer.

Models too big to run on my computer but are decent:

  • HuggingChat - uses 30b models from Open Assistant and a tech version AI. The Open Assistant model really had trouble understanding my texts, but it was good/decent at generating ideas and brainstorming.

Models I like but rarely use (Uncensored):

  • databricks/dolly-v2-7b ((GPTNEO))
  • anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g (LLAMA)
  • gozfarb/pygmalian-7b-4bit-128g ((LLAMA)) - REDPAJAMA (“Future”)
  • Thebloke/WizardLM-7B-uncensored-GPTQ-4bit-128g ((LLAMA))

Models I have mixed feelings about:

  • GPT4ALL
  • (and a few others that aren’t mentioned)

Liked but deleted to make room (Censored):

  • TheBloke/Stable Vicuna 13B - ((LLAMA))
  • anon8231489123/Vicuna13B ((LLAMA))
  • ZPN/Llama 7b ((LLAMA))
  • Thebloke/WizardLM-7B-GPTQ

Testing

  • imartinez/privateGPT (based on GPT4All) (just learned about it a day or two ago)
  • Thebloke/wizard mega 13b GPTQ (just learned about it today; released yesterday)

Curious about

  • OpenAI has also announced they are releasing an open-source model that won’t be as good as GPT-4 but might be somewhere around GPT 3.5 – my guess is it will be considerably dumbed down: better than GPT-2, but at or below GPT 3.5 quality. Every week or month a new open-source technology comes out with some new breakthrough, and I really hope that pattern continues.