First, I encountered lag, and then an API error occurred

And then, in five hours, the generation rate of models has reversed…nullifying a recommendation.

1024 max tokens before

Model Trials Avg Latency (s) Avg Rate (tokens/s)
gpt-4o-mini 10 0.891 43.081
gpt-4.1-mini 10 0.683 65.232

1024 max tokens now

Model Trials Avg Latency (s) Avg Rate (tokens/s)
gpt-4o-mini 10 0.951 51.417
gpt-4.1-mini 10 1.084 37.569

768 max tokens now

Model Trials Avg Latency (s) Avg Rate (tokens/s)
gpt-4o-mini 10 0.759 53.049
gpt-4.1-mini 10 0.751 39.203

If curious, this blast of calls gets any cache disrupted with an initial inserted random system message that is “{session_id} is chat session ID.”, differing not just in content but in token length. The request is not large enough to receive a cached discount. However, my prior statistical distributions found that even when not discounted, there was difference correlated to cacheability.

top_p: 0.001 reflects the desire of any developer to control sampling, where departure from default in temperature or top_p also affects performance.

011
870694
192728562
318594459065
294490027466378
292691559810764014
069141475949951893187
306070117383162570272003
831484445272805486878781564
982814277251838824698011873434
954829967379650690909346590862592
458714204821044389627179228435846126
366273607463525488234825564357321280023
298319909803501411099723503965263403745793
640397000481629968082786311860899616432119908
925792740705724365379669505448589547656777982421
516486298580211071938307766531419472360332080576986
052517981526082022117191542895851280998512358126848251
070347151877136915331394882160523063605568533412750825181
117053800018121052190908638028822232695888423787844214209839
716
555525
207421707
937159959367
646461467424236
421723921309670050
719551001757333572050
332575342619161089739050
506833403456697150009272194
827810134041012003821518653439
428385452719398099113605210539633
914397146599788341390831456558868248
528052564130914569097753680305609806241
042107896578133647294886224949703719971186
341474072152270760139004577249271343872207661
658780600599666828391154711464019280033047961091
663061316446079529370032252339957196898969340713810
325533249746091983735767660503162821555290219584335185
652093775667321088542384668336368138646710677149314991137
247616427411582439843694436142634272198624744919597770713473
1 Like