Roughly how many concurrent requests could a single DV instance (say, the 8K max-context version) handle while keeping response times similar to ChatGPT Plus? Assume a typical request uses about 4K tokens of context.
OpenAI Foundry for context: https://twitter.com/transitive_bs/status/1628118163874516992