The system we’re designing makes many small text-davinci calls, and its overall speed is limited by their response time. Would it be faster to batch these into a single query, or would the response time still be high?
As far as I know, batching multiple small text-based queries into a single query helps minimize per-request overhead and allows for parallel processing. However, when you choose the batch size, consider the trade-off between reduced overhead and increased latency: a larger batch means fewer round trips, but each batch typically takes longer to come back.
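To make the batch-size trade-off concrete, here is a minimal sketch of grouping prompts into fixed-size batches. It is not tied to any particular client library: `send_batch` is a hypothetical callable standing in for whatever function sends a list of prompts in one request and returns one completion per prompt, in order. `fake_send_batch` is only a stand-in so the sketch runs without network access.

```python
from typing import Callable, List, Sequence


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def run_in_batches(
    prompts: Sequence[str],
    send_batch: Callable[[List[str]], List[str]],
    batch_size: int,
) -> List[str]:
    """Send prompts in groups of `batch_size` and return results in input order.

    `send_batch` is an assumption: it should issue ONE request for the whole
    list and return one output per prompt, in the same order.
    """
    results: List[str] = []
    for batch in chunked(prompts, batch_size):
        outputs = send_batch(batch)
        if len(outputs) != len(batch):
            raise RuntimeError("backend returned a different number of outputs")
        results.extend(outputs)
    return results


def fake_send_batch(batch: List[str]) -> List[str]:
    """Stand-in backend for illustration only: upper-cases each prompt."""
    return [prompt.upper() for prompt in batch]


if __name__ == "__main__":
    prompts = ["first prompt", "second prompt", "third prompt"]
    # Two requests instead of three: one with 2 prompts, one with 1.
    print(run_in_batches(prompts, fake_send_batch, batch_size=2))
```

With a real backend, I would time a few values of `batch_size` on your own workload rather than assume a single large batch is best, since the right size depends on prompt length and on how quickly you need the first result.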