I was looking at copy.ai's automated content creation platform and noticed that it is pretty fast and returns up to 7 examples at the same time.
How do you think they achieve it?
Do they call the APIs 7 times for one output?
Does this basically mean that the number of tokens used if I call the API 3 times (with n = 1) will be equal to the number of tokens used if I call the API once (with n = 3)?
Yes, but you will use more network resources that way, since every call has its own overhead. Letting the API generate multiple versions in a single call will probably also be faster, even if you parallelise the separate calls.
Multiple calls cost more tokens. Don’t forget that in the first case you have 3 * (in_tokens + out_tokens), while in the second case you have in_tokens + 3 * out_tokens.
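To make that arithmetic concrete, here is a small sketch in Python. The helper `total_tokens` and the sizes (a 200-token prompt, 50-token completions) are made up for illustration, and it assumes the prompt is counted once per request and every completion has the same length, which real outputs won't exactly match.

```python
def total_tokens(in_tokens: int, out_tokens: int, n: int, calls: int) -> int:
    """Tokens used across `calls` requests, each asking for `n` completions.

    Assumes the prompt (in_tokens) is counted once per request and every
    completion is out_tokens long. Both are simplifying assumptions.
    """
    return calls * (in_tokens + n * out_tokens)


# Hypothetical sizes, chosen only for illustration.
in_tokens = 200
out_tokens = 50

# Case 1: three separate calls with n = 1 -> 3 * (in_tokens + out_tokens)
separate = total_tokens(in_tokens, out_tokens, n=1, calls=3)

# Case 2: one call with n = 3 -> in_tokens + 3 * out_tokens
batched = total_tokens(in_tokens, out_tokens, n=3, calls=1)

print(separate, batched, separate - batched)
```

With these numbers the separate calls use 750 tokens and the single call uses 350. The difference is 2 * in_tokens, so the longer your prompt, the more you save by using n.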