Currently, Assistant API token usage is approximately (input + output) * instruction, which results in extremely high cost.
It should be input + output + instruction.
Does it keep re-sending the instructions as input while the output tokens are streaming?
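
To make the size of the gap concrete, here is a rough sketch comparing the two formulas above. The token counts are made-up numbers for illustration only, and the function names are hypothetical; this does not reflect how the API actually bills, only the arithmetic described in this post.

```python
def observed_usage(input_tokens: int, output_tokens: int, instruction_tokens: int) -> int:
    """Usage as it appears to be counted, per the post: (input + output) * instruction."""
    return (input_tokens + output_tokens) * instruction_tokens


def expected_usage(input_tokens: int, output_tokens: int, instruction_tokens: int) -> int:
    """Usage as it should be counted, per the post: input + output + instruction."""
    return input_tokens + output_tokens + instruction_tokens


if __name__ == "__main__":
    # Hypothetical token counts, chosen only to illustrate the difference.
    inp, out, instr = 200, 300, 50

    observed = observed_usage(inp, out, instr)
    expected = expected_usage(inp, out, instr)

    print(f"observed: {observed}")
    print(f"expected: {expected}")
    print(f"ratio:    {observed / expected:.1f}x")
```

Even with short instructions, the multiplicative form grows far faster than the additive one, which would explain the cost.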