There’s a very nascent idea that I have been toying with over the past few days. What if we could get the model to return just the boundaries of each semantic chunk, i.e. the first few and last few words that make the chunk uniquely identifiable?
With that information, you could then likely apply a regular script to extract the actual text of the chunks. If that were possible, a single API call, or at least a reduced number of calls, might be enough, which would save time and cost.
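To make the extraction step concrete, here is a rough sketch in Python of what such a script might look like. It is only an illustration under some assumptions: the function name `extract_chunks` and the `(start_words, end_words)` pair format are hypothetical, the model is assumed to return the boundary words verbatim and in document order, and the start and end markers of a chunk are assumed not to overlap.

```python
from typing import List, Tuple


def extract_chunks(text: str, boundaries: List[Tuple[str, str]]) -> List[str]:
    """Recover chunk text from (start_words, end_words) pairs.

    Hypothetical helper: assumes the pairs are listed in document order
    and that each marker appears verbatim in `text`.
    """
    chunks = []
    cursor = 0  # only search forward, so a repeated phrase resolves in order
    for start, end in boundaries:
        start_idx = text.find(start, cursor)
        if start_idx == -1:
            raise ValueError(f"start marker not found: {start!r}")
        # Look for the end marker only after the start marker.
        end_idx = text.find(end, start_idx + len(start))
        if end_idx == -1:
            raise ValueError(f"end marker not found: {end!r}")
        stop = end_idx + len(end)
        chunks.append(text[start_idx:stop])
        cursor = stop
    return chunks


text = (
    "Cats sleep most of the day. They hunt at night. "
    "Dogs need daily walks. They enjoy company."
)
# Pretend these pairs came back from the model.
boundaries = [
    ("Cats sleep", "at night."),
    ("Dogs need", "enjoy company."),
]
chunks = extract_chunks(text, boundaries)
```

Raising an error when a marker is missing is a deliberate choice here: if the model paraphrases the boundary words instead of quoting them, you would want to notice and retry rather than silently get a wrong chunk.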