ASR Output Punctuation and Summarization

Hello, I am working on a project and would like to clarify whether I can go ahead with this idea under OpenAI’s usage guidelines.

I am developing an application to transcribe and summarize conversations such as business meetings, university lectures, and podcasts. After the ASR step, the raw transcript needs to be punctuated and cleaned up. Depending on the use case, the user can then choose to generate a meeting summary/report, meeting minutes, or study notes.

However, as per the guidelines, “Summarizers that end-users can submit any content they wish to are generally not approved. We sometimes approve summarizers with a maximum of a paragraph input (150 tokens)…”

First, are there any issues with using GPT-3 to punctuate (or otherwise clean up) general ASR output, as long as it passes the content filter, so that the transcript reads correctly?

Second, given the guidelines quoted above, I assume a summarizer for this purpose would not be accepted as is. However, would it be acceptable if, after the content filter, an additional layer determined the specific use case, that is, a classifier indicating whether the transcription is indeed a university lecture or business meeting? And would this be allowed for longer conversations, since they would exceed 150 tokens?
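
To make the proposed flow concrete, here is a rough sketch of the gating I have in mind. It is only an illustration: `passes_content_filter`, `classify_use_case`, and `summarize` are hypothetical placeholders for whatever filter, classifier, and GPT-3 call would actually be used, and the use-case labels are my own assumptions rather than anything taken from the guidelines.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed labels for the use cases the app would support (hypothetical).
ALLOWED_USE_CASES = {"business_meeting", "university_lecture", "podcast"}


@dataclass
class PipelineResult:
    accepted: bool
    reason: str
    use_case: Optional[str] = None
    summary: Optional[str] = None


def run_pipeline(
    transcript: str,
    passes_content_filter: Callable[[str], bool],  # placeholder for the content filter
    classify_use_case: Callable[[str], str],       # placeholder for the use-case classifier
    summarize: Callable[[str, str], str],          # placeholder for the summarization call
) -> PipelineResult:
    # Step 1: reject anything the content filter flags.
    if not passes_content_filter(transcript):
        return PipelineResult(False, "rejected by content filter")

    # Step 2: only continue if the transcript matches a supported use case.
    use_case = classify_use_case(transcript)
    if use_case not in ALLOWED_USE_CASES:
        return PipelineResult(False, "unsupported use case", use_case=use_case)

    # Step 3: summarize only after both gates have passed.
    summary = summarize(transcript, use_case)
    return PipelineResult(True, "ok", use_case=use_case, summary=summary)
```

The point of the design is that the summarizer is never reached unless the transcript has passed both the content filter and the use-case check.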

I would appreciate any pointers or indications in the right direction. Many thanks in advance!
