Do the Usage Guidelines allow me to write articles from podcasts?

Hi all,

I’m reaching out to ask for help clarifying whether the project I’m working on falls within OpenAI’s usage guidelines before I invest more energy into it. I’ve read through the guidelines but still have some questions.

Here’s a brief description of what I’m working on: an application where users input the audio of a podcast that they have created and receive back a summary of the discussion, styled like an article. The tool would help podcasters produce written content from audio they’ve already created that they can easily share with audiences that like to read instead of listen.

I’ve built a fine-tuned model for this. Here’s a quick overview of the implementation: use another API to transcribe the audio, batch the transcript into chunks of ~1500 tokens, use a fine-tuned Davinci model to rewrite each chunk as article-styled text (returning up to 500 tokens), then combine the outputs into the article summary.
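For the curious, the pipeline above can be sketched roughly like this. This is a minimal illustration, not the actual implementation: the token estimate (~0.75 words per token) is a rule of thumb, and `generate` stands in for whatever wrapper calls the fine-tuned Davinci model with `max_tokens=500`.

```python
def chunk_transcript(text: str, max_tokens: int = 1500) -> list[str]:
    """Split a transcript into chunks of roughly max_tokens tokens.

    Uses the rough heuristic of ~0.75 words per token rather than a
    real tokenizer, purely for illustration.
    """
    max_words = int(max_tokens * 0.75)
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def podcast_to_article(transcript: str, generate) -> str:
    """Run each chunk through the model and join the results.

    `generate` is a placeholder for the function that calls the
    fine-tuned Davinci model (e.g. a completions-endpoint wrapper
    returning up to 500 tokens per chunk).
    """
    sections = [generate(chunk) for chunk in chunk_transcript(transcript)]
    return "\n\n".join(sections)
```

The chunking step is the part that matters for the guidelines question: the user never types a prompt, so the only "input" is their own transcribed audio.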

There are some sections of the usage guidelines that I have questions about:

In “Article writing / editing”:

  • “Please limit max user input tokens to 50. Including more tokens in the prompt is okay (for instance, examples), but the tokens user input should only be 50 (~200 characters), or slightly more at maximum.”
  • “We generally do not permit tools that generate a paragraph or more of text, unless the output is of a very specific structure that couldn’t be re-purposed for general blog or article generation (e.g., a cover letter, a recipe, song lyrics).”

In “Blog intro paragraphs”:

  • “Make sure that your tool does not generalize to writing a paragraph on any topic - only an intro-like paragraph - as we do not permit open-ended generations of this length.”

These sections suggest I may not be allowed to use GPT-3 for the kind of podcast-to-article writing I want to do. However, my understanding of the spirit of those rules is to prevent replication of Playground and “the misuse of tools to generate mis- and disinformation at scale”. User text input would be small (i.e., the URL where their audio is hosted), but users would control what is said in the podcast they upload. In that sense, they can’t directly replicate Playground, but their words will influence model output. Additionally, it would be hard to abuse this tool to create mis/disinformation at scale, because a user would need to spend a lot of time recording a brand-new podcast for each article they wanted to generate. It wouldn’t scale well for someone trying to cheaply spin up a lot of content fast.

Given that, would this type of project be acceptable if I brought it to pre-launch review? I would be building in the recommended safety filters like rate limits, filters for unsafe content, and a human-in-the-loop to review output. If not, are there any modifications I could make to keep the general idea of the project while making it compliant? Thank you in advance for any help!


Hi @smgplank,

From what you’ve shared, it sounds like a great tool.

I can’t comment or speculate on how the OpenAI team will respond to your application, but I wish you good luck.

PS: Here’s a blog from OpenAI


Great project @smgplank! I’ve built something somewhat similar, but with text-based content. The main difference is that I focus on summarising shorter content, as opposed to generating articles from longer content. That said, your use case may also fit under general summarisation.

My feedback from the OpenAI review team was:

  • no news articles
  • submit a user ID with every API call (for summarising content)
  • implement a content filter to exclude “unsafe” content
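In practice, the second and third points can look something like the sketch below. The fine-tune name is a placeholder, and the filter labels follow OpenAI’s legacy content-filter convention (“0” safe, “1” sensitive, “2” unsafe); treat this as an assumption-laden illustration, not the reviewer’s actual setup.

```python
def build_request(prompt: str, user_id: str) -> dict:
    """Assemble a completions payload with the `user` field attached.

    The `user` field lets OpenAI trace abuse back to an end user;
    the model name below is a made-up placeholder for a fine-tune.
    """
    return {
        "model": "davinci:ft-your-org",  # placeholder fine-tune name
        "prompt": prompt,
        "max_tokens": 500,
        "user": user_id,  # stable, anonymised per-end-user identifier
    }


def passes_content_filter(label: str) -> bool:
    """Keep only output the legacy content filter labels safe ("0").

    "2" meant unsafe and should be dropped; "1" (sensitive) is a
    judgment call depending on your use case.
    """
    return label == "0"
```

Sending the payload and retrieving the filter label is left to whatever client library you use; the point is simply that every request carries a user ID and every completion is gated before it reaches the end user.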

I also have a manual review process for my OpenAI-created content. I got approved right away (before access was openly available), along with the volume limits I asked for.

My guess is that your project, as described, does not violate any policies and will be approved.

Hope this helps.


@sps @m-a.schenk @georg thank you all for the encouragement and context!


someone should make an application reviewer using gpt-3 lol


This is a really interesting project you’re working on @smgplank. I believe the best way to make sure it complies with the guidelines is to share the process with the OpenAI support team and ask for guidance, as parts of your project seem to go through some grey areas of their guidelines.

On another subject, a quick question: I’ve been working on an AI algorithm to transcribe audio to text myself, so which service are you using for transcription in your project?


Hi @adriano.silva, apologies for the slow reply, I just saw this. I’m using AssemblyAI to transcribe audio to text.


Hey all, I want to report back that I did go through the pre-launch review process and the OpenAI team approved the use case. One important thing to mention is that in my application I discussed how I have a human-in-the-loop reviewing the output, amongst some other safety checks like the content filter and user authentication.


Nice! Good luck with your project.