Hi all,
I’m reaching out to ask for help clarifying whether the project I’m working on falls within OpenAI’s usage guidelines before I invest more energy into it. I’ve read through the guidelines but still have some questions.
Here’s a brief description of what I’m working on: an application where users submit the audio of a podcast they’ve created and receive back a summary of the discussion, styled like an article. The tool would help podcasters turn audio they’ve already created into written content they can easily share with audiences who prefer reading to listening.
I’ve fine-tuned a model for this. Here’s a quick overview of the implementation: transcribe the audio with a separate API, batch the transcript into chunks of ~1500 tokens, use a fine-tuned Davinci model to rewrite each chunk of the discussion in article style (returning up to 500 tokens per chunk), then combine the outputs to form the article summary.
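For concreteness, here’s a rough sketch of that pipeline. The model ID, prompt separator, and `transcribe()` stub are placeholders rather than my real code, and I’m assuming the `openai` Python library for completions plus `tiktoken` for token counting:

```python
# Rough sketch of the pipeline, using the legacy openai-python (<1.0)
# Completions API and tiktoken. Model ID, separator, and transcribe()
# are placeholders.
import os

import openai
import tiktoken

openai.api_key = os.environ["OPENAI_API_KEY"]

CHUNK_TOKENS = 1500  # transcript chunk size described above
FINE_TUNED_MODEL = "davinci:ft-my-org-2023-01-01"  # placeholder model ID

enc = tiktoken.get_encoding("r50k_base")  # tokenizer family used by davinci


def transcribe(audio_url: str) -> str:
    """Placeholder for the separate speech-to-text API call."""
    raise NotImplementedError


def chunk_transcript(text: str, size: int = CHUNK_TOKENS) -> list[str]:
    """Split the transcript into consecutive ~size-token chunks."""
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def article_from_chunk(chunk: str) -> str:
    """Rewrite one transcript chunk in article style (up to 500 tokens out)."""
    resp = openai.Completion.create(
        model=FINE_TUNED_MODEL,
        prompt=chunk + "\n\n###\n\n",  # separator matching the fine-tune format
        max_tokens=500,
        temperature=0.7,
    )
    return resp["choices"][0]["text"].strip()


def podcast_to_article(audio_url: str) -> str:
    transcript = transcribe(audio_url)
    return "\n\n".join(article_from_chunk(c) for c in chunk_transcript(transcript))
```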
There are some sections of the usage guidelines that I have questions about:
In “Article writing / editing”:
- "Please limit max user input tokens to 50. Including more tokens in the prompt is okay (for instance, examples), but the tokens user input should only be 50 (~200 characters), or slightly more at maximum.”
- “We generally do not permit tools that generate a paragraph or more of text, unless the output is of a very specific structure that couldn’t be re-purposed for general blog or article generation (e.g., a cover letter, a recipe, song lyrics).”
In “Blog intro paragraphs”:
- “Make sure that your tool does not generalize to writing a paragraph on any topic - only an intro-like paragraph - as we do not permit open-ended generations of this length.”
These sections suggest I may not be allowed to use GPT-3 for the type of podcast-to-article writing I have in mind. However, my understanding of the spirit of those rules is that they exist to prevent replication of Playground and “the misuse of tools to generate mis- and disinformation at scale”. User text input would be small (i.e., the URL where their audio is hosted), though users would still control what is said in the podcast they upload. In that sense, they can’t directly replicate Playground, but their words will influence model output. Additionally, it would be hard to abuse this tool to create mis/disinformation at scale, because users would need to spend significant time recording a brand-new podcast for every article they wanted to generate. That doesn’t scale well for anyone trying to cheaply spin up lots of content fast.
Given that, would this type of project be acceptable if I brought it to pre-launch review? I would build in the recommended safety measures, such as rate limits, filters for unsafe content, and a human-in-the-loop to review output. If not, are there any modifications I could make that would keep the general idea of the project while making it compliant? Thank you in advance for any help!