My take on the OpenAI Meeting Minutes tutorial

I’m posting this because it may be helpful to other beginners: a video walkthrough of the OpenAI meeting minutes tutorial, plus the code I implemented based on it.

One of the functions I added searches the transcript for ‘items of interest’ specified by the user and summarizes what was said about them. I’m still having trouble getting it to work reliably; more to come there.

The OpenAI tutorial is here: https://platform.openai.com/docs/tutorials/meeting-minutes

Video: https://youtu.be/IWr-WZOS2fM

Code: https://github.com/GEScott71/GPT_Meeting_Minutes

# This Python program uses OpenAI tools to create meeting minutes from an audio file, such as a company earnings call
# First, it uses Pydub (not from OpenAI) to segment the audio file into small enough chunks for OpenAI to process
# Next, it uses Whisper from OpenAI to transcribe the audio into a text file
# Then it uses the ChatGPT API to extract the following from the transcription:
#   - Summary
#   - Key Points
#   - Action Items
#   - Sentiment
# Also, I've added two additional functions beyond the tutorial scope:
#   - Participants
#   - Mentions of items of interest specified by the user
# Last, it combines them into a single text file
# Input: mp3 audio file
# Output: 2 text files:  transcription.txt and minutes.txt

from pydub import AudioSegment
import math
import os
import openai
from openai import OpenAI
import time

openai.api_key = open("key.txt", "r").read().strip('\n')
client = OpenAI(
    api_key=openai.api_key
)


def split_mp3(file_path, segment_size_mb=25):
    """
    Splits an MP3 file into multiple segments if its size is greater than the specified segment size.

    :param file_path: Path to the MP3 file.
    :param segment_size_mb: Maximum size of each segment in MB. Default is 25MB.
    :return: A list of paths to the generated segments.
    """
    # Check if the file exists
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"The file {file_path} does not exist.")

    # Calculate the file size in MB
    file_size_mb = os.path.getsize(file_path) / (1024 * 1024)

    # If the file size is smaller than the segment size, no splitting is needed
    if file_size_mb <= segment_size_mb:
        print(f"The file is smaller than {segment_size_mb}MB, no segmentation needed.")
        return [file_path]

    # Load the audio file
    audio = AudioSegment.from_mp3(file_path)

    # Calculate the total duration in milliseconds
    total_duration_ms = len(audio)

    # Calculate the duration of each segment in milliseconds
    # We assume the bit rate of the mp3 is 128kbps for calculation
    segment_duration_ms = (segment_size_mb * 1024 * 8) / 128 * 1000
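    # Note: if the mp3 is encoded at a bit rate higher than 128 kbps, the segments will come out larger than segment_size_mb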

    # Calculate the number of segments needed
    num_segments = math.ceil(total_duration_ms / segment_duration_ms)

    # Split and export the segments
    segment_paths = []
    for i in range(num_segments):
        start_ms = i * segment_duration_ms
        end_ms = min((i + 1) * segment_duration_ms, total_duration_ms)
        segment = audio[start_ms:end_ms]
        segment_path = f"{file_path}_segment_{i + 1}.mp3"
        segment.export(segment_path, format="mp3")
        segment_paths.append(segment_path)
        print(f"Segment {i + 1} exported as {segment_path}.")

    return segment_paths


def transcribe_audio_list(segments):
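    # Transcribe each segment with Whisper and concatenate the text in order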
    combined_transcription = ""
    for audio_file_path in segments:
        with open(audio_file_path, 'rb') as audio_file:
            transcription = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file)
            combined_transcription += transcription.text + " "

    return combined_transcription


def abstract_summary_extraction(transcription):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a highly skilled AI trained in language comprehension and summarization. I would like you to read the following transcription of a meeting and summarize it into a concise abstract paragraph. Aim to retain the most important points, providing a coherent and readable summary that could help a person understand the main points of the discussion without needing to read the entire text. Please avoid unnecessary details or tangential points."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    # return response['choices'][0]['message']['content']  # Format from old API
    response = response.choices[0].message.content
    return response


def key_points_extraction(transcription):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a proficient AI with a specialty in distilling information into key points. Based on the following text, identify and list the main points that were discussed or brought up. These should be the most important ideas, findings, or topics that are crucial to the essence of the discussion. Your goal is to provide a list that someone could read to quickly understand what was talked about."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    response = response.choices[0].message.content
    return response


def action_item_extraction(transcription):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are an AI expert in analyzing conversations and extracting action items. Please review the text and identify any tasks, assignments, or actions that were agreed upon or mentioned as needing to be done. These could be tasks assigned to specific individuals, or general actions that the group has decided to take. Please list these action items clearly and concisely."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    response = response.choices[0].message.content
    return response


def participant_list(transcription):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are an AI expert in analyzing conversations and extracting names and roles of the people speaking. Please review the text and identify each person named in the discussions, their title or role, and any other personal information they provide such as location.  Be sure to review the entire conversation and include new people named later in the meeting.  The meeting may be a company earnings conference call with analysts; if this is the case be sure to include the analysts asking questions later in the call.  Please list all of the the names and their related information clearly and concisely.  If there are clear groups of people, such as customer and supplier, group them accordingly"
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    response = response.choices[0].message.content
    return response


def ioi_extraction(transcription):  # Items of interest
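    # ioi.txt is expected to contain one term or phrase per line; it gets flattened into a comma-separated list for the prompt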
    with open('ioi.txt', 'r') as file:  # Read items of interest from file
        ioi = file.read()
    ioi = ioi.replace('\n', ', ').strip(', ')  # Replace line breaks with commas
    content = "You are an AI expert in analyzing company earnings calls and extracting key items of interest specified by the user.  Please carefully review the entire text and identify if any of the following terms are mentioned in the transcript of this earnings call.  Start by breaking down the transcript into smaller chunks of less than 10,000 characters each.  Carefully search each chunk for the key terms of interest specified.  If any of the terms are mentioned, do 2 things:  1) repeat exactly exactly what was said about that term, and 2)explain what was meant by the discussion about that term.  Provide the output in an organized way.  When complete, review the transcription again to ensure none of the specified terms were missed.  Here is the list of terms: "
    content += ioi
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": content
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    response = response.choices[0].message.content
    return response


def sentiment_analysis(transcription):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "As an AI with expertise in language and emotion analysis, your task is to analyze the sentiment of the following text. Please consider the overall tone of the discussion, the emotion conveyed by the language used, and the context in which words and phrases are used. Indicate whether the sentiment is generally positive, negative, or neutral, and provide brief explanations for your analysis where possible."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    response = response.choices[0].message.content
    return response


if __name__ == '__main__':

    # Segment audio file into smaller chunks if needed
    t0 = time.time()
    segments = split_mp3('FordQ3_f231026_1700_14254_archive.mp3')  # Split mp3 into segments small enough to transcribe
    t1 = time.time()

    # Transcribe audio
    transcription = transcribe_audio_list(segments)  # Transcribe each segment and return single combined transcription
    t2 = time.time()

    with open(r'C:\Users\GESco\Documents\Coding\GPT_Meeting_Minutes\Data\Q3_ford_transcription_3-Dec-2023.txt', 'w') as file:  # Save transcription as file
        file.write(transcription)

    # with open(r'C:\Users\GESco\Documents\Coding\GPT_Meeting_Minutes\Data\Q3_ford_transcription_2-Dec-2023.txt',
    #           'r') as file:  # Read transcription from file
    #     transcription = file.read()

    print('\n *** Transcription ***\n')
    print(transcription)
    t3 = time.time()

    # Create sections of meeting minutes
    summary = "\n Summary: \n" + abstract_summary_extraction(transcription)
    t4 = time.time()

    key_points = "\n\n Key Points \n" + key_points_extraction(transcription)
    t5 = time.time()

    action_items = "\n\n Action Items \n" + action_item_extraction(transcription)
    t6 = time.time()

    participants = "\n\n Participants \n" + participant_list(transcription)
    t7 = time.time()

    ioi_discussion = "\n\n Items of Interest \n" + ioi_extraction(transcription)
    t8 = time.time()

    sentiment = "\n\n Sentiment Analysis \n" + sentiment_analysis(transcription)
    t9 = time.time()

    # Create combined minutes
    minutes = summary
    minutes += key_points
    minutes += action_items
    minutes += participants
    minutes += ioi_discussion
    minutes += sentiment

    print('\n *** Minutes: ***\n')
    print(minutes)

    with open(r'C:\Users\GESco\Documents\Coding\GPT_Meeting_Minutes\Data\Q3_ford_minutes_3-Dec-2023.txt', 'w') as file:  # Save minutes
        file.write(minutes)
    t10 = time.time()

    print('\nsegment time =', t1 - t0)
    print('transcribe time = ', t2 - t1)
    print('print time = ', t3 - t2)
    print('summary time =', t4 - t3)
    print('key points time = ', t5 - t4)
    print('action items time = ', t6 - t5)
    print('participants time =', t7 - t6)
    print('ioi time = ', t8 - t7)
    print('sentiment time = ', t9 - t8)
    print('file time = ', t10 - t9)
    print('total time = ', t10 - t0)

Hi and welcome to the Developer Forum!

Thanks for taking the time to make the video and code - it looks great. Welcome to the community; hope you have fun here.


Hi GEScott71,
Nice work! I’m very new to coding, and Python in particular, but I’m trying to learn as much as I can.
I see you added a function to split MP3 files above 25 MB.
I’m currently struggling with the sample code provided in the OpenAI API Meeting Minutes tutorial, using the demo WAV file provided on the same page, “EarningsCall.wav”.
Apparently the maximum size accepted is around 26 MB (26214400 bytes), while the demo file is just above that (26387214 bytes), according to the log in the console when running the script:

openai.APIStatusError: Error code: 413 - {'error': {'message': 'Maximum content size limit (26214400) exceeded (26387214 bytes read)', 'type': 'server_error', 'param': None, 'code': None}}
Looking at the file properties, it’s actually 48.3 MB.
I was wondering whether OpenAI really provided something that doesn’t work, and that’s why you added your split-file function, or whether I’m missing something?
Thanks in advance if you can help me a bit.
Regards

Thanks for sharing this!

I noticed that in one of your prompts you asked it to break the transcript into chunks that were no more than 10K characters. Is that a common practice? How have you seen it affect the results?

I’m working on a project that is focused on finding key moments/quotes in transcripts of long government meetings, so this is very relevant to my use case.

@benasterisk, yes, that is how I see it too - the OpenAI tutorial code doesn’t work with the EarningsCall.wav demo file they provide unless you also implement the split-file function. The pydub split capability is mentioned here, which the tutorial should at least point to: https://platform.openai.com/docs/guides/speech-to-text/longer-inputs
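For anyone else hitting the 413 error, the approach in that guide looks roughly like this (a sketch only; the 10-minute chunk length and output filenames are just illustrative):

from pydub import AudioSegment

# Split a long recording into fixed-length chunks that stay under the API size limit.
# At 128 kbps, a 10-minute chunk is roughly 10 MB, comfortably below the ~25 MB cap.
audio = AudioSegment.from_wav("EarningsCall.wav")
chunk_ms = 10 * 60 * 1000  # 10 minutes in milliseconds

chunk_paths = []
for i, start in enumerate(range(0, len(audio), chunk_ms)):
    chunk = audio[start:start + chunk_ms]
    path = f"EarningsCall_chunk_{i + 1}.mp3"
    chunk.export(path, format="mp3")  # exporting as mp3 also shrinks the file compared to wav
    chunk_paths.append(path)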

@alexrosen, I couldn’t see much difference with or without that 10K-character instruction. I recall reading somewhere in the OpenAI documentation that search would work better with 10K characters or less, but I can’t find the reference now. I actually thought I had removed that before I posted the code - I no longer include it.

You might be interested in this post as well: Tips for searching large text files via API?
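If chunking does end up mattering for long transcripts like your government meetings, one option is to do it client-side instead of asking the model to split the text itself. A rough sketch, reusing the client from the code above (the function name, chunk size, and the "None" sentinel are just illustrative, not part of the posted code):

def ioi_extraction_chunked(transcription, terms, chunk_chars=10000):
    # Search the transcript in ~10K-character pieces and stitch the per-chunk findings together
    findings = []
    for start in range(0, len(transcription), chunk_chars):
        chunk = transcription[start:start + chunk_chars]
        response = client.chat.completions.create(
            model="gpt-4-1106-preview",
            temperature=0,
            messages=[
                {"role": "system",
                 "content": "Identify any mention of the following terms in the text. For each one found, "
                            "quote exactly what was said and briefly explain what was meant. "
                            "If none are mentioned, reply with only the word None. Terms: " + terms},
                {"role": "user", "content": chunk}
            ]
        )
        answer = response.choices[0].message.content
        if answer.strip().lower() != "none":
            findings.append(answer)
    return "\n\n".join(findings)

You would read ioi.txt into a comma-separated string first, the same way ioi_extraction does, and pass it in as terms.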

Got it. Thank you for following up!


Have you tried this code for an audio file that’s less than 25MB? I’m getting multiple errors as follows:
Traceback (most recent call last):
  File "C:\Users\siamo\Py-p4\A1-MeetingMinutes.py", line 217, in <module>
    transcription = transcribe_audio_list(segments)  # Transcribe each segment and return single combined transcription
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\siamo\Py-p4\A1-MeetingMinutes.py", line 80, in transcribe_audio_list
    transcription = client.audio.transcriptions.create(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\siamo\Py-p4\openai-env\Lib\site-packages\openai\resources\audio\transcriptions.py", line 101, in create
    return self._post(
           ^^^^^^^^^^^
  File "C:\Users\siamo\Py-p4\openai-env\Lib\site-packages\openai\_base_client.py", line 1180, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\siamo\Py-p4\openai-env\Lib\site-packages\openai\_base_client.py", line 869, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "C:\Users\siamo\Py-p4\openai-env\Lib\site-packages\openai\_base_client.py", line 960, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-sXLhQ**************************************5wSn. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

It appears this is just a single error, related to your OpenAI API key - it isn’t related to the file size. You need to put your key in a key.txt file, and provide the correct path to that file in the openai.api_key = open… statement.
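If it helps, here is a sketch of a slightly more defensive way to load the key (the environment-variable fallback is just a suggestion, not something the posted code does):

import os
from openai import OpenAI

# Look for key.txt next to the script; otherwise fall back to the OPENAI_API_KEY environment variable
key_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "key.txt")
if os.path.exists(key_path):
    with open(key_path, "r") as f:
        api_key = f.read().strip()
else:
    api_key = os.environ.get("OPENAI_API_KEY")

client = OpenAI(api_key=api_key)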