So I’ll take that as a “you didn’t try anything different”.
Well, I’ll try something different. Exactly what I documented.
Hour-long audio segment. Encode it to opus:
C:\chat\ai-examples\transcriptions>ffmpeg -i 1hr.wav -vn -map_metadata -1 -ac 1 -c:a libopus -b:a 12k -application voip 1hr.opus
ffmpeg version 2022-01-10-git-f37e66b393-full_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 11.2.0 (Rev5, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 57. 18.100 / 57. 18.100
libavcodec 59. 20.100 / 59. 20.100
libavformat 59. 17.100 / 59. 17.100
libavdevice 59. 5.100 / 59. 5.100
libavfilter 8. 25.100 / 8. 25.100
libswscale 6. 5.100 / 6. 5.100
libswresample 4. 4.100 / 4. 4.100
libpostproc 56. 4.100 / 56. 4.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from '1hr.wav':
Duration: 03:33:07.27, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
Output #0, opus, to '1hr.opus':
Metadata:
encoder : Lavf59.17.100
Stream #0:0: Audio: opus, 48000 Hz, mono, s16, 12 kb/s
Metadata:
encoder : Lavc59.20.100 libopus
Continuing the encoder output: instead of 20 minutes coming to 10 MB, I’ve got 60 minutes in about 5.5 MB.
size= 5476kB time=01:00:00.01 bitrate= 12.5kbits/s speed=50.9x
video:0kB audio:5205kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 5.204176%
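As a sanity check on those numbers, here is a rough sketch of the arithmetic: file size is just bitrate times duration. The 25 MB upload limit is my assumption about the API's current restriction, so check the docs before relying on it. The function names are made up for illustration.

```python
# Back-of-the-envelope size estimate for a roughly constant bitrate encode.
UPLOAD_LIMIT_BYTES = 25 * 1000 * 1000  # assumption: 25 MB API upload limit


def estimated_size_bytes(bitrate_bps: float, seconds: float) -> float:
    """Bits per second times duration, converted from bits to bytes."""
    return bitrate_bps * seconds / 8


def max_seconds_under_limit(bitrate_bps: float,
                            limit_bytes: int = UPLOAD_LIMIT_BYTES) -> float:
    """Longest duration whose estimated size still fits under the limit."""
    return limit_bytes * 8 / bitrate_bps


if __name__ == "__main__":
    # 12.5 kbit/s is the overall bitrate ffmpeg reported, container overhead included
    one_hour = estimated_size_bytes(12_500, 3600)
    print(f"1 hour at 12.5 kbit/s: about {one_hour / 1e6:.1f} MB")
    hours = max_seconds_under_limit(12_500) / 3600
    print(f"Fits under the assumed limit: about {hours:.1f} hours")
```

The one-hour estimate is roughly in line with the size ffmpeg reported above, so at this bitrate file size is not what limits how much audio you can send in one request.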
And then send it off for transcription. OpenAI doesn’t have .ogg listed as a supported format for the API; it looks like they removed it from the list, yet it still works fine. This is Opus with an .ogg extension.
from openai import OpenAI

# Initialize the OpenAI client (reads OPENAI_API_KEY from the environment)
client = OpenAI()

# Open the audio file
with open("1hr.opus.ogg", "rb") as audio_file:
    # Create a transcription using the Whisper model
    try:
        transcription = client.audio.transcriptions.create(
            file=audio_file,
            language="en",
            model="whisper-1",
            prompt="Here is the radio show.",
            response_format="json",
            temperature=0.1,
        )
    except Exception as e:
        # Stop here: without a response there is nothing to save
        raise SystemExit(f"An API error occurred: {e}")

# With response_format="json" the response object exposes the text directly
transcribed_text = transcription.text

# Save the transcribed text to a file
try:
    with open("transcript.txt", "w", encoding="utf-8") as file:
        file.write(transcribed_text)
    print("Transcribed text successfully saved to 'transcript.txt'.")
except Exception as e:
    print(f"output file error: {e}")

# Show the first and last 320 characters as a quick check
print(f"{transcribed_text[:320]}\n...\n{transcribed_text[-320:]}")
All done, except that in several places the transcript went berserk shortly after a song was played for a short bit, replacing over a minute of audio with a repeated loop of words. Maybe it was also because the talk got dirty: it didn’t accurately reproduce the frank Howard Stern Show talk.
Yeah. Anyway, so Wolfie is observing JD, and we’ll get a report tomorrow to see if he finds out anything. (15:40 song 16:14) Wolfie is fully embedded as we speak. Where’s he observing him from? He can’t fit in that office. He’s right in my face. Oh, yeah? Where is he? (ERROR: 16:20) Like, where is he? He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. 
He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. He’s right in my face. (17:35)I had a sort of a panic attack a week or two before where like my stomach was like clenched. What was going on? I don’t remember. I don’t remember what it was specifically. I bet you use emojis. I very rarely.
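A loop like the one above is easy to flag mechanically before you read the whole transcript. Here is a minimal sketch, assuming a naive split on sentence-ending punctuation is good enough; the function name and the threshold of 5 are arbitrary choices of mine, not anything the API provides.

```python
import re
from itertools import groupby


def find_repeated_runs(text: str, min_repeats: int = 5):
    """Return (sentence, count) for each run of identical consecutive sentences."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    runs = []
    for sentence, group in groupby(sentences):
        count = len(list(group))
        if count >= min_repeats:
            runs.append((sentence, count))
    return runs


if __name__ == "__main__":
    sample = "Where is he? " + "He's right in my face. " * 6 + "I had a panic attack."
    for sentence, count in find_repeated_runs(sample):
        print(f"{count}x: {sentence}")
```

This only catches a single sentence repeated back to back. A loop that alternates between two phrases would slip through and needs a slightly smarter check.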
It just lost the plot for whole sections when they were playing snippets of songs...
This guy’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. He’s still alive. La cunta calda, calda flor de flor. La cunta calda, calda ciudad santa. It’s good, isn’t it? Yeah. La cunta calda, calda flor de flor. Tic-tic-tic balda tus. Pienta tula tu. Tic-tic-tic balda tus. Pienta tula tu. Tic-tic-tic balda tus. Pienta tula tu. Got every word. You drop them tula punda boots. La cunta calda, calda tida cunda. La cunta calda, calda flor de flor. La cunta calda, calda ciudad cunda. I think this is the song Matt had slow dance to. When they’re coming down. Flor de flor. Flor de flor. Isn’t that nice? Thank you. Julian Vallard. Whole new take on that song. Hello. Tic-tic-tic balda tus. Pienta tula tu. Tic-tic-tic balda tus. Pienta tula tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. 
Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Yeah, and she did look like she just crawled out of a grave, you know, like her clothing. Yeah. We need like the best music video director in the world to do it. Because I’m telling you, I see the Shakira thing. Oh, yes. Where she’s doing her dancing in like a… Tic-tic-tic balda tu. Belly dancing kind of thing. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. She’s gone viral already. Her background dancers would be just going to town. Viral as in like staph infection. Yeah, exactly. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Tic-tic-tic balda tu. Yeah, we put the microphone like on a six-foot pole. Ha ha ha. Uh. La cunta calda. It’s funny that cunt, cunt, cunta. La cunta calda, calda, suede cunta. La cunta calda, calda, flora. This guy’s awesome. La cunta calda, calda, suede cunta. This could be the flip side. La cunta calda, calda, flora.
So you might not want to invest $0.36 in transcribing a whole hour at once anyway…
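One way to act on that is to cut the recording into shorter pieces and send each one separately, so a runaway loop only costs you one piece. This is a rough sketch, not something I ran for this post: the 10-minute chunk length and the helper names are my own assumptions, and the ffmpeg flags are the same ones used in the encode above, plus -ss and -t to select the time range.

```python
def chunk_plan(total_seconds: float, chunk_seconds: float = 600.0):
    """Return (start, duration) pairs that cover the whole recording."""
    plan = []
    start = 0.0
    while start < total_seconds:
        plan.append((start, min(chunk_seconds, total_seconds - start)))
        start += chunk_seconds
    return plan


def ffmpeg_args(src: str, start: float, duration: float, dst: str):
    """Build an ffmpeg command that encodes one chunk with the same Opus settings."""
    return ["ffmpeg", "-ss", str(start), "-t", str(duration), "-i", src,
            "-vn", "-map_metadata", "-1", "-ac", "1",
            "-c:a", "libopus", "-b:a", "12k", "-application", "voip", dst]


if __name__ == "__main__":
    # Print the commands for a one-hour source; run them with subprocess if wanted
    for n, (start, duration) in enumerate(chunk_plan(3600)):
        print(" ".join(ffmpeg_args("1hr.wav", start, duration, f"chunk_{n:03d}.ogg")))
```

Fixed boundaries can cut a word in half, so a real version would probably want to split on silence or overlap the chunks slightly. At the price above, a 10-minute piece comes to about $0.06.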