Hi, I’ve noticed that ChatGPT handles a well-defined use case in a way that is not intuitive to the user.
By default, ChatGPT often handles audio-processing requests by simulating the analysis rather than saying it can't perform it. The context and wording of the question matter: for example, if you ask whether it can generate a “real” spectrogram or waveform from an audio file, it's more likely to disclose that it can't process audio files in its environment.
These are two anonymous 4o-mini sessions for demonstration purposes; I originally encountered the behavior in the 4o model:
Before I knew that the numbers were simulated, I had a conversation with ChatGPT where we supposedly analyzed several qualities of my vocal recordings and discussed the results in depth. We analyzed spectrograms and top-frequency lists for various files, and looked at metrics like Spectral Bandwidth, Spectral Centroid, Harmonic Richness (estimated as Harmonic Content over Total Energy), and Vocal Clarity (estimated as Low-Frequency Energy Ratio).
I was surprised to find out later that the numbers had all been fake. GPT then helped me run the analysis myself in Python, which was really useful, but I'd still wasted hours assuming I was getting real data. The revelation undermined my trust in the bot a little, especially combined with some other transparency issues that have come up lately. That's despite a note in my persisted memory asking the model to be honest about its limitations and to disclose whether its analyses of media files are derived from the actual file contents or just from textual cues. I think it would be a better user experience if the model were up-front, as a rule, about which kinds of processing it can and can't do.
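For anyone who runs into the same thing: the metrics themselves are easy to compute locally. Below is a minimal sketch of roughly the kind of script we ended up with (not the exact code), assuming librosa is installed. The 2 kHz cutoff for the low-frequency energy ratio, the HPSS-based harmonic estimate, and the file name are my own placeholder choices, not anything the model specified.

```python
import numpy as np
import librosa

# Placeholder choices: "low frequency" means below 2 kHz, FFT size of 2048.
LOW_FREQ_CUTOFF_HZ = 2000
N_FFT = 2048

def analyze(path):
    y, sr = librosa.load(path, sr=None)

    # Frame-wise spectral centroid and bandwidth, averaged over the clip
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr, n_fft=N_FFT).mean()
    bandwidth = librosa.feature.spectral_bandwidth(y=y, sr=sr, n_fft=N_FFT).mean()

    # "Harmonic richness": harmonic energy over total energy, estimated via HPSS
    y_harm, _ = librosa.effects.hpss(y)
    harmonic_richness = np.sum(y_harm ** 2) / np.sum(y ** 2)

    # Power spectrogram and the center frequency of each bin
    S = np.abs(librosa.stft(y, n_fft=N_FFT)) ** 2
    freqs = librosa.fft_frequencies(sr=sr, n_fft=N_FFT)

    # "Vocal clarity": share of spectral energy below the cutoff
    low_freq_ratio = S[freqs < LOW_FREQ_CUTOFF_HZ].sum() / S.sum()

    # Crude "top frequency list": bins with the highest average power
    top_freqs = freqs[np.argsort(S.mean(axis=1))[::-1][:5]]

    return {
        "spectral_centroid_hz": float(centroid),
        "spectral_bandwidth_hz": float(bandwidth),
        "harmonic_richness": float(harmonic_richness),
        "low_freq_energy_ratio": float(low_freq_ratio),
        "top_frequencies_hz": [float(f) for f in top_freqs],
    }

print(analyze("my_vocal_take.wav"))  # hypothetical file name
```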