Solving ESP32 voice assistant API errors

Refer to this page:

GitHub page: AI Voice assistant DAZI

The code works, and ChatGPT and TTS also work fine.
The GitHub page of the code gives these temporary keys from sTEB AI:

// ByteDance ASR API configuration
const char* asr_api_key = "07fcb4a5-b7b2-45d8-864a-8cc0292380df";
const char* asr_cluster = "volcengine_input_en";
const char* openai_apiKey = "[api-key]GIoHe3Zm";

Base URL from sTEB AI : api.chatanywhere.tech

This works fine, but is limited in the number of responses.

However, when I use my own API keys, ChatGPT fails to respond. The error is: Failed to get response from ChatGPT. My model is gpt-4o-mini (paid subscription).

I use the ByteDance ASR (given above) with no issues. But when I use my API key,

const char* openai_apiKey = "sk-proj-XLivMxXXXXXXXXXXXXXlmkoNKLkTNXXXXXHwA"; // (masked key)

const char* openai_apiBaseUrl = openai v1/chat/completions
the ChatGPT responses are not received. I need help resolving the issue.

Error logs:

ASR Recognition Result
Hello, beautiful. How are you doing

[LLM] Sending to ChatGPT…
[ERROR] Failed to get ChatGPT response
Reconnecting WebSocket for new session…
WebSocket disconnected
Connecting WebSocket…
WebSocket connected

Recording started…

Request ID: 16997_40308
Sending config:
{"app":{"cluster":"volcengine_input_en"},"user":{"uid":"fc2cc64eb580"},"request":{"reqid":"16997_40308","nbest":1,"workflow":"audio_in,resample,partition,vad,fe,decode,itn,nlu_punctuate","result_type":"full","sequence":1},"audio":{"format":"raw","rate":16000,"bits":16,"channel":1,"codec":"raw"}}

[ASR] Listening… Speak now

ChatGPT is the website chatbot at chatgpt.com, where you pay a monthly subscription.

The OpenAI API is a completely different service, and requires pre-purchase of credits to pay for the API calls to AI models.

Have you purchased credits? Can you chat with the API AI here: https://platform.openai.com/chat/edit?models=gpt-4o-mini

The other caution is the mention of the name ByteDance. China is a disallowed country. Geolocated API calls originating from China IPs will be blocked, and can result in an organization ban.

Thank you for your response. Yes, I am able to chat easily with the API at the link mentioned. As for the banned country, my API calls to OpenAI are not from China. Secondly, I am only using the ByteDance ASR. As per my understanding (I may be wrong), the speech is converted to text by the ASR (ByteDance); this text is then stored by the ESP device and uploaded to OpenAI, from my IP, by the code running on the device. ByteDance does not upload to OpenAI directly. So the TTS responses from OpenAI would come back to my IP and then be played on the speaker of my device. (Please confirm if this is what is happening.)

Your original posting is a bit hard to understand, and I don’t know any of the services or software.

You have an OpenAI key openai_apiKey that you don’t show as a sk-proj- key that would be for OpenAI services, ending with e3Zm. Then you say you change it, to ending with HwA, that you got from your API project. That makes it sound like the field is for some proprietary service and their own keys.

Plus: if you don’t know where your keys are going, and you are sending them to some random github code written in Chinese or whatever service “sTEB AI” is, there’s a good chance you’ve found some code in the style of “put in your API key here for it to be stolen”. The code repo for that “dazi” or “dhazi” literally has an organization key hardcoded in it, along with a random-looking .cn domain and subdomain for another api, showing how bad it is. Then the other service is chatanywhere.apifox.cn. OpenAI keys and code, where it is prohibited by OpenAI?

So there’s a good chance you’ve already transmitted that key by putting it into the field where another key was. It should be deleted from the project.

It is important to understand the code you run, and understand the APIs being called for AI services.

A far more trustworthy place to start: GitHub - openai/openai-realtime-embedded: Instructions on how to use the Realtime API on Microcontrollers and Embedded Platforms

The only thing to add is that production best practices require careful handling of API keys. If a third party gets access, they could drain your credits and even put your account at risk.

I assume due diligence has been done, but as a precaution, revoking the API key would be the recommended next step.

Thank you for the caution. I have already deleted (revoked) all API keys used to configure the voice assistant project. I was only using the API keys provided by sTEB AI (temporary keys, not my keys) for testing the project. Once it was working fine, I switched to my API keys in the code and recompiled the code. So the keys were safe. Anyway, just to be sure, I have revoked those keys.

Now, coming to the issue: the temporary keys provided by sTEB AI allowed the voice assistant to seamlessly convert speech to text, send it to OpenAI, get a response from OpenAI in the form of text, and then convert it to speech. When I replaced the temporary keys in the code and recompiled it, the speech to text still works fine and the code uploads the text to OpenAI. OpenAI is then supposed to process this and send a response in the form of text. This is not happening. OpenAI does not respond to the text input.

Error log:

ASR Recognition Result
Hello, beautiful. How are you doing

[LLM] Sending to ChatGPT…
[ERROR] Failed to get ChatGPT response
Reconnecting WebSocket for new session…

So the issue is: with the temporary keys provided on the sTEB AI page, the code works, but with my API keys the code does not work and fails while getting the response from ChatGPT. I hope I am clear now.

Thanks

Thank you for the caution. I have made sure API keys are not shared on any open forum. Also, I have revoked the old keys and generated new ones, just to be safe.

Understand and answer: Are they (whoever) going to provide you API keys that work directly on the OpenAI API endpoints, because they are real OpenAI API keys (identical in length and prefix to ones you get from the OpenAI API platform site)? Are they going to pay money in credits for you or any random person on the internet to “test” or make any calls you want?

If the answer is no, someone else is not giving their OpenAI API keys out from their account to you, then you must conclude that a request is being sent to their service by putting an API key in that software field, not directly to OpenAI without some middleman API endpoint (that you do not understand).

Instructions to substitute in your API key for a proprietary one, to someone that even writes nonsense in their code “Sending to ChatGPT…”, is at the minimum going to fail, and worse for you if it succeeds. At the minimum, you must audit the code and the endpoints and were it trustworthy, find where you are supposed to direct API calls to the right URL.

The OpenAI Realtime API, the link I provided, does not need several steps of generating text and then sending to a text to speech model or a transcription model for producing text from audio to then send to a chatbot (which is going to be on the OpenAI transcriptions endpoint, not “Failed to get ChatGPT response”). Instead, the realtime AI model can understand and generate audio itself, with no intermediary steps.

Thank you, I really appreciate your continued caution in this regard. However, let me lay the concern to rest. Firstly, the entire source code of the project is available to me to check and audit for any malicious code or module. Secondly, the code developer provides temporary API keys to configure in the source code before compiling and running. At this stage my API keys are nowhere in the picture and are not being used anywhere. If the user is satisfied that the code works with the ESP32 device (speech is converted to text by ASR, sent to ChatGPT, and the text response is received and converted back to speech), he can then replace the temporary keys with his own API keys and recompile the code into a binary for uploading to the ESP32 device.

Henceforth, all API calls are made directly by the device to the OpenAI endpoint. So I don't understand how my keys can get compromised or misused by this process, where there is no middleman or redirecting malicious website intercepting the traffic from my device to the endpoint and extracting the API keys. I hope you have understood the process.

My issue was simple: the temporary keys provided by the developer and used in the code worked and did return a ChatGPT response (unless you are suggesting the developer spoofed the response to appear as if it came from ChatGPT, for which there is no evidence of any malicious code snippet in the source code).

Once I was satisfied that the code and functionality worked with the temporary keys, I recompiled the source code with my API keys and uploaded the binary to my device, but now the ChatGPT responses failed. At this stage, the developer's API keys are no longer used.

The help I am requesting is to confirm whether API keys generated on OpenAI for gpt-4o-mini allow ChatGPT to respond to a chat query and send the response back to the sender. The website link you provided works for me directly, but via the code ChatGPT fails to respond. Could there be a latency or WebSocket issue here?

Anyway, once again, thanks for spending the time and effort to address this issue.

Here is a “playground” web page that I developed. It is not connected to any service - it makes connections directly to OpenAI:

It takes a provided pasted API key, and stores it in your local browser storage (and actually, wouldn’t need to store keys at all, but that was really annoying to input API keys again and again). Later, the orange button will clean up and remove anything the web page code stores in the browser.

A real key issued by OpenAI will start with sk-proj- if created in a project, or older account-based keys can start with only sk-.

Put the “developer’s provided” key in there after clicking the “lock” icon to show the API key input box, and try to send an AI request to OpenAI with that key, after picking a model and writing “hello” in a message.

Then, instead, use the API key you obtained from the OpenAI platform site in your organization.

If the provided “temporary key” does not work and is refused, but your own OpenAI API key works and can call a chat model like gpt-4o-mini AI successfully, then the ESP32 code you are using is not sending to OpenAI when it behaves the opposite.
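On the device side, the log line "[ERROR] Failed to get ChatGPT response" hides the one thing that would tell the two cases apart: the HTTP status the server returned. Below is a minimal sketch of a status-line check, written in plain C++ with std::string so it can be tried off-device. The function names parseHttpStatus and describeStatus are hypothetical and are not part of the project's code, and the hints per status code are my own reading of common API behaviour, not something the firmware prints.

```cpp
#include <string>

// Hypothetical helper: extract the numeric status from a line such as
// "HTTP/1.1 401 Unauthorized". Returns -1 if the line is not a status line.
int parseHttpStatus(const std::string& statusLine) {
    if (statusLine.compare(0, 5, "HTTP/") != 0) return -1;
    const std::string::size_type space = statusLine.find(' ');
    if (space == std::string::npos || statusLine.size() < space + 4) return -1;
    int code = 0;
    for (std::string::size_type i = space + 1; i < space + 4; ++i) {
        const char c = statusLine[i];
        if (c < '0' || c > '9') return -1;
        code = code * 10 + (c - '0');
    }
    return code;
}

// Hypothetical helper: a short hint for the statuses most likely to matter here.
const char* describeStatus(int code) {
    switch (code) {
        case 200: return "OK";
        case 401: return "authentication failed: check the API key";
        case 404: return "not found: check the request path and base URL";
        case 429: return "rate limit or quota exceeded: check credits";
        default:  return "unexpected status: print the response body";
    }
}
```

In the firmware, the input would be the first line read back from the client after the request is sent. Logging that line, and the response body on anything other than 200, usually narrows the problem down faster than a generic failure message.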

Thank you for sharing your playground. I used it and entered the API key shared by the developer, but I failed to get a response from OpenAI:

API Error

Error: Incorrect API key provided: sk-KkEHJ***************************************e3Zm.

My presumptions (please correct me if I am wrong):

For the API to work from the application, the base URL is required. The base URL shared by the developer is api.chatanywhere.tech, so I presume the developer's site routes the temporary API key from the application via his base URL, swaps it for his own API key, calls OpenAI, receives the response and routes it back to the application. This way his credits are used for testing the application.

When I use my API key and my base URL, the application calls OpenAI directly, and the response fails. (I confirm the entire source code of the application has been checked and there is no malicious code snippet hiding that could route the API call via any third-party site.)

So the issue remains unresolved: my API calls from the application do not elicit a response from OpenAI, while the same key works in your playground.

What is the correct base URL to use in the application?

There are two “chat” APIs offered by OpenAI.

  • Chat Completions (MyPlayground has the endpoint URL with a copy button)
  • Responses (released early 2025)

If the code is older than six months without a recent revamp, or doesn’t need advanced features, it is likely Chat Completions. You can tell quickly because the API of Chat Completions takes “messages”, while Responses takes “input”.
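To make the "messages" versus "input" difference concrete, here is a minimal sketch of the two request bodies built as plain strings. The helper names (jsonEscape, chatCompletionsBody, responsesBody) are made up for illustration, and the bodies carry only the minimum fields; a real project would more likely use a JSON library than string concatenation.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape characters that may not appear raw in a JSON string.
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (const char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (static_cast<unsigned char>(c) < 0x20) {
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\u%04x",
                                  static_cast<unsigned>(static_cast<unsigned char>(c)));
                    out += buf;
                } else {
                    out += c;
                }
        }
    }
    return out;
}

// Chat Completions (POST .../v1/chat/completions) takes a "messages" array.
std::string chatCompletionsBody(const std::string& model, const std::string& userText) {
    return "{\"model\":\"" + jsonEscape(model) +
           "\",\"messages\":[{\"role\":\"user\",\"content\":\"" +
           jsonEscape(userText) + "\"}]}";
}

// Responses (POST .../v1/responses) takes "input", here as a plain string.
std::string responsesBody(const std::string& model, const std::string& userText) {
    return "{\"model\":\"" + jsonEscape(model) +
           "\",\"input\":\"" + jsonEscape(userText) + "\"}";
}
```

If the firmware's source builds something shaped like the first body, it expects the Chat Completions path; sending that body to the Responses path (or the reverse) will be rejected.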

The platform site API Reference gives a full detail of each, along with OpenAI’s API “chat/prompts” giving a “get code” for the endpoint selected in a drop-down menu.

A base URL may be like:

https://api.openai.com

or can have more path:

https://api.openai.com/v1

or may need the final slash:

https://api.openai.com/v1/

depending on what the author of the code expects to be non-changing.
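One way to stop caring which of those three forms is configured is to normalize the base URL before appending the endpoint path. This is only a sketch under the assumption that the API lives under /v1, as it does on api.openai.com; a third-party proxy may use a different prefix. The function names are hypothetical.

```cpp
#include <string>

// Remove any trailing '/' characters.
std::string stripTrailingSlashes(std::string s) {
    while (!s.empty() && s.back() == '/') s.pop_back();
    return s;
}

// True if s ends with suffix.
bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: accept a base URL with or without "/v1" and with or
// without a trailing slash, and return the full Chat Completions URL.
std::string chatCompletionsUrl(const std::string& baseUrl) {
    std::string base = stripTrailingSlashes(baseUrl);
    if (!endsWith(base, "/v1")) base += "/v1";  // assumption: API is rooted at /v1
    return base + "/chat/completions";
}
```

All three base URL forms listed above come out as the same final URL, so a stray or missing slash in the configuration no longer changes the request path.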

Success! I managed to resolve the issue of getting no responses from ChatGPT. I should have looked more closely, but I guess I missed it every time. A tweak was required in the .cpp file, where the base URL was not configured correctly:

String path = _apiBaseUrl + "/chat/completions";
path.replace("openai website", ""); // leaves only the /v1/… path
client.print("POST " + path + " HTTP/1.1\r\n");
client.print("Host: openai website\r\n");
client.print("Content-Type: application/json\r\n");
client.print("Authorization: Bearer ");

Thank you for your help; it prodded me to look more deeply.

(Note: since links are not allowed, I have just written "openai website" in place of the URL.)
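For anyone trying to reproduce the fix, the underlying idea is that the request line must carry only the path, while the host name goes in the Host header. Here is a rough sketch of that split in plain C++ (std::string rather than the Arduino String class, so it can be tested on a PC). The names UrlParts, splitUrl and buildRequestHead are hypothetical, the Content-Length and Connection headers are my additions, and api.example.com is a placeholder host, not the real one.

```cpp
#include <cstddef>
#include <string>

struct UrlParts {
    std::string host;  // e.g. "api.example.com"
    std::string path;  // e.g. "/v1/chat/completions"
};

// Hypothetical helper: split "scheme://host/path" into host and path.
UrlParts splitUrl(const std::string& url) {
    std::string rest = url;
    const std::string::size_type scheme = rest.find("://");
    if (scheme != std::string::npos) rest = rest.substr(scheme + 3);
    const std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos) return {rest, "/"};
    return {rest.substr(0, slash), rest.substr(slash)};
}

// Hypothetical helper: build the request line and headers for a JSON POST.
std::string buildRequestHead(const std::string& url,
                             const std::string& apiKey,
                             std::size_t bodyLength) {
    const UrlParts parts = splitUrl(url);
    return "POST " + parts.path + " HTTP/1.1\r\n"
           "Host: " + parts.host + "\r\n"
           "Content-Type: application/json\r\n"
           "Authorization: Bearer " + apiKey + "\r\n"
           "Content-Length: " + std::to_string(bodyLength) + "\r\n"
           "Connection: close\r\n\r\n";
}
```

Deriving both values from the one configured URL avoids hardcoding the host in two places, which is where the original mismatch crept in.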

So I have successfully managed to build the voice assistant project: a serverless AI voice assistant developed entirely on the ESP32 platform using the Arduino environment. It allows you to run AI voice interactions directly on ESP32 devices without the need for additional server support. The system provides complete voice interaction capabilities, including speech recognition, AI processing, and text-to-speech output. It provides advanced continuous voice conversation with real-time ASR and conversation memory.

Glad to hear you got your first call figured out.

Then, did you know: Chat Completions also supports audio in and out? You can talk directly to the AI model and have it return speech it generates, with no additional services, using a model such as gpt-audio.

Thanks for sharing, we learn something new every day. This means I no longer have to use the ASR to convert speech to text. That's great. I will try it out next. :grinning_face:

Hi @Ragz,

I would like to know how you tweaked the .cpp file. I don't really know much, but I really want to make this work as well, since I need it for my thesis :( Hoping for your reply!