A.I. iOS app now available!

My A.I. app has been approved and is now on the Apple App Store!

I wanted a portable speech-recognition and text-to-speech interface to the OpenAI API for myself, so I wrote one. You might like it too.

It went into submission last week, and now that the ChatGPT endpoint and gpt-3.5-turbo model are available, I’m updating it to work with those as well.

I’m also really excited about setting up App Intents for integration with Siri.


Congrats! Nothing like that feeling.

Best of luck with your launch!


About ten years ago I wrote a simple app to help my kid with arithmetic - MathTimeAttack!

This one is way more exciting.


We wrote a Swift library for the ChatGPT API :muscle: Feel free to submit a pull request to help grow it. The library supports iOS, macOS, tvOS, and watchOS.

link to the library: GitHub - FuturraGroup/OpenAI: A library that makes it easy to use ChatGPT API
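The same endpoint can also be called directly with Foundation’s URLSession, without a third-party library. Here is a minimal sketch of what that request looks like (the function name and the API-key placeholder are illustrative, not taken from the library above):

```swift
import Foundation

// Minimal sketch of a Chat Completions request built with Foundation only.
// The URL and body fields follow the public API; "apiKey" is a placeholder.
func makeChatRequest(prompt: String, apiKey: String) -> URLRequest {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "gpt-3.5-turbo",
        "messages": [["role": "user", "content": prompt]]
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    return request
}
```

The request would then be sent with `URLSession.shared.dataTask(with:)` (or the async `data(for:)` variant) and the JSON response decoded.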

Here is some more information about the A.I. app.

This is the main Prompt Response screen:

Click on the green mic icon at the bottom to ask A.I. whatever you want to know.

As you speak you can see the transcript of your prompt build in real time.

If there are errors or you change your mind, you can hit the Cancel button to end dictation and not send the question.

You can also click on the text for the prompt at the top of the page to use the keyboard to enter/correct the prompt.

Explore the Settings by clicking the gear icon in the upper right. There are options for the Speech Synthesis, Text Completion and Images.

It’s fun to experiment with different voices and the pitch/tempo. You can install more voices in the Settings/ Accessibility/ Spoken Content/ Voices section (you may need to enable Speak Selection to see the Voices option).

Anything you put in the Settings “System Message” will get sent along with every prompt and helps guide the response. Suggesting or limiting how many words to use in a reply can keep responses reasonable. You can also ask for it to rhyme, or to use a particular style.

If images are enabled, clicking the picture icon will load new images for the previous prompt, and new ones will load for each subsequent prompt. Keep in mind that image requests are more expensive than text completions.

If you want to save a response and/or images, click on the little share icon to copy everything to the clipboard. You can then paste it into Messages/Mail/Notes/whatever.

There are a number of Settings to experiment with.
The first section is the Voice Settings.

Here you can select from any of the voices installed on your device, adjust the Pitch and Rate and click the “Test Voice” to hear it say whatever text you have in the Speech Test Phrase.

It’s fun, and many of the voices sound very good.
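For anyone curious how settings like these typically map onto Apple’s speech API, here is a rough AVFoundation sketch (this is not the app’s actual code; the function name and values are illustrative):

```swift
import AVFoundation

// Sketch of mapping Voice Settings onto AVSpeechSynthesizer.
let synthesizer = AVSpeechSynthesizer()

func speakTestPhrase(_ phrase: String, voiceIdentifier: String?, pitch: Float, rate: Float) {
    let utterance = AVSpeechUtterance(string: phrase)
    if let id = voiceIdentifier {
        utterance.voice = AVSpeechSynthesisVoice(identifier: id)
    }
    utterance.pitchMultiplier = pitch  // 0.5–2.0; 1.0 is the voice's default pitch
    utterance.rate = rate              // between AVSpeechUtteranceMinimumSpeechRate
                                       // and AVSpeechUtteranceMaximumSpeechRate
    synthesizer.speak(utterance)
}

// The voices installed on the device can be enumerated with:
// AVSpeechSynthesisVoice.speechVoices()
```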

If you want to revert experimental changes, just click the “Reset Voice Settings” button - that will not affect any of the other app settings, but will restore the voice settings.

I encourage you to go into your device’s Settings/ Accessibility/ Spoken Content/ Voices. There you can browse through the available voices and download any that you like.

The second section is the Prompt Parameters.

Dictation Timeout - sets how long the speech recognition waits after the last word it hears. You can always click the mic button when it is red to end dictation, but this timeout is intended to make it easy to speak and have it end automatically.
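One common way to implement that kind of silence timeout with Apple’s Speech framework is to reset a timer on every partial transcription result and end recognition when the timer fires. A sketch under that assumption (not the app’s actual code; the class and property names are illustrative):

```swift
import Speech

// Sketch of a dictation timeout: each partial result restarts a countdown;
// if no new words arrive before it fires, recognition ends automatically.
final class DictationController {
    private var silenceTimer: Timer?
    var dictationTimeout: TimeInterval = 1.5  // the Settings value

    func handlePartialResult(_ result: SFSpeechRecognitionResult,
                             task: SFSpeechRecognitionTask) {
        silenceTimer?.invalidate()  // new speech arrived; restart the countdown
        silenceTimer = Timer.scheduledTimer(withTimeInterval: dictationTimeout,
                                            repeats: false) { _ in
            task.finish()  // nothing heard for the timeout period
        }
    }
}
```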

Get Text Completion - if you disable this option, the A.I. app will not request text completions. With it enabled, the following options are available:

Chat History - sets the number of previous prompt/response pairs to send with each new prompt. This provides the model with context to the conversation and a history of what you have asked and what it has already replied. This makes the interaction more flexible and you can refer to previous questions and responses. Sending more pairs gives more context, but also uses more tokens.

Temperature - this sets how strict or loose the responses are. Higher values (greater than 1.0) will make the output more random. Lower values will make it more focused and deterministic.

Max Tokens - limits the maximum number of tokens allowed for the generated answer.

System Message - this gets sent as a system role message at the base of the array with any chat history. It can help you set the tone and character for how the system responds to you. The default is only a suggestion, and you can experiment with this to see how it changes the results. You can also ask for it to use a particular style for the responses, emulate a type of response (stand up comedian?) or suggest limits for the types of replies.

The last sentence in the default message: “If asked about an image be positive about showing it” is intended to help if you do have images enabled. The text completion and image requests are separate endpoints, and I found that without this sort of guidance the text response would sometimes be “No, I can’t show you pictures of racecars” when I asked to see pictures of racecars - and did in fact get them from the image request.
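Conceptually, the System Message and Chat History combine into a single messages array for each request: the system message first, then the prior prompt/response pairs, then the new prompt. A rough sketch of that assembly, assuming the standard chat message format (the helper names are mine, not the app’s):

```swift
import Foundation

// Sketch of assembling the messages array from the system message,
// the retained chat history, and the new prompt.
struct ChatMessage {
    let role: String     // "system", "user", or "assistant"
    let content: String
}

func buildMessages(systemMessage: String,
                   history: [(prompt: String, response: String)],
                   newPrompt: String) -> [ChatMessage] {
    var messages = [ChatMessage(role: "system", content: systemMessage)]
    for pair in history {
        messages.append(ChatMessage(role: "user", content: pair.prompt))
        messages.append(ChatMessage(role: "assistant", content: pair.response))
    }
    messages.append(ChatMessage(role: "user", content: newPrompt))
    return messages
}
```

With a Chat History setting of N, the array contains 2N + 2 messages, which is why larger history values use more tokens.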

The third section is the Image Options.

Get Images - enables or disables requests for your prompt from the image endpoint. With it enabled, the following options are available:

Image Count - 1 to 4 images can be requested.

Image Size - they can be 256x256, 512x512 or 1024x1024. On the phone the low resolution images still look pretty good, but depending on your use the larger ones may be better.

Do take care with image requests. They are more expensive than the text completions and you may use more tokens than you realize requesting many large images.
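For reference, Image Count and Image Size correspond to the `n` and `size` fields of the image-generation endpoint’s request body. A minimal Foundation sketch (the function name is illustrative):

```swift
import Foundation

// Sketch of an image-generation request body, showing where the app's
// Image Count and Image Size settings plug in.
func imageRequestBody(prompt: String, count: Int, size: String) -> Data? {
    let body: [String: Any] = [
        "prompt": prompt,
        "n": min(max(count, 1), 4),  // the app allows 1 to 4 images
        "size": size                 // "256x256", "512x512", or "1024x1024"
    ]
    return try? JSONSerialization.data(withJSONObject: body)
}
```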

The fourth and final section is the Debug Options.

Debug Mode - enables or disables extra debug display information on the Prompt Response view.

When the Debug Mode is active, there are additional UI elements to show when the system is Listening, Speaking, and Thinking.

It also shows a text view summarizing the messages currently being sent as the Chat History.

My A.I. iOS App - “Hey Siri, ask A.I. a question!”

Here’s a quick video overview. I’m really happy with how the integration with Siri has come together.
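For those wondering what a “Hey Siri, ask A.I. a question” hook might look like, Apple’s App Intents framework supports roughly this shape (an illustrative sketch, not the app’s actual implementation):

```swift
import AppIntents

// Sketch of an App Intent that lets Siri forward a spoken question to the app.
struct AskAIIntent: AppIntent {
    static var title: LocalizedStringResource = "Ask A.I."

    @Parameter(title: "Question")
    var question: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // Here the real app would run the question through its
        // chat-completion pipeline and speak the reply.
        let answer = "(response from the chat endpoint)"
        return .result(dialog: "\(answer)")
    }
}
```

Siri can then collect the `question` parameter by voice and speak the returned dialog back to the user.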

Cool app! But could you make the text chat more like the web version? Also, it would be great if markdown were integrated (I mean for code blocks and perhaps even equations).

Thanks! Perhaps, but my initial goal was focused on speech recognition and text to speech in a mobile app.

If you are working on code, you are probably better suited with a desktop interface anyway.

I also want to keep it easy to share the responses as text/images by copying to the clipboard to paste into Messages/email/whatever. Getting fancier with markdown may make it more complicated both for the speech synthesis and the clipboard, but I’ll take a look at what’s involved with that.

Hi Robert!

I’m Claire. I work on Blockchain projects such as M2E, P2E, Metaverse, Alts, and NFTs - all already live on iOS, web, and Android.

I’m looking for the right person to talk to about a possible partnership opportunity in the Blockchain space, with OpenAI / GPT as part of our ecosystem’s integration tools.

We are constantly looking for innovative ways to bring the future of digital ownership to life while providing an enjoyable experience for our community across the globe.

If anyone is interested in a partnership with us and wants to know more details,
let’s connect on LinkedIn: https://www.linkedin.com/in/clairepikebalubal5207418
Links: https://tap.bio/@BlockTalk/cards/640686

Overwhelming user feedback has been that the first-time UX of my A.I. app was BROKEN. I’ve put aside my defensiveness and agree: it was pretty bad.

Here’s a revision. What do you think? Is it clear and easy to understand, or are there additional changes you would suggest?

The help/overview page is here: https://robertsmania.com/ai