So, I’ll try to keep this as short as possible, but it’s gonna be long anyway.
Okay, so the first thing I used it for as an accessibility tool was turning top(1) output on Linux into a narrative format. I do that with just about any command output that comes in a tabular layout, since tables are tedious to read line by line with a screen reader. I’m not a developer, but I try to use the command line because sometimes it’s easier than trying to use a GUI.
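In case it helps anyone, the whole trick is just capturing the command’s output and pasting it into a prompt. Here’s a minimal sketch of the idea in Python using the official openai package; the model name and the prompt wording are my guesses at a reasonable setup, not an exact recipe.

```python
# Minimal sketch: grab one batch-mode snapshot of top and ask for a
# narrative version. Assumes `pip install openai` and an OPENAI_API_KEY
# environment variable; the model name is an assumption.
import subprocess
from openai import OpenAI

# -b is batch mode (plain text, no curses UI), -n 1 means one snapshot
top_output = subprocess.run(
    ["top", "-b", "-n", "1"], capture_output=True, text=True
).stdout

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any current chat model should work
    messages=[{
        "role": "user",
        "content": "Rewrite this top output as a short narrative, "
                   "most important processes first, no tables:\n\n" + top_output,
    }],
)
print(response.choices[0].message.content)
```

The same pattern works for df, ps, or anything else that dumps a table at you.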
The second thing is describing images. Be My Eyes is great on the iPhone, but they don’t have a desktop app, and their Android version doesn’t have the feature where I can send photos to it, so I have to use the ChatGPT app for that. I’ll get to the accessibility of the ChatGPT apps themselves in a bit. Describing images can be really great, but it’s hard to tell if it’s accurate, because I’m blind and can’t judge the accuracy myself. So I don’t rely on it all the time, but it is really nice for pictures sent to me, or public social media stuff like Facebook pictures.
The next thing is video games, and this is kinda where it fails sometimes. In visual novel games, like the story mode of BlazBlue, it works really well, but in action games, like Chrono Trigger or Final Fantasy, it doesn’t know where things are and can’t always lead me to them. I’ve also tried sending the OpenAI API a batch of images to describe them like a video, and it can’t handle more than maybe 50 images at a time. So sending it videos of gameplay doesn’t seem like it’s gonna work well.
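For anyone who wants to try it, the experiment looked roughly like this sketch. The folder of frames, the file names, and the model are made up for illustration; the point is that every frame goes into a single message as an attached image.

```python
# Hedged sketch of the "frames as a video" experiment: all frames are
# attached to one chat message. In my experience this starts failing
# somewhere around 50 images per request.
import base64
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def frame_part(path: Path) -> dict:
    """Encode one screenshot as an inline data-URL image attachment."""
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{data}"}}

# hypothetical folder of numbered gameplay screenshots
frames = sorted(Path("frames").glob("frame_*.png"))[:50]
response = client.chat.completions.create(
    model="gpt-4o",  # assumption: a vision-capable chat model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "These are consecutive gameplay frames, in order. "
                     "Describe what happens across them, like narrating a video."},
            *(frame_part(f) for f in frames),
        ],
    }],
)
print(response.choices[0].message.content)
```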
Okay, so now the ChatGPT apps themselves. iOS works best, naturally: VoiceOver speaks new messages as they appear, and the app works pretty well. The Android app is accessible, but messages aren’t automatically spoken by TalkBack. On the web, it’s much worse. The control where you choose a model reads as plain text to screen readers, even though pressing Enter on it opens a list of models. A screen reader user wouldn’t think to press Enter on it, since it isn’t exposed as a button, link, or dropdown box (a combo box, as screen readers call them). Messages aren’t spoken as they appear either, so we have to keep checking the “stop generating” button and wait for it to change to “regenerate” or whatever it says, which is the only way to tell that the message is done generating. AI companies don’t really focus on accessibility; Google, and somewhat Microsoft, are the exceptions, and even their accessibility teams have to work very hard to be heard over all the other teams.
So, we have to make our own interfaces to OpenAI. We have this NVDA add-on: GitHub - aaclause/nvda-OpenAI: Open AI NVDA add-on, which works very well, and I use it a lot. But it uses the OpenAI API, even though I already pay $20 a month for ChatGPT Plus, so I’m essentially paying extra for accessibility. Also, from what I’ve heard, the vision model the regular OpenAI API exposes is different from the one Be My AI uses, so we don’t get details like popular characters in comic books, or place names and such, that Be My AI happily gives us. So even when we use accessibility tools, we still get left out in some cases, because Be My Eyes is a small team that prioritizes iOS over what a lot of blind people actually use more: the web, desktop, and Android.
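For the curious, what the add-on (or anything else built on the API) is doing for a single picture is roughly this. Again just a sketch, with the model and file name assumed, and every one of these calls is billed per token against the API key, on top of the Plus subscription.

```python
# Rough sketch of a single image description request, the kind an
# API-based accessibility tool sends. Billed against the API key,
# separately from ChatGPT Plus; the model name is an assumption.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("photo.jpg", "rb") as f:  # hypothetical file name
    photo = base64.b64encode(f.read()).decode("ascii")

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: a vision-capable chat model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this photo in detail for a blind user, "
                     "including any text you can see."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{photo}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```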
Oh yes, I’ve also tried reading comic books with GPT-4 Vision. It works well through Be My AI, the GPT-4 Vision part of Be My Eyes, but since I have an iPhone SE 2020 (a tiny iPhone, because I’m not rich), it can’t make out the speech bubbles. So no comic books there. And on a PC, where I can send it big enough pictures, it’ll sometimes say it’s not allowed to read the stuff. So no Batman: The Killing Joke for me.
Overall though, it’s pretty helpful. I wish the model had much more data about accessibility: blindness forums, accessibility blogs, standards like WCAG, and Braille. But it’s gotten better since GPT-4 came out. I know a lot of people just use the free GPT-3.5 and give it a bad name because of that, but it has improved. I just wish access to images were equally good across desktop, web, and mobile, that AI companies paid tons more attention to accessibility (those AI pins and phones, for example, have no accessibility info whatsoever), and that ChatGPT was accessible everywhere.