I have built a “RAG” = “Retrieval Augmented Generation” implementation on top of a Drupal CMS. Basically a fancy way of saying a Drupal “chat with your pdf” system.
Seeing that we have a whole generation of young people who have grown up using only their smartphones for news and information (and social contact), I though it would be a good idea to develop a smartphone app to query the system. To that end, I created REST API access. Then I thought, if I’m going through all this trouble, why not add the ability to send queries via email? And, eventually, text message?
I’m thinking both of these ae great ideas for the demographic who will use their smartphone before they use a computer, laptop or tablet.
I’m wondering what do you out there in the OpenAI developer community think about this?
How? Token usage depends upon queries, not delivery method. Unless you mean referring to the development costs. Also, with SMS, I expect there is a cost for each text message? Then, if you are like me, using a 3rd party delivery system (Zapier), there is that as well. So, I guess you’ve got a point as I think about it.
Yeah, that’s what I was referring to. I imagine it’s a lot cheaper now than when I looked into it in 2012 or so… some authors went that route back in the day…
Yeah, about a year ago I interfaced GPT-3 Davinci to my smartphone over SMS. It’s awesome! Haven’t had time to refactor to GPT-4, but yeah, it’s a good way to have a personal assistant while you are at Costco or whatever.
I figure once I get email working, sms should be a piece of cake (already worked with these Zaps before). You just made me realize that if I get SMS working, I won’t need to develop an app! (and learn React) It’ll be the same difference, just way cooler.
Yeah you don’t need React or any sort of web framework for it to work. SMS or Email is great for back and forth communication.
Like you mentioned, you could do it through Zapier.
Me, I used the Twilio API, AWS (Lambda + DynamoDB + API Gateway), and of course the OpenAI API.
From my phone, I can even change the tone or personality of the bot by issuing the commands over SMS, and this would set a different prompt for all future responses until I changed it to something else.
It’s a fun project and very useful! And still relevant, even with ChatGPT on smartphones, because you can utilize embeddings (semantic search, RAG, etc) to really customize your personal assistant.
So, I’ve got the SMS working now. Only needed the Vonage API and a number.
I don’t need to change the personality, but I was curious if there was a way to change the libraries. On the website query screen, I have several libraries a user can select to narrow his search.
These are metadata options used to filter the cosine similarity vector search.
Now, with the SMS interface, I basically only have the text message itself to work with. Can you think of a strategy for being able to select specific libraries for the query? Maybe functions?
You could use cosine similarity on the input and against your categories (and further details of your categories) to dynamically put you in a category each time, or have it integrate and determine the category after a few initial turns.
You could try a fine-tune to categorize what the topic is, but that could get weird with all the categories you have.
Or you can do progressive keyword correlations … using something like BM25. Working on my own keyword correlation algorithm now, so it’s on my brain.
Thinking this through, I came up with another alternative – good old regex. I instruct users to say a specific phrase (you know, like with Google Assistance “AI” – lol) like “select libraries:” or “include libraries:” then use regex to capture that input.
However, the next problem is maintaining that library “state” through the next query in the same session. Fortunately, I already have a model for this as I’ve been using good ol’ SQL to maintain chat history in API calls.
When I think about all the legacy technology that is necessary to efficiently utilize this new technology, I’m struck by all the YouTubers out there saying “AI is going to replace developers.” Yeah, right.
As a long-term dream project you could think about linkinh whisper, gpt-4, and a voice from elevenlabs to an asterisk box so you can call and talk to gpt-4 from anywhere. But, the real power though would come from being able to have it make calls on your behalf.
I’ve started working with Twilio in addition to Vonage. Twilio has a Whatsapp API, so I may look into that. I personally don’t use Whatsapp, but I guess I might have to take the plunge as everyone else in the world apparently does use it! At least everyone under 45!
You probably could build your tool with their free credits at Vontage
Access to GPT via sms is a great idea
Say give the user like 100 free then hit them with a link to get more access or less
Using vontage is fun and easy I did a mock up prototype using it and used up all my free credit
I might need to give them a credit card this would be a great addition to my site
Who knows it could become a thing I can’t guess what this generation thinks
But it would be handy if you wanted a quick answer in the middle of a conversation on text
Email? Yeah it could be helpful there too include GPT as a cc and join the email thread hell yeah . Going to be a lot more challenging than sms however. But it could be hacked pretty sure, could dust off the old text based mail server.
That was the easy part. The hard part was figuring out how to authenticate the emails since anyone can spoof anyone else’s email address. The solution I came up with was to use email verification on the first API call, then the user can send emails for X minutes until the certification times out.
Only way I see to do it right now as leaving it open is asking for trouble.
I don’t see where the problem with making calls would be.
Let a software make a call. And then send an initial request to gpt-4 like “hey, there is a guy on the phone you just called him to promote…”
I was more like thinking what happens if “a” bot calls someone records a little bit of his voice, does some internetresearch and finds relatives of that guy and then calls them with his voice, and also records their voice to call him… taking everyone out of their reality and causing any kind of bad stuff… like traffic jams in the lowest (let’s all meet right now there and there!!!). to financial chaos or even I mean what would happen If a bot breaks up with all partners in the world?
Could be taken care of long time ago. e.g. by two factor auth before you can use your phone.
The “two factor auth” was just a cynical joke about security.
Now imagine a “small” drone army with grenades on that flies into your open window in the middle of the night and “asks” you to hand over your phone and pin. Then drops the grenade and flies away with it.
Would be possible to collect thousands of phones in no time (imagine some secret service would want to take over a country and has the capital and man power). And the bot takes over the identity of that person… this all happens in a couple of hours.
And then some cloud administrators, politicians and generals call in sick…