I’ve got a LOT of dialogues and texts that I’ve been running embeddings on, mostly organized into different topic areas. What I’m finding is that topics related to emotions and motivations appear more likely to trigger the moderation filter. A small number of triggers is pretty normal for some topics (shopping triggered about 0.001%, which is great), but the rate is much higher in documents that deal with people’s fears (closer to violence, even when it’s indirect) and inner thoughts and monologues (possible self-harm, etc.).

So far I think we are staying well within the lines, but I have seen a few occasional surprises (particularly in documents that deal with character thoughts and motives). These files are part of our internal R&D activities for chatbot and prose writing, but with such large volumes I worry that a single mistake or bad document could cost me my API access. For that reason, any analysis and R&D on documents that deal more directly with things like scary emotions, or discussions that sometimes lean into political areas (we want to be able to recognize all categories of real-world discussion even if we don’t generate them), goes through another (lesser) embedding service altogether. I’ve avoided entering these slightly riskier docs into OpenAI’s embeddings for all the reasons above. They’re probably just fine, but I simply do not know for sure. I’d like to use Davinci for all of them; I just worry that I can’t, and I want to be on the safe side.
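For context, here is a minimal sketch of the kind of per-topic check I mean, assuming the current OpenAI Python SDK (the `docs_by_topic` dict and the example texts are just placeholders, not our actual pipeline):

```python
# Minimal sketch: measure what fraction of documents in each topic area
# gets flagged by the moderation endpoint before we ever embed them.
# Assumes the current OpenAI Python SDK; docs_by_topic is a placeholder
# for however the corpus is organized.
from openai import OpenAI

client = OpenAI()

docs_by_topic = {
    "shopping": ["I need a new pair of running shoes for the spring."],
    "inner_monologue": ["She wondered whether anyone would even notice if she disappeared."],
}

for topic, docs in docs_by_topic.items():
    flagged = 0
    for doc in docs:
        result = client.moderations.create(input=doc).results[0]
        if result.flagged:
            flagged += 1
    rate = flagged / len(docs) if docs else 0.0
    print(f"{topic}: {flagged}/{len(docs)} flagged ({rate:.4%})")
```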
So my question is… has anyone else had to contend with embedding large document sets that, simply because of their volume, are likely to contain at least a bit of potentially objectionable material? How did you handle that? And second, is there a way to request special permission to embed material on subject areas that contain a handful of topics not suitable for generation, but which you’d still like to detect accurately and respond to appropriately so you can steer around them? Politics is probably the best example I can think of, since it overlaps heavily with news, current events, and one’s worldview. A more nuanced recognition of it would be incredibly helpful for knowing when and how to steer around a topic, or whether something related can be safely engaged with.
For example, some obviously political and controversial topics may warrant a blunt wall: “I’m sorry, I’m not able to discuss that.” Others might be someone expressing a political point of view that could make for an interesting, fun, and safe discussion thread if managed well, e.g. “Sounds like you have some interesting thoughts on the benefits of capitalism and free markets. Do you think that will be equally important in the future? Why do you think there’s no such thing as money in the Star Trek universe?”
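To be concrete about the kind of “recognize and steer” behaviour I’m after, here is a rough sketch of the approach I have in mind: embed a few labelled example texts per topic, compare an incoming message against those centroids, and route to a response style based on the closest match. The model name, topic labels, example texts, and thresholds are purely illustrative, not anything I’ve validated:

```python
# Rough sketch: classify an incoming message by cosine similarity to
# per-topic centroid embeddings, then pick a response policy.
# Model name, example texts, and thresholds are illustrative only.
import numpy as np
from openai import OpenAI

client = OpenAI()
EMBED_MODEL = "text-embedding-3-small"  # placeholder; use whatever model you have access to

def embed(texts):
    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return np.array([d.embedding for d in resp.data])

# A few labelled examples per topic we want to recognize.
topic_examples = {
    "politics_hot": ["Which party should win the next election?"],
    "politics_soft": ["There's no money in the Star Trek universe, so is capitalism obsolete?"],
    "neutral": ["I'm thinking about buying new running shoes."],
}
centroids = {t: embed(ex).mean(axis=0) for t, ex in topic_examples.items()}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(message):
    vec = embed([message])[0]
    topic, score = max(
        ((t, cosine(vec, c)) for t, c in centroids.items()), key=lambda x: x[1]
    )
    if topic == "politics_hot" and score > 0.5:
        return "I'm sorry, I'm not able to discuss that."
    if topic == "politics_soft" and score > 0.5:
        return "Sounds like you have some interesting thoughts on free markets..."
    return None  # safe to continue the normal conversation flow

print(route("Who do you think should be president?"))
```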
In the future, we are also going to be getting into coaching applications (with expert feedback). If we do as good a job as I think we can, individuals are likely to express their emotions, fears, and feelings. Again, I worry that these may also trigger the safety filters, and for obvious reasons I’m not keen to test it for fear of losing my API access.
Any suggestions or advice in these areas would be most helpful. For the most part, the default is simply to steer things as conservatively as I can, but I do wish there were a way to use the embeddings with a broader range of material so that we can detect emotions, topics, and situations in a more nuanced way.