Clarifying Content Policy on Discussing Personal Experiences

On the matter of technical solutions, to my understanding these settings are tweakable (beyond mere prompt design, so as to adequately incorporate standing guardrail policies) in much the same way that one can specify weights for a LoRA model.

However, this method is a double-edged sword because it suffers from the “paperclip maximizer” problem: if you know you do not want a specific kind of content discussed in conversations, and you assign that prohibition the maximum negative weight, you may well be inadvertently making all related topics, however loosely related, unavailable as well.
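To make the over-blocking effect concrete, here is a toy sketch. Everything in it is hypothetical: the topic names, the similarity scores, and the idea of propagating a penalty by similarity are all made up for illustration, not taken from any real moderation system.

```python
# Toy illustration of over-blocking: a maximum penalty on one
# prohibited topic, propagated by a crude similarity threshold,
# silently blocks loosely related topics too. All topics and
# similarity scores below are invented for illustration.

BANNED_TOPIC = "explosives"
MAX_PENALTY = -1.0
SIMILARITY_THRESHOLD = 0.3  # anything this "close" inherits the full penalty

# Hypothetical similarity of each topic to the banned one.
similarity = {
    "explosives": 1.0,
    "pyrotechnics safety": 0.6,   # loosely related
    "chemistry homework": 0.4,    # barely related
    "gardening": 0.05,            # unrelated
}

def topic_weight(topic: str) -> float:
    """Apply the full penalty to anything above the threshold."""
    if similarity[topic] >= SIMILARITY_THRESHOLD:
        return MAX_PENALTY
    return 0.0

blocked = [t for t in similarity if topic_weight(t) == MAX_PENALTY]
# "pyrotechnics safety" and "chemistry homework" end up blocked along
# with the banned topic itself -- the collateral damage described above.
```

With a single hard threshold there is no middle ground: either a topic inherits the maximum penalty or none at all, which is exactly why weighting needs tuning rather than a blanket maximum.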

Therefore, IMHO, developers and policymakers should work together to experiment with various combinations of weights so as to reach an optimal scenario.

There are several semi-automatic strategies to achieve this beyond mere trial and error. Multi-objective optimization techniques, for example, consider various linear combinations of weights simultaneously and return a set of candidate solutions along a Pareto front. Solutions can then be cherry-picked from that front according to domain-specific knowledge, or via a more sophisticated selection algorithm that accounts for the spread of solutions along the fronts and other variables.
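The Pareto-front step above can be sketched in a few lines. This is a minimal non-domination filter, assuming each candidate weight setting has already been scored on two hypothetical objectives (a safety score and a topic-coverage score, both to be maximized); the candidate values are invented for illustration.

```python
# Minimal sketch of Pareto-front extraction over candidate guardrail
# weight settings. The objectives (safety, coverage) are hypothetical
# stand-ins for whatever metrics a real evaluation pipeline produces.

def pareto_front(candidates):
    """Return the non-dominated candidates.

    Each candidate is (weight, objectives), where objectives is a
    tuple of scores to maximize. A candidate is dominated if some
    other candidate is at least as good on every objective and
    strictly better on at least one.
    """
    front = []
    for i, (_, obj_i) in enumerate(candidates):
        dominated = any(
            all(a >= b for a, b in zip(obj_j, obj_i))
            and any(a > b for a, b in zip(obj_j, obj_i))
            for j, (_, obj_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append(candidates[i])
    return front


# Hypothetical candidates: (penalty weight, (safety, coverage)).
candidates = [
    (-1.0, (0.95, 0.40)),  # heavy penalty: very safe but over-blocks
    (-0.5, (0.85, 0.70)),
    (-0.3, (0.80, 0.75)),
    (-0.1, (0.60, 0.90)),  # light penalty: permissive
    (-0.6, (0.80, 0.60)),  # dominated by the -0.5 candidate
]

front = pareto_front(candidates)
# The dominated candidate is filtered out; the remaining four trade
# safety against coverage, leaving the final pick to domain experts.
```

This brute-force check is quadratic in the number of candidates, which is fine for the handful of weight combinations one would realistically evaluate; dedicated libraries use faster non-dominated sorting when populations get large.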

But for now it seems that they are taking the sledgehammer approach. And I don’t blame them… surely they have a lot of stuff to think about.

I am just concerned about the adverse effects this approach is having on the product and on how the user base perceives it.
