Tricks for diversity of outputs when producing large sets?

MikeG · July 20, 2023, 4:37pm

Hey all! I am writing about 200 synthetic day-in-the-life journal entries. Ideally, these are from different “people” and focus on different subjects.

Logically, when you run a generic prompt to “write a journal entry about your day, include specifics and details…” the entries come out pretty similar.

I generated a Python list of 200 different jobs, ages, marital statuses, etc. Then I ran my prompt in a while loop and appended to the end of each prompt “Write as if you are a {insert specific life situation from python list}.” This effectively gave me 200 different prompts.
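A minimal sketch of that setup, assuming a short hypothetical `profiles` list and base prompt in place of the real 200 entries (the actual API call is left out):

```python
# Hypothetical stand-ins for the 200-entry list described above.
BASE_PROMPT = "Write a journal entry about your day, include specifics and details."

profiles = [
    "34-year-old married nurse",
    "19-year-old single college student",
    "61-year-old widowed retired carpenter",
]

def build_prompts(base, situations):
    """Append one life situation to the base prompt per entry."""
    return [f"{base} Write as if you are a {s}." for s in situations]

prompts = build_prompts(BASE_PROMPT, profiles)
for p in prompts:
    print(p)
```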

However, while this approach improved my outputs a little, there is still a very generic feel and structure to all of them. Something like 20% of them start with some variation of “Man, today was a rollercoaster!”

I’m wondering if people have tried to achieve something similar and what they have done?

_j · July 20, 2023, 4:52pm

I’ve tried noncing, i.e. padding the prompt with irrelevant context for the model to ignore, but the AI does a pretty good job of ignoring it. You can still try it, though: a large context changes how closely the instructions get followed.

One plausible but costly concept: load the 16k-context GPT-3 API model up with dozens or all 200 profiles. Then, when you give the instruction “write #88’s daily journal”, the large context will produce different positional lossy effects.
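A rough sketch of how such a prompt could be assembled, assuming placeholder profile strings and a hypothetical `context_prompt` helper. Whether the positional effects actually add variety is untested here:

```python
def context_prompt(profiles, target):
    """Number every profile, then ask for one of them by its 1-based number."""
    if not 1 <= target <= len(profiles):
        raise ValueError("target out of range")
    listing = "\n".join(f"#{i}: {p}" for i, p in enumerate(profiles, start=1))
    return f"{listing}\n\nWrite #{target}'s daily journal."

# Placeholder profiles; the real ones would be the 200 life situations.
profiles = [f"profile {n}" for n in range(1, 201)]
prompt = context_prompt(profiles, 88)
print(prompt[-40:])
```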

Roleplay characters are all now very canned and hammy. It’s almost like OpenAI decided to add “Ah, I see” fakery detection phrases to everything that’s not a chatbot - and I get back exactly what I expect, again:

Act as a learned professor in the art of juggling mousetraps.
Ah, welcome, my esteemed audience,

Curses, feeble-minded bot.

You can try “write in the style of” - and then pick somebody unexpected that doesn’t have catchphrases.

There’s a whole new world of output to explore if you use the API davinci GPT-3 engine.

codie · July 20, 2023, 4:56pm

You probably just need to add a “style” list and turn top_p to 1. Sounds like you have life content, but not personality content. Like “Cheery southerner”, “Serious educated business man.”
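Something like this, as a sketch: cross a life-content list with a style list so each situation gets written in each voice. The `situations` entries and the prompt wording are made up for illustration, and `SAMPLING` only shows where `top_p` would go in a request.

```python
from itertools import product

# Hypothetical life-content entries; the styles are the two examples above.
situations = ["34-year-old married nurse", "19-year-old single college student"]
styles = ["Cheery southerner", "Serious educated business man"]

# Sampling setting suggested above; pass it along with each API request.
SAMPLING = {"top_p": 1}

prompts = [
    f"Write as if you are a {situation}. Personality: {style}."
    for situation, style in product(situations, styles)
]
print(len(prompts))  # 4
```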

Be careful with race and region though; I’ve noticed it sometimes writes racist stuff, specifically about Asians.

vb · July 20, 2023, 5:09pm

Hi! You can start with something like “How do you imagine a chef in an Italian restaurant to speak?” and store the answer with the role. Next time you request the journal entry, pass it as context and prompt the model to use the style.

Of course you can’t ask the model to imagine anything but you get the idea.
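Roughly, as a sketch: `ask_model` is a hypothetical callable standing in for whatever API call you use, and `fake_model` is a placeholder so the example runs offline.

```python
from typing import Callable, Dict

def get_style(role: str, ask_model: Callable[[str], str], cache: Dict[str, str]) -> str:
    """Ask once how a role speaks, then reuse the stored answer."""
    if role not in cache:
        cache[role] = ask_model(f"How would {role} speak? Describe the voice briefly.")
    return cache[role]

def journal_prompt(role: str, style: str) -> str:
    """Pass the stored style description back in as context."""
    return (
        f"Style notes for {role}: {style}\n"
        "Write a journal entry about your day in this style."
    )

# Placeholder model so the sketch runs offline; swap in a real API call.
def fake_model(prompt: str) -> str:
    return "Fast, warm, sprinkled with kitchen slang."

style_cache: Dict[str, str] = {}
role = "a chef in an Italian restaurant"
prompt = journal_prompt(role, get_style(role, fake_model, style_cache))
print(prompt)
```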

Hope this helps!

MikeG · July 24, 2023, 6:37pm

Interesting! Thanks for this idea. I may base one off of ENTJ Myers Briggs.

wswitzer · July 27, 2023, 6:56pm

On top of using MBTI, you may be able to apply Plutchik’s Wheel of Emotions to have the bot choose an emotion for the output: a sad ENTJ, a blissful ENTP, an angry ISTP, etc. We might apply this to a bot we’re building, but using Enneagram with wings and levels of integration instead of MBTI. I proposed it, but we’re still working on some base functionality.

MikeG · July 28, 2023, 12:03am

So I’ve put in a couple of days on this now, with some troubleshooting. I found that the models don’t understand a simple MBTI acronym very well. You get weird stuff like, “As an ENTJ I enjoy challenges in the workplace…”

What I ended up doing was getting GPT-4 to create an array of 100 unique emotional states from as general as “Angry” to as specific as “yearning for unrequited love”. Then I pick a random one each time and insert it into the prompt in a line like "The overwhelming emotion you are feeling at this moment is: {random emotion}."
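In sketch form, with a three-item sample standing in for the 100 generated states and a hypothetical `with_emotion` helper:

```python
import random

# Small hypothetical sample; the real array had 100 GPT-4-generated states.
emotions = ["Angry", "Content", "yearning for unrequited love"]

def with_emotion(base_prompt, rng=random):
    """Append a randomly chosen emotional state to the prompt."""
    emotion = rng.choice(emotions)
    return (
        f"{base_prompt} The overwhelming emotion you are feeling "
        f"at this moment is: {emotion}."
    )

print(with_emotion("Write a journal entry about your day."))
```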

Foxalabs · July 28, 2023, 6:18am

Hi,

I just noticed you mentioning issues with how the model handles the MBTI acronym. This is often down to the way the model tokenizes word parts: to you ENTJ looks like 4 letters, but to the model it looks like 2 tokens.

As a test, can you try inserting a separator character such as - (dash) to give E-N-T-J? The model should then see each of the 4 letters as a separate token.

You can then strip the - back out to rebuild your string. See how that works for you.
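A small sketch of the round trip, with hypothetical helper names. It does not check token counts, which depend on the tokenizer in use:

```python
def add_separators(code, sep="-"):
    """Put a separator between every character: ENTJ -> E-N-T-J."""
    return sep.join(code)

def strip_separators(text, sep="-"):
    """Remove the separators again. Note this removes every occurrence
    of sep, so apply it to the code itself, not to hyphenated prose."""
    return text.replace(sep, "")

spaced = add_separators("ENTJ")
print(spaced)                    # E-N-T-J
print(strip_separators(spaced))  # ENTJ
```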

MikeG · July 28, 2023, 11:50pm

This is a high-quality insight. Thank you. I will explore this.

_j · July 29, 2023, 12:24am

I had an idea, where you just take one from each column to make your personality. This is the bot’s idea of meeting my specs.

AI: Below is a table with four characteristic categories as column headers, and 30 rows with writing examples and corresponding personality characteristics. The columns group similar facets of information together, and the entries within each row do not conflict with each other. Please note that these are fictional characters, and the provided personality characteristics are purely for creating distinct writing styles:

Personality Trait	Age & Ability	Positive	Negative
1. Optimistic	35 years old, Novelist	Always sees the bright side	Tends to be naive
2. Cynical	42 years old, Journalist	Sharp wit and satire	Pessimistic worldview
3. Enthusiastic	28 years old, Blogger	Exclamation marks aplenty	Sometimes overly verbose
4. Analytical	45 years old, Researcher	Logical reasoning and evidence-based writing	Lacks emotional depth
5. Whimsical	19 years old, Student	Imaginative and creative language	Difficulty staying focused
6. Precise	50 years old, Editor	Meticulous attention to detail	Can be overly pedantic
7. Laid-back	33 years old, Traveler	Informal, colloquial expressions	May come across as lazy
8. Authoritative	60 years old, Historian	Firm and confident tone	Tends to be domineering
9. Empathetic	27 years old, Therapist	Understanding and compassionate	Overuses qualifiers (e.g., maybe)
10. Humorous	38 years old, Comedian	Witty remarks and humorous anecdotes	May rely too much on jokes
11. Pensive	31 years old, Philosopher	Thought-provoking and contemplative	Can be overly abstract
12. Confident	48 years old, CEO	Assertive language	Comes across as arrogant
13. Nurturing	29 years old, Parent	Warm and caring tone	May be too sentimental
14. Stoic	55 years old, Veteran	Dispassionate and stoic writing	Lack of emotional expression
15. Playful	22 years old, Gamer	Uses gaming slang and metaphors	May lack depth in serious topics
16. Persuasive	37 years old, Salesperson	Convincing arguments and rhetoric	May oversell or manipulate
17. Curious	26 years old, Explorer	Inquisitive and eager to learn	Easily distracted by new ideas
18. Diplomatic	43 years old, Diplomat	Tactful and diplomatic language	May avoid taking clear stances
19. Rebellious	20 years old, Rebel	Defiant and anti-establishment	Can be confrontational
20. Melancholic	41 years old, Poet	Reflective and emotional	May lean towards melodrama

AI: You can now combine one characteristic from each column to create a unique personality for a character. For example, “joe’s writing is: Rebellious, 20 years old, Defiant, and tends to be confrontational.” This combination will create a distinctive writing style for Joe based on the specified attributes.

Obviously cripplebot doesn’t write at the length we want, but you can probably just “continue” this list, or drop a refined table header for a different list.
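A sketch of the one-from-each-column pick, using three rows copied from the table and a hypothetical `random_personality` helper. Mixing across rows can produce combinations that clash, so some filtering may be needed. With all 20 rows there are 20^4 = 160,000 possible combinations.

```python
import random

# Three rows copied from the table above, split into their columns.
traits = ["Optimistic", "Cynical", "Rebellious"]
ages = ["35 years old, Novelist", "42 years old, Journalist", "20 years old, Rebel"]
positives = ["Always sees the bright side", "Sharp wit and satire",
             "Defiant and anti-establishment"]
negatives = ["Tends to be naive", "Pessimistic worldview", "Can be confrontational"]

def random_personality(rng):
    """Pick one entry from each column independently."""
    return {
        "trait": rng.choice(traits),
        "age_ability": rng.choice(ages),
        "positive": rng.choice(positives),
        "negative": rng.choice(negatives),
    }

persona = random_personality(random.Random())
print("Joe's writing is: " + "; ".join(persona.values()))
```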
