Extracting Personalities from past Conversations?

I am wondering what others’ ideas are on extracting ‘personalities’ using past conversations as training data. For example, suppose we are only interested in one person’s personality, based on N past conversation transcripts they have had.

Here is a simple example:

Let’s say I want to replicate a bot that responds like Dieter (Mike Myers) from the SNL skit “Sprockets!”.

There are 14 “Sprockets” skits from 1989 to 1997. So not a lot of data, although there is also a movie screenplay out there too (a bit more data). When I ask GPT-3 to emulate “Dieter from Sprockets” it does a poor job IMO … probably because of a lack of data … so basic GPT-3 prompting is out. Also, I’m thinking of the case of a random person GPT-3 doesn’t know about, so basic GPT-3 prompting is out in this case too.

So to solve these problems, let’s just say I have tons of conversational data, think all the Johnny Carson transcripts, where the person of interest is talking to another person. This isn’t just a question/answer session, but a conversation.

So with this data, where the person of interest is talking to other people … and you have lots of conversations on record and in text format … what would be the best way to encode this person in a model?

Here is what I wish … Just upload all the conversations with “Dieter” and “Person X” and let the network train itself on how Dieter typically responds to people.

Now I already know what some of the responses might be … here is what you might answer with:

  1. Use prompts with examples of “Dieter” and “Person X” and then ask GPT-3 to complete it for new incoming text.

My concern with this is more information-theoretic … the prompt size is too small to really capture the entire personality of someone. You can get a decent response, but not a very accurate one.

  2. Use embeddings to get the most relevant selections of “Dieter” and then feed these more accurate and/or relevant details to the prompt and ask GPT-3 to complete it.

Again, my concern is information-theoretic … the number of bits in someone’s personality is much bigger than the prompt size of GPT-3. But this technique should give a better answer than (1), since the prompt is more relevant to the input.
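As a rough sketch of option (2): rank stored Dieter exchanges by similarity to the incoming line and splice the top matches into the prompt. The bag-of-words cosine here is only a toy stand-in for a real embedding model, and the exchanges are invented examples:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Past exchanges: (what Person X said, how Dieter replied) -- invented examples.
exchanges = [
    ("Tell me about your art.", "Your story has become tiresome."),
    ("Do you like monkeys?", "Would you like to touch my monkey?"),
    ("What is German television like?", "Now is the time on Sprockets when we dance."),
]

def build_prompt(new_input, k=2):
    # Rank stored exchanges by similarity to the incoming line, keep the top k.
    ranked = sorted(exchanges,
                    key=lambda e: cosine(embed(e[0]), embed(new_input)),
                    reverse=True)
    examples = "\n".join(f"Person X: {q}\nDieter: {a}" for q, a in ranked[:k])
    return f"{examples}\nPerson X: {new_input}\nDieter:"

print(build_prompt("Do you like my monkey?"))
```

The prompt still has the size limit discussed above; retrieval only makes the limited budget more relevant.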

OK, so the only option it seems like I am left with is a fine-tune, where I can influence a large set of network parameters (OK, so I’m feeling better about not having enough information). And this is where I get stuck. Fine-tuning seems to be a one-way thing … and the direction is always “answering” or “concluding” something from the input prompt. The only way I can think around this is to have multiple fine-tunes going in both directions (questioning and answering) and somehow having a metric decide if the text is incoming or outgoing, then use the appropriate fine-tune to respond.
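That routing metric could start as something very simple: guess whether the incoming text is a question, then pick the matching fine-tune. A minimal heuristic sketch (the model names are placeholders for the two hypothetical direction-specific fine-tunes):

```python
def route(turn):
    """Toy direction metric: decide whether the incoming text is a question,
    then pick which (hypothetical) fine-tuned model should respond."""
    words = turn.strip().lower().split()
    first = words[0] if words else ""
    is_question = turn.strip().endswith("?") or first in {
        "who", "what", "when", "where", "why", "how",
        "do", "does", "is", "are", "can"}
    # Placeholder names for the two direction-specific fine-tunes.
    return "dieter-answering-ft" if is_question else "dieter-questioning-ft"

print(route("What is your favorite color?"))   # dieter-answering-ft
print(route("I enjoy industrial music."))      # dieter-questioning-ft
```

A real version would probably use a classifier rather than keyword rules, but the dispatch structure is the same.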

Maybe I answered my own question, but who knows, curious as to what others have tried.

And oh, if there is a way to upload conversations and let the network train and extract how the person of interest would respond, it would make this extraction process much more straightforward!

6 Likes

I have the same use case/question and am surprised no one seems to talk about this.

Definitely the best approach is embedding with semantic search in a QA model based on lots of text that person wrote, with fine-tuning on top of it. I’ve completed the former, but as you say, it gets the information but doesn’t ‘speak’ syntactically like that person.

I’m going to be experimenting more with fine-tuning and will let you know how it goes, but the struggle here is having the correct format to fine-tune writings/speech in. If you want to fine-tune based on the text of a book, what’s the right input to write for those output paragraphs? There seems to be no good design for this particular situation.
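For the book case, one possible (untested) design is to treat each paragraph as the prompt and the following paragraph as the completion, so the model learns to continue prose in the author's voice. A sketch in the standard prompt/completion JSONL format, using dummy paragraphs:

```python
import json

book_paragraphs = [
    "First paragraph of the book text.",
    "Second paragraph of the book text.",
    "Third paragraph of the book text.",
]

def to_jsonl(paragraphs):
    """Each paragraph becomes the prompt and the paragraph that follows it
    becomes the completion. The '###' separator and ' END' stop sequence
    follow the common prompt/completion fine-tuning conventions."""
    lines = []
    for prev, nxt in zip(paragraphs, paragraphs[1:]):
        lines.append(json.dumps({
            "prompt": prev + "\n\n###\n\n",
            "completion": " " + nxt + " END",
        }))
    return "\n".join(lines)

print(to_jsonl(book_paragraphs))
```

Whether paragraph-to-next-paragraph is the *right* input/output pairing is exactly the open question in this post; this just shows one concrete way to try it.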

1 Like

Very interesting topic, unfortunately no discussion on it. Have you managed to achieve anything further?

My idea is to train the chatbot to speak like Aristotle by feeding it the Nicomachean Ethics and some of his other works. Not sure this would work like I am supposing it will, so I would love to hear your experience.

Could it be beneficial to batch & analyze the conversations to extract personality from it?

Read this batch of conversation and extract/reinforce any personality traits from Dieter
Conversation:
Personality Traits: (filled with previous entries)

After batching lots of conversations, maybe you would have a strong personality profile of the character?
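The batching loop described above might look like this in code. `fake_llm` is a stub standing in for the real completion call, so the sketch runs without an API key; in practice you would send the prompt shown in the post to the API:

```python
def run_batches(conversations, llm, batch_size=3):
    """Feed conversations through in batches, carrying the accumulated
    personality-trait list forward each time. `llm` stands in for the
    actual completion call."""
    traits = ""
    for i in range(0, len(conversations), batch_size):
        batch = "\n---\n".join(conversations[i:i + batch_size])
        prompt = ("Read this batch of conversation and extract/reinforce any "
                  "personality traits from Dieter\n"
                  f"Conversation:\n{batch}\n"
                  f"Personality Traits: {traits}")
        traits = llm(prompt)  # model returns the updated trait list
    return traits

# Stub "LLM" so the sketch runs end to end.
def fake_llm(prompt):
    prior = prompt.rsplit("Personality Traits: ", 1)[1]
    batch_size = prompt.count("\n---\n") + 1
    note = f"traits from {batch_size} conversations"
    return f"{prior}; {note}" if prior else note

print(run_batches([f"conversation {i}" for i in range(7)], fake_llm))
```

The key design point is that each call sees the previous trait list, so traits get reinforced (or revised) rather than extracted from scratch.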

I’m also interested in seeing how fine-tuning plays out. It would be very useful to test your model. Ex. Feed it 75% of all conversations, and then see if it can complete 25% of the other conversations in the same way.
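The 75/25 holdout test is straightforward to set up; the conversation list here is dummy data:

```python
import random

conversations = [f"conversation {i}" for i in range(100)]
random.seed(0)            # fixed seed just to make the example repeatable
random.shuffle(conversations)

cut = int(len(conversations) * 0.75)
train, holdout = conversations[:cut], conversations[cut:]
# Fine-tune on `train`, then ask the model to complete the held-out
# conversations and compare its replies against the real ones.
print(len(train), len(holdout))   # 75 25
```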

Good luck and please let us know of your findings!

1 Like

I’ve been testing personality and character via fine tuning. After a few false starts I’ve had some reasonable results, but only with a lot of phrases. I suspect doing this well means having a huge amount of conversational data. I will be pursuing this further in the coming months.

3 Likes

I think I found a solution in another thread on this forum. I was going about it wrong.

You need to take text from the persona you want to clone, run it through GPT to neutralize it, then create a fine-tune on the inverse, so fine-tune Neutralized → Persona.

You have one fine-tuned model for each persona. To render the persona, you take the neutralized text and run it through the fine-tuned model.
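A sketch of building the Neutralized → Persona training file for that inverse fine-tune. Here `neutralize` is a hand-written lookup standing in for the GPT neutralization call, and the persona lines are invented examples:

```python
import json

def neutralize(text):
    """Stand-in for a GPT call that strips the persona's style; here just a
    hand-written lookup over two invented example lines."""
    return {
        "Your story has become tiresome.": "I am bored by this story.",
        "Now is the time on Sprockets when we dance.": "Let us dance now.",
    }.get(text, text)

persona_lines = [
    "Your story has become tiresome.",
    "Now is the time on Sprockets when we dance.",
]

# Training pairs run neutral -> persona, so that at inference time a
# neutral-voice completion can be re-voiced by the fine-tune.
jsonl = "\n".join(
    json.dumps({"prompt": neutralize(line) + "\n\n###\n\n",
                "completion": " " + line + " END"})
    for line in persona_lines)
print(jsonl)
```

Note the inversion: the *neutral* text is the prompt and the *styled* original is the completion, which is what lets the fine-tune add the voice back.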

More conversation over at this thread:

1 Like

I’m curious about your results, please let us know. I’ve been facing a similar problem in a different context.

My main concern is that I’ve been reading that fine-tuning will not ‘add information’, but will teach a specific task. If this is true, I am not sure fine-tuning will solve this issue…

Any thought about this?

Right, so to add information, use embeddings, then render this new information through the fine-tune to create the persona or “voice”. You need both.

2 Likes

Hey Curt mind providing an example of how embedding would work in this scenario?

A quick rundown …

The persona has two components, (1) Voice/Style and (2) Information to convey.

For the information you need to convey, follow the standard embedding approach HERE!

The only nuance is that your information is in the neutral voice, and your completion is in the neutral voice.

Then, you take the completion in the neutral voice and send it to the fine-tune that adds the Voice/Style.

It’s a two-step process. First step is information retrieval using embeddings. Second step adds tone/personality with the fine-tune.
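The two-step process can be sketched as a function of three callables, one per stage; everything below is a hypothetical stub standing in for the real retrieval, completion, and fine-tune calls:

```python
def answer_in_persona(question, retrieve, answer_neutral, add_voice):
    """Two-step sketch: embeddings retrieval feeds a neutral-voice answer,
    then the persona fine-tune restyles it. All three callables are
    hypothetical stand-ins for the real API calls."""
    context = retrieve(question)                 # step 1: embedding search
    neutral = answer_neutral(question, context)  # neutral-voice completion
    return add_voice(neutral)                    # step 2: persona fine-tune

# Stubs so the sketch runs end to end:
reply = answer_in_persona(
    "When does the dancing start?",
    retrieve=lambda q: "The show features a dance segment.",
    answer_neutral=lambda q, c: "We dance during the show.",
    add_voice=lambda n: n.replace("We dance", "Now is the time when we dance"),
)
print(reply)   # Now is the time when we dance during the show.
```

Separating the stages this way is what lets you swap step 2 out (e.g., for a system-message voice in GPT-4, as mentioned below) without touching the retrieval.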

You can use GPT-4 and put a generic voice in the system message (I do this). But for very specific voices that GPT hasn’t been trained on (like your own voice, or some unknown semi-celebrity) you need to use the fine-tune approach to create this voice.

Caveat: I haven’t tried the fine-tune yet to make a voice, but it theoretically makes sense. Let me know if it works!

Note: If you are good with the native GPT responses, then you can skip the embeddings. But you still need to add the Voice/Style from the fine-tune.

1 Like

Awesome, thanks @curt.kennedy. This information is super helpful.

1 Like

Hey @curt.kennedy curious to see if you were able to get a working example going? Thanks

Not yet, right now I’m trying to figure out which neutralizer to use. Do I go with GPT-3 (via the edits endpoint), GPT-3.5, or GPT-4? That seems to be the first step, but I’m playing around with the edits endpoint today for the first time.

Interesting, what does the edit endpoint help you achieve for this?

Yeah, right? I wasn’t sure, but gave it a shot and it seems to work. Here is a Playground to share with you that does a decent job at neutralizing. I think the edit endpoint is overlooked.

Playground to form the raw data that I am going to color with the fine-tune:
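For reference, a request body for the legacy edits endpoint (`POST /v1/edits`). The instruction wording is my own example, not the exact Playground prompt:

```python
# Request body for the (legacy) edits endpoint, POST /v1/edits.
# The instruction wording here is an example, not the exact Playground prompt.
payload = {
    "model": "text-davinci-edit-001",
    "input": "Now is the time on Sprockets when we dance.",
    "instruction": ("Rewrite the text in a plain, neutral tone, "
                    "removing any distinctive personality or style."),
}

# With the openai Python package of that era, the call would look like:
#   response = openai.Edit.create(**payload)
#   neutral_text = response["choices"][0]["text"]
print(payload["model"])
```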

oh wow! definitely going to read up on it, thanks for this!

1 Like

Hello, curt.kennedy.
I have the same issue.
Were you able to implement this approach?