OK, I did some googling and GPT-ing, and this looks promising, though I don’t yet know exactly how the TextRank or BERT parts work.
# 1 Preprocess Messages
def preprocess_messages(chat_history) -> list:
    pass  # Extracts role and message from JSON and returns a cleaned list

# 2 Segment Messages by Topics
def segment_by_topics(messages) -> list:
    """Groups messages into topics using NLP-based topic detection."""
    pass  # Uses topic modeling to detect shifts and group messages

# 3 Extract Key Sentences
def extract_key_sentences(topic_segments) -> list:
    """Selects the most important sentences from each topic segment."""
    pass  # Uses TextRank or BERT-based ranking to extract key points

# 4 Detect Emotional Tone
def detect_emotion(topic_summaries) -> list:
    """Analyzes sentiment of messages to track emotional tone for each topic."""
    pass  # Uses sentiment analysis to assign an emotion label

# 5 Generate Final Summary
def generate_summary(final_data) -> str:
    """Creates a coherent, readable summary of the chat."""
    pass  # Uses GPT-4 or T5 to generate a natural summary
Step 1: Preprocess Messages
- Extracts role (user / assistant) and message text from JSON.
- Returns a structured list for easier processing.
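A minimal sketch of how this step could look, using only the standard library. It assumes the input is either a JSON string or an already-parsed dict with a "messages" list, as in the example for this step. Treating "cleaned" as "whitespace collapsed, empty messages and unknown roles dropped" is my assumption, not something fixed by the plan.

```python
import json


def preprocess_messages(chat_history) -> list:
    """Return [{"role": ..., "text": ...}] from a JSON string or a parsed dict."""
    data = json.loads(chat_history) if isinstance(chat_history, str) else chat_history
    cleaned = []
    for entry in data.get("messages", []):
        role = entry.get("role")
        # Collapse runs of whitespace; this is an assumed meaning of "cleaned".
        text = " ".join(str(entry.get("message", "")).split())
        # Assumption: only user/assistant turns with non-empty text are kept.
        if role in ("user", "assistant") and text:
            cleaned.append({"role": role, "text": text})
    return cleaned
```

Accepting both a string and a dict keeps the function usable whether the chat history comes straight from a file or from an API response that was already parsed.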
Input:
{
"messages": [
{"role": "user", "message": "Hey! Guess what? I finally went skiing this weekend!"},
{"role": "assistant", "message": "That sounds amazing! How was it?"},
{"role": "user", "message": "Incredible. The fresh snow, the speed, the wind in my face... I felt so free."},
{"role": "assistant", "message": "That’s the best feeling! Did anything exciting happen?"},
{"role": "user", "message": "Well, I almost crashed into someone, but I managed to turn at the last second!"},
{"role": "assistant", "message": "Wow, that must have been intense!"},
{"role": "user", "message": "Also, I wanted to ask you something. I was thinking about AI learning methods today."},
{"role": "assistant", "message": "Oh? What specifically?"},
{"role": "user", "message": "Like how reinforcement learning works in training AI."},
{"role": "assistant", "message": "That’s a fascinating topic! AI learns by trial and error, improving with feedback."}
]
}
Output:
[
{"role": "user", "text": "Hey! Guess what? I finally went skiing this weekend!"},
{"role": "assistant", "text": "That sounds amazing! How was it?"},
{"role": "user", "text": "Incredible. The fresh snow, the speed, the wind in my face... I felt so free."},
{"role": "assistant", "text": "That’s the best feeling! Did anything exciting happen?"},
{"role": "user", "text": "Well, I almost crashed into someone, but I managed to turn at the last second!"},
{"role": "assistant", "text": "Wow, that must have been intense!"},
{"role": "user", "text": "Also, I wanted to ask you something. I was thinking about AI learning methods today."},
{"role": "assistant", "text": "Oh? What specifically?"},
{"role": "user", "text": "Like how reinforcement learning works in training AI."},
{"role": "assistant", "text": "That’s a fascinating topic! AI learns by trial and error, improving with feedback."}
]
Step 2: Segment Messages by Topics
- Uses BERT-based topic modeling (or another method) to detect topic changes.
- Groups messages into topic-based segments.
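To make the "detect topic changes" idea concrete, here is a rough sketch of the shift-detection logic. It is not BERT-based: it compares each message to the running segment with plain word-count cosine similarity, which is only a stand-in for real sentence embeddings. On short chat messages like the skiing example, word overlap is too sparse to give the grouping shown in the output for this step, so `bag_of_words` and `cosine` are the parts an embedding model would replace. The names `STOPWORDS`, `threshold`, and `min_words`, and the placeholder labels "topic_1", "topic_2", are all illustrative choices of mine.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list, not a complete one.
STOPWORDS = {"the", "a", "an", "i", "you", "it", "is", "was", "to", "and",
             "of", "in", "that", "how", "what", "so", "my", "but", "did"}


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts without stopwords (crude stand-in for an embedding)."""
    return Counter(w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def segment_by_topics(messages, threshold=0.05, min_words=4) -> list:
    """Start a new segment when a message looks unrelated to the current one."""
    segments, current, profile = [], [], Counter()
    for msg in messages:
        vec = bag_of_words(msg["text"])
        # Short replies ("Oh? What specifically?") carry too little signal to judge a shift.
        informative = sum(vec.values()) >= min_words
        if informative and profile and cosine(vec, profile) < threshold:
            segments.append(current)
            current, profile = [], Counter()
        current.append(msg)
        if informative:
            profile.update(vec)
    if current:
        segments.append(current)
    # Placeholder labels; readable names like "Skiing Experience" need a separate labeling step.
    return [{"topic": f"topic_{i}", "messages": seg} for i, seg in enumerate(segments, start=1)]
```

Short filler replies are attached to whatever segment is open instead of being compared, since they would otherwise trigger a spurious split after almost every assistant turn.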
Input:
[
{"role": "user", "text": "Hey! Guess what? I finally went skiing this weekend!"},
{"role": "assistant", "text": "That sounds amazing! How was it?"},
{"role": "user", "text": "Incredible. The fresh snow, the speed, the wind in my face... I felt so free."},
{"role": "assistant", "text": "That’s the best feeling! Did anything exciting happen?"},
{"role": "user", "text": "Well, I almost crashed into someone, but I managed to turn at the last second!"},
{"role": "assistant", "text": "Wow, that must have been intense!"},
{"role": "user", "text": "Also, I wanted to ask you something. I was thinking about AI learning methods today."},
{"role": "assistant", "text": "Oh? What specifically?"},
{"role": "user", "text": "Like how reinforcement learning works in training AI."},
{"role": "assistant", "text": "That’s a fascinating topic! AI learns by trial and error, improving with feedback."}
]
Output:
[
{"topic": "Skiing Experience", "messages": [
{"role": "user", "text": "Hey! Guess what? I finally went skiing this weekend!"},
{"role": "assistant", "text": "That sounds amazing! How was it?"},
{"role": "user", "text": "Incredible. The fresh snow, the speed, the wind in my face... I felt so free."},
{"role": "assistant", "text": "That’s the best feeling! Did anything exciting happen?"},
{"role": "user", "text": "Well, I almost crashed into someone, but I managed to turn at the last second!"},
{"role": "assistant", "text": "Wow, that must have been intense!"}
]},
{"topic": "AI Learning", "messages": [
{"role": "user", "text": "Also, I wanted to ask you something. I was thinking about AI learning methods today."},
{"role": "assistant", "text": "Oh? What specifically?"},
{"role": "user", "text": "Like how reinforcement learning works in training AI."},
{"role": "assistant", "text": "That’s a fascinating topic! AI learns by trial and error, improving with feedback."}
]}
]
Step 3: Extract Key Sentences
- Uses TextRank or BERT-based ranking to select the most important messages.
- Filters out redundant or minor details.
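Since the TextRank part is the least clear to me, here is a small sketch of the idea as I understand it: build a graph where sentences are nodes, weight edges by word overlap, and run a PageRank-style iteration so that sentences similar to many other sentences score highest. The similarity follows the overlap measure from the TextRank paper, except that it counts unique words; `top_k`, `damping`, and `iterations` are illustrative defaults. Note that TextRank is extractive, so it returns original sentences verbatim and does not paraphrase them.

```python
import math
import re


def _words(sentence: str) -> set:
    return set(re.findall(r"[a-z']+", sentence.lower()))


def _similarity(a: set, b: set) -> float:
    """Shared words divided by the sum of log sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def textrank(sentences, top_k=2, damping=0.85, iterations=50) -> list:
    """Return the top_k highest-scoring sentences, kept in their original order."""
    n = len(sentences)
    words = [_words(s) for s in sentences]
    weights = [[_similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence collects score from its neighbors, proportional to edge weight.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j] for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

For this pipeline, `extract_key_sentences` could call something like `textrank` once per topic segment on the message texts. A BERT-based variant would keep the same graph ranking and swap `_similarity` for cosine similarity between sentence embeddings.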
Input:
[
{"topic": "Skiing Experience", "messages": [
{"role": "user", "text": "Hey! Guess what? I finally went skiing this weekend!"},
{"role": "assistant", "text": "That sounds amazing! How was it?"},
{"role": "user", "text": "Incredible. The fresh snow, the speed, the wind in my face... I felt so free."},
{"role": "assistant", "text": "That’s the best feeling! Did anything exciting happen?"},
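{"role": "user", "text": "Well, I almost crashed into someone, but I managed to turn at the last second!"},
{"role": "assistant", "text": "Wow, that must have been intense!"}
]},
{"topic": "AI Learning", "messages": [
{"role": "user", "text": "Also, I wanted to ask you something. I was thinking about AI learning methods today."},
{"role": "assistant", "text": "Oh? What specifically?"},
{"role": "user", "text": "Like how reinforcement learning works in training AI."},
{"role": "assistant", "text": "That’s a fascinating topic! AI learns by trial and error, improving with feedback."}
]}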
]
Output:
[
{"topic": "Skiing Experience", "summary": [
"Went skiing this weekend for the first time!",
"The fresh snow, the speed, and the wind made me feel free.",
"I almost crashed into someone but managed to turn at the last second!"
]},
{"topic": "AI Learning", "summary": [
"Discussed reinforcement learning in AI.",
"AI improves through trial and error with feedback."
]}
]
Step 4: Detect Emotional Tone
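- Uses sentiment analysis (e.g., a BERT-based classifier or VADER) to detect emotion/tone. Note that VADER only scores polarity (positive, negative, neutral), so labels like "excitement" or "curiosity" need an emotion classifier.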
- Assigns a dominant emotion to each topic.
- Other keys like user_intent and assistant_intent can also be added here if needed.
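As a placeholder until a real model is chosen, the labeling step can be sketched with a keyword lookup. The lexicon below is a toy I made up and tuned to the skiing/AI example, so it shows the data flow (summary in, "emotion" key added) and nothing about real accuracy. `EMOTION_LEXICON` and the `default` label are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, invented for illustration only.
# A trained emotion classifier would replace this lookup.
EMOTION_LEXICON = {
    "excitement": {"amazing", "incredible", "free", "crashed", "intense", "finally"},
    "curiosity": {"ask", "wondering", "how", "why", "learning", "discussed"},
    "frustration": {"annoying", "stuck", "broken", "again"},
}


def detect_emotion(topic_summaries, default="neutral") -> list:
    """Add an "emotion" key to each topic based on cue-word counts."""
    labelled = []
    for item in topic_summaries:
        words = re.findall(r"[a-z']+", " ".join(item["summary"]).lower())
        hits = Counter()
        for emotion, cues in EMOTION_LEXICON.items():
            hits[emotion] = sum(1 for w in words if w in cues)
        emotion, count = hits.most_common(1)[0]
        # Fall back to the default label when no cue word matched at all.
        labelled.append({**item, "emotion": emotion if count > 0 else default})
    return labelled
```

The function copies every existing key and only adds "emotion", so extra keys such as user_intent could be attached in the same pass.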
Input:
[
{"topic": "Skiing Experience", "summary": [
"Went skiing this weekend for the first time!",
"The fresh snow, the speed, and the wind made me feel free.",
"I almost crashed into someone but managed to turn at the last second!"
]},
{"topic": "AI Learning", "summary": [
"Discussed reinforcement learning in AI.",
"AI improves through trial and error with feedback."
]}
]
Output:
[
{"topic": "Skiing Experience", "summary": [
"Went skiing this weekend for the first time!",
"The fresh snow, the speed, and the wind made me feel free.",
"I almost crashed into someone but managed to turn at the last second!"
], "emotion": "excitement"},
{"topic": "AI Learning", "summary": [
"Discussed reinforcement learning in AI.",
"AI improves through trial and error with feedback."
], "emotion": "curiosity"}
]
Step 5: Generate Final Summary
- Uses GPT-4, T5, or another model to rewrite extracted points into a natural summary.
- Ensures clarity, fluency, and emotional accuracy.
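The model call itself is left out here, since it depends on which API or checkpoint ends up being used. What can be sketched safely is the deterministic fallback: formatting each topic as "Topic: sentences (Emotion)", which is the layout of the example output for this step. Treat it as a baseline that a GPT-4 or T5 rewrite would improve on for longer chats.

```python
def generate_summary(final_data) -> str:
    """Format each topic as 'Topic: sentence sentence ... (Emotion)', one topic per line."""
    lines = []
    for item in final_data:
        body = " ".join(item["summary"])
        # Capitalize the emotion label for display, e.g. "excitement" -> "Excitement".
        lines.append(f'{item["topic"]}: {body} ({item["emotion"].capitalize()})')
    return "\n".join(lines)
```

Keeping a template version around is also handy for testing the earlier steps without paying for model calls.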
Input:
[
{"topic": "Skiing Experience", "summary": [
"Went skiing this weekend for the first time!",
"The fresh snow, the speed, and the wind made me feel free.",
"I almost crashed into someone but managed to turn at the last second!"
], "emotion": "excitement"},
{"topic": "AI Learning", "summary": [
"Discussed reinforcement learning in AI.",
"AI improves through trial and error with feedback."
], "emotion": "curiosity"}
]
Output:
Skiing Experience: Went skiing this weekend for the first time! The fresh snow, the speed, and the wind made me feel free. I almost crashed into someone but managed to turn at the last second! (Excitement)
AI Learning: Discussed reinforcement learning in AI. AI improves through trial and error with feedback. (Curiosity)
Note for step 5:
- In this example, the output is not very different from the input because the extracted summaries are already short and well-structured.
- For longer conversations, this step would:
1. Merge or rewrite overlapping ideas for conciseness.
2. Ensure the summary flows naturally, avoiding repetition.
3. Generate a more narrative or fluid summary if needed.
- The emotional tone is kept in the final summary to maintain context.
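For the "merge overlapping ideas" point, one cheap option before any model-based rewriting is to drop sentences that mostly repeat an earlier one. The sketch below uses Jaccard overlap on word sets; the function name `drop_near_duplicates` and the 0.6 cutoff are my own illustrative choices. It only removes repeats and does not merge or rewrite anything.

```python
import re


def _tokens(sentence: str) -> set:
    return set(re.findall(r"[a-z']+", sentence.lower()))


def drop_near_duplicates(sentences, max_overlap=0.6) -> list:
    """Keep a sentence only if it does not heavily overlap one that was already kept."""
    kept = []
    for sentence in sentences:
        tokens = _tokens(sentence)
        is_duplicate = any(
            len(tokens & _tokens(k)) / len(tokens | _tokens(k)) > max_overlap
            for k in kept
            if tokens | _tokens(k)  # skip pairs with no words at all
        )
        if not is_duplicate:
            kept.append(sentence)
    return kept
```

Because the first occurrence always wins, a shorter early sentence can shadow a more detailed later one; choosing which of two overlapping sentences to keep is exactly where a generative rewrite would do better.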