Fine-tuning chatgpt3.5, how to train long dialogue examples, need to improve the quality of multi-sentence responses

    • The official training set:
      {“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “What’s the capital of France?”}, {“role”: “assistant”, “content”: “Paris, as if everyone doesn’t know that already.”}]}
      {“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “Who wrote ‘Romeo and Juliet’?”}, {“role”: “assistant”, “content”: “Oh, just some guy named William Shakespeare. Ever heard of him?”}]}
      {“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “How far is the Moon from Earth?”}, {“role”: “assistant”, “content”: “Around 384,400 kilometers. Give or take a few, like that really matters.”}]}
  • How would you organize a long question and answer?

for example:
Me: What phone do you use?
User: Apple mobile phone, are you also using this?
Me: Yes, I think it’s very useful.

  • How do you write a training set if you have multiple sentences?

One does not need to constrain the questions users ask or the responses of the AI to single sentences or even single lines or paragraphs in the fine-tune examples. Complete verbose replies can be written out.

Use the \n character for line breaks and use other escaping of special characters within the strings.

Here is a long question a user might input:

User prompt, asking for coding

Code Snippet with two functions:

  def updateLabelText(self):
        font = self.label.font()
        available_width = self.width() - self.label.contentsMargins().left() - self.label.contentsMargins().right()
        self.short_message = (self.original_text[0:124].replace("\n", "¶")[:121].rstrip() + "...") \
            if len(self.original_text) > 125 else self.original_text.replace("\n", "¶")  # shorten
        if not self.parent() or self.size() == QSize(0, 0) or not self.isVisible() or not self.layout():
            return ""  # Not ready, return default
        truncated_text = self.short_message[:self.calculateTruncationPoint(font, available_width)]
        if not self.short_message == truncated_text:
            truncated_text += ".."
        self.label.setText(truncated_text)

    def calculateTruncationPoint(self, font, available_width):
        elide_marker = ".."
        font_metrics = QFontMetrics(font)
        if len(self.short_message) < 20:
            return len(self.short_message)
        text_width = font_metrics.horizontalAdvance(self.short_message)
        marker_width = font_metrics.horizontalAdvance(elide_marker)
        for i in range(18, 120, 2):
            truncated_text = self.short_message[:i] + elide_marker
            truncated_width = font_metrics.horizontalAdvance(truncated_text)
            if truncated_width + marker_width >= available_width or i > 110:
                return i - 2
        return 0

The above is methods of Python 3.8 and PyQt5 subclassed QWidget code to determine the amount of text that will fit within a widget’s QLabel horizontally. However it is quite slow at processing. You will implement a binary search algorithm for the truncation point search, and will also uses a self.last_length int variable to store in the subclass object to enhance the speed of later label text updates after resize events.

And if the AI couldn’t perform the task before, or if you want it to do it without the extensive instructions it required before, you train it how to answer with the assistant response it should give:

Example AI response

Sure, I can help you implement the binary search algorithm for the truncation point search and incorporate the self.last_length variable. Here’s the updated version of your code with these modifications:

class YourSubclass(QWidget):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.label = QLabel(self)
        self.short_message = ""
        self.original_text = ""
        self.last_length = 0

    def updateLabelText(self):
        font = self.label.font()
        available_width = self.width() - self.label.contentsMargins().left() - self.label.contentsMargins().right()
        self.short_message = (self.original_text[0:124].replace("\n", "¶")[:121].rstrip() + "...") \
            if len(self.original_text) > 125 else self.original_text.replace("\n", "¶")  # shorten
        
        if not self.parent() or self.size() == QSize(0, 0) or not self.isVisible() or not self.layout():
            return ""  # Not ready, return default
        
        if self.width() != self.last_length:
            self.calculateTruncationPoint(font, available_width)
            self.last_length = self.width()
        
        truncated_text = self.short_message[:self.last_truncation_point]
        if not self.short_message == truncated_text:
            truncated_text += ".."
        self.label.setText(truncated_text)

    def calculateTruncationPoint(self, font, available_width):
        elide_marker = ".."
        font_metrics = QFontMetrics(font)
        if len(self.short_message) < 20:
            self.last_truncation_point = len(self.short_message)
            return
        
        text_width = font_metrics.horizontalAdvance(self.short_message)
        marker_width = font_metrics.horizontalAdvance(elide_marker)
        
        left = 18
        right = min(120, len(self.short_message))
        last_fit = -1
        
        while left <= right:
            mid = (left + right) // 2
            truncated_text = self.short_message[:mid] + elide_marker
            truncated_width = font_metrics.horizontalAdvance(truncated_text)
            
            if truncated_width + marker_width <= available_width:
                last_fit = mid
                left = mid + 1
            else:
                right = mid - 1
        
        if last_fit == -1:
            self.last_truncation_point = 0
        else:
            self.last_truncation_point = last_fit - 2

In this implementation, the calculateTruncationPoint method uses binary search to find the optimal truncation point that fits within the available width. The self.last_length variable is used to keep track of the widget’s width and determine if a recalculation of the truncation point is necessary after resize events. The self.last_truncation_point stores the last calculated truncation point, enhancing the speed of later label text updates.

Please integrate this code into your project, and ensure that you have PyQt5 properly set up in your environment.

And here is the single line of training example with system, user, assistant roles. (long enough to probably blow up the forum)

{“messages”: [{“role”: “system”, “content”: “You are DeveloperBot, a programming assistant.”}, {“role”: “user”, “content”: “text = 'Code Snippet with two functions:\\n\\n  def updateLabelText(self):\\n        font = self.label.font()\\n        available_width = self.width() - self.label.contentsMargins().left() - self.label.contentsMargins().right()\\n        self.short_message = (self.original_text[0:124].replace("\\n", "¶")[:121].rstrip() + "..."\\n            if len(self.original_text) > 125 else self.original_text.replace("\\n", "¶"))  # shorten\\n        if not self.parent() or self.size() == QSize(0, 0) or not self.isVisible() or not self.layout():\\n            return ""  # Not ready, return default\\n        truncated_text = self.short_message[:self.calculateTruncationPoint(font, available_width)]\\n        if not self.short_message == truncated_text:\\n            truncated_text += ".."\\n        self.label.setText(truncated_text)\\n\\n    def calculateTruncationPoint(self, font, available_width):\\n        elide_marker = ".."\\n        font_metrics = QFontMetrics(font)\\n        if len(self.short_message) < 20:\\n            return len(self.short_message)\\n        text_width = font_metrics.horizontalAdvance(self.short_message)\\n        marker_width = font_metrics.horizontalAdvance(elide_marker)\\n        for i in range(18, 120, 2):\\n            truncated_text = self.short_message[:i] + elide_marker\\n            truncated_width = font_metrics.horizontalAdvance(truncated_text)\\n            if truncated_width + marker_width >= available_width or i > 110:\\n                return i - 2\\n        return 0'”}, {“role”: “assistant”, “content”: “text = "Sure, I can help you implement the binary search algorithm for the truncation point search and incorporate the `self.last_length` variable. Here's the updated version of your code with these modifications:\\n\\n```python\\nclass YourSubclass(QWidget):\\n    def __init__(self, parent=None):\\n        super().__init__(parent)\\n        self.label = QLabel(self)\\n        self.short_message = ''\\n        self.original_text = ''\\n        self.last_length = 0\\n\\n    def updateLabelText(self):\\n        font = self.label.font()\\n        available_width = self.width() - self.label.contentsMargins().left() - self.label.contentsMargins().right()\\n        self.short_message = (self.original_text[0:124].replace(\"\\n\", '¶')[:121].rstrip() + '...') \\\n            if len(self.original_text) > 125 else self.original_text.replace(\"\\n\", '¶')  # shorten\\n        \\n        if not self.parent() or self.size() == QSize(0, 0) or not self.isVisible() or not self.layout():\\n            return ''  # Not ready, return default\\n        \\n        if self.width() != self.last_length:\\n            self.calculateTruncationPoint(font, available_width)\\n            self.last_length = self.width()\\n        \\n        truncated_text = self.short_message[:self.last_truncation_point]\\n        if not self.short_message == truncated_text:\\n            truncated_text += '..'\\n        self.label.setText(truncated_text)\\n\\n    def calculateTruncationPoint(self, font, available_width):\\n        elide_marker = '..'\\n        font_metrics = QFontMetrics(font)\\n        if len(self.short_message) < 20:\\n            self.last_truncation_point = len(self.short_message)\\n            return\\n        \\n        text_width = font_metrics.horizontalAdvance(self.short_message)\\n        marker_width = font_metrics.horizontalAdvance(elide_marker)\\n        \\n        left = 18\\n        right = min(120, len(self.short_message))\\n        last_fit = -1\\n        \\n        while left <= right:\\n            mid = (left + right) // 2\\n            truncated_text = self.short_message[:mid] + elide_marker\\n            truncated_width = font_metrics.horizontalAdvance(truncated_text)\\n            \\n            if truncated_width + marker_width <= available_width:\\n                last_fit = mid\\n                left = mid + 1\\n            else:\\n                right = mid - 1\\n        \\n        if last_fit == -1:\\n            self.last_truncation_point = 0\\n        else:\\n            self.last_truncation_point = last_fit - 2\\n```\\n\\nIn this implementation, the `calculateTruncationPoint` method uses binary search to find the optimal truncation point that fits within the available width. The `self.last_length` variable is used to keep track of the widget's width and determine if a recalculation of the truncation point is necessary after resize events. The `self.last_truncation_point` stores the last calculated truncation point, enhancing the speed of later label text updates.\\n\\nPlease integrate this code into your project, and ensure that you have PyQt5 properly set up in your environment."}]}