Model outputting incorrect statistical data

I recently published “MLB Stats” to the Plugin Store; it retrieves up-to-date baseball statistics. For some of my queries, the model outputs numbers that differ from what the plugin returns. I’m curious whether anyone else has faced a similar problem and how you fixed it. All data returned by the plugin is JSON, with clear descriptions in openapi.yaml.

Thanks. Just updated the title. I just need it to output and interpret the data verbatim from what the plugin gives.

Player’s batting average: .248

Sometimes ChatGPT outputs: .256

Not sure why.
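One way to catch this kind of mismatch while testing is to compare the numbers in the model's reply against the value the plugin returned. The following is only a minimal sketch: the helper names `extract_rates` and `reply_matches` are hypothetical, and it assumes rate stats are printed with three decimals (such as .248 or 0.248).

```python
import re

# Hypothetical helper: find three-decimal rates such as ".248" or "0.248".
# The lookarounds keep it from matching inside longer numbers like "10.2485".
RATE_PATTERN = re.compile(r"(?<!\d)0?\.\d{3}(?!\d)")


def extract_rates(text: str) -> list[float]:
    """Return every three-decimal rate found in the text, in order."""
    return [float(match) for match in RATE_PATTERN.findall(text)]


def reply_matches(reply: str, expected: float) -> bool:
    """True if the reply quotes the expected plugin value somewhere."""
    return any(abs(value - expected) < 1e-9 for value in extract_rates(reply))


plugin_value = 0.248
good_reply = "The player's batting average is .248 this season."
bad_reply = "The player's batting average is .256 this season."

print(reply_matches(good_reply, plugin_value))
print(reply_matches(bad_reply, plugin_value))
```

A check like this does not explain why the model substitutes a different number, but it makes the failures easy to count across many test prompts.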

Can you provide an example of a full response from the plugin and your prompt? It’s hard to determine where ChatGPT might be pulling/generating that information, or why, without that information. The most common cause is that information exists elsewhere and it happens to hold a heavier weight in the context for whatever reason.

You bring up an interesting point about “weights” based on other information out there. I see a lot of output errors when trying to get top prospect information, because the model falls back on the top prospects from two years ago, before its training cutoff.

Example prompt: Get the combined team batting performance for the Atlanta Braves

Plugin output:

[
  {
    "teamIDfg": 16,
    "Season": 2023,
    "Team": "ATL",
    "Age": 29,
    "G": 1250,
    "AB": 3054,
    "PA": 3401,
    "H": 827,
    "1B": 496,
    "2B": 154,
    "3B": 8,
    "HR": 169,
    "R": 499,
    "RBI": 480,
    "BB": 294,
    "IBB": 10,
    "SO": 728,
    "HBP": 30,
    "SF": 21,
    "SH": 0,
    "GDP": 76,
    "SB": 70,
    "CS": 12,
    "AVG": 0.271,
    "GB": 1033,
    "FB": 871,
    "LD": 439,
    "IFFB": 63,
    "Pitches": 13384,
    "Balls": 4831,
    "Strikes": 8553,
    "IFH": 66,
    "BU": 4,
    "BUH": 3,
    "BB%": 0.086,
    "K%": 0.214,
    "BB/K": 0.4,
    "OBP": 0.339,
    "SLG": 0.492,
    "OPS": 0.831,
    "ISO": 0.222,
    "BABIP": 0.302,
    "GB/FB": 1.19,
    "LD%": 0.187,
    "GB%": 0.441,
    "FB%": 0.372,
    "IFFB%": 0.072,
    "HR/FB": 0.194,

( etc.)

Sometimes the model outputs the batting average as .275
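As a sanity check on which number is right, the rate stats can be recomputed from the counting stats in the plugin output above. This sketch uses the standard formulas (AVG = H / AB, SLG = total bases / AB, OBP = (H + BB + HBP) / (AB + BB + HBP + SF)); the `rate` helper and the variable names are just for illustration.

```python
# Counting stats copied from the plugin output for ATL, 2023.
team = {
    "AB": 3054, "H": 827, "1B": 496, "2B": 154, "3B": 8, "HR": 169,
    "BB": 294, "HBP": 30, "SF": 21,
}


def rate(numerator: int, denominator: int) -> float:
    """Divide and round to three decimals, the way rate stats are reported."""
    return round(numerator / denominator, 3)


# Batting average: hits per at-bat.
avg = rate(team["H"], team["AB"])

# Slugging: total bases per at-bat.
total_bases = team["1B"] + 2 * team["2B"] + 3 * team["3B"] + 4 * team["HR"]
slg = rate(total_bases, team["AB"])

# On-base percentage: times on base per qualifying plate appearance.
obp = rate(
    team["H"] + team["BB"] + team["HBP"],
    team["AB"] + team["BB"] + team["HBP"] + team["SF"],
)

print(avg, slg, obp)  # 0.271 0.492 0.339
```

All three values agree with the AVG, SLG, and OBP fields in the plugin output, so the .275 figure is not derivable from the data the model was given.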

I have noticed improvements from using the plugin more.

Am I correct in assuming the batting average is 0.271? It’s likely that the acronyms aren’t helping you out much either, unless you’re providing context for what those acronyms mean.

Yes you’re correct. I’m trying to provide good descriptions to support the data too.

getTeamBattingCombinedResponse:
      type: object
      properties:
        Team:
          type: string
          description: >
            The team's 3 letter abbreviation. All subsequent properties within the object 
            are the team's combined batting statistics. 
        teamIDfg:
          type: integer
        Season:
          type: integer
        Age:
          type: integer
          description: The average age of the team
        G: 
          type: integer
        AB: 
          type: integer
          description: The combined total number of at-bats for the team
        PA:
          type: integer
          description: The combined total number of plate appearances for the team
        H:
          type: integer
          description: The combined total number of hits for the team
        1B:
          type: integer
          description: The combined number of singles for the team
        2B:
          type: integer
          description: The combined number of doubles for the team
        3B:
          type: integer
          description: The combined number of triples for the team
        HR:
          type: integer
          description: The combined number of home runs for the team
        R: 
          type: integer
          description: The combined number of runs for the team
        RBI:
          type: integer
          description: The combined number of runs batted in for the team
        BB:
          type: integer
          description: The combined number of base on balls - also known as walks - for the team
        IBB:
          type: integer
          description: The combined number of intentional walks for the team
        SO:
          type: integer
          description: The combined number of strikeouts for the team
        HBP:
          type: integer
          description: The combined number of times players have been hit by a pitch
        SLG:
          type: number
          description: The team's slugging percentage
        OPS:
          type: number
          description: The team's On-base plus slugging percentage
        AVG:
          type: number
          description: The team's combined batting average
        GB: 
          type: number
          description: The combined total number of ground balls
        FB:
          type: number
          description: The combined total number of fly balls
        LD: 
          type: number
          description: The combined total number of line drives

I would, perhaps, consider re-structuring the JSON you send back.

The first thing I would do would be to replace the abbreviations and acronyms with meaningful descriptors to make it easier for ChatGPT to parse out what it needs to get.

The second thing I would do would be to nest things where appropriate.

So, your returned JSON might look something like:

[
  {
    "teamIDfg": 16,
    "season": 2023,
    "team": "ATL",
    "age": 29,
    "games": 1250,
    "runs": 499,
    "runs_batted_in": 480,
    "plate_appearances": {
      "total": 3401,
      "at_bats": {
        "total": 3054,
        "hits": {
          "total": 827,
          "singles": 496,
          "doubles": 154,
          "triples": 8,
          "home_runs": 169
        },
        "strike_outs": 728
      },
      "bases_on_balls": {
        "total": 294,
        "intentional_bases_on_balls": 10
      },
      "hit_by_pitch": 30
    }
  }
]

I think that is something ChatGPT will be much more able to work with.
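If renaming every field by hand is too much work, the rename can be applied as a last step before the response is returned. This is only a sketch: the `DESCRIPTIVE_NAMES` mapping and the `describe` function are hypothetical names, and the mapping covers just a few of the fields.

```python
# Hypothetical mapping from baseball abbreviations to descriptive keys.
# Extend it to cover whichever fields the plugin actually returns.
DESCRIPTIVE_NAMES = {
    "Team": "team",
    "Season": "season",
    "G": "games",
    "AB": "at_bats",
    "PA": "plate_appearances",
    "H": "hits",
    "HR": "home_runs",
    "RBI": "runs_batted_in",
    "BB": "bases_on_balls",
    "SO": "strike_outs",
    "AVG": "batting_average",
}


def describe(record: dict) -> dict:
    """Return a copy of the record with known abbreviations spelled out.

    Keys without an entry in the mapping are passed through unchanged.
    """
    return {DESCRIPTIVE_NAMES.get(key, key): value for key, value in record.items()}


raw = {"Team": "ATL", "Season": 2023, "HR": 169, "AVG": 0.271, "teamIDfg": 16}
print(describe(raw))
```

Because unknown keys pass through untouched, the mapping can be grown gradually without breaking existing responses.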

Thank you! This seems like a good approach. I will try this.

Please update with results when you have them.

I ended up trying something I had used on another of my functions that was also getting inaccurate output from GPT. I liked your method, @elmstedt, but I would’ve had to restructure basically everything, since all my functions use the common baseball acronyms. What worked for me was an additional layer of filtering and even sorting.

My openapi.yaml changes:

/team_batting_combined:
        get:
            operationId: getTeamBattingCombined
            summary: >
              Retrieves the combined batting statistics for all teams across the MLB from Fangraphs for the specified season.
              This function should be used whenever a prompt is asking for combined statistics. 
            parameters:
              - in: query
                name: year
                required: true
                schema: 
                  type: integer
                description: The year from which the batting statistics should be retrieved
              - in: query
                name: team_abbreviation
                required: false
                schema:
                  type: string
                description: >
                  The 3 letter abbreviated name of the baseball team. This parameter should be passed 
                  to get the combined statistics for a specific team. 
                example: NYY
              - in: query 
                name: batting_stat
                required: false
                schema: 
                  type: string
                  enum: ['H', '2B', '3B', 'HR', 'RBI', 'BB', 'IBB', 'SO', 'HBP', 'SH', 'SF', 'GDP', 'SB', 'CS', 'AVG', 
                      'OBP', 'SLG', 'OPS']
                description: Can be used to filter based on a certain batting statistic.

I’ve added ‘batting_stat’ as an additional optional query parameter that acts as a filter, returning only that specific stat when needed. Through testing on my other functions, I’ve also found that ChatGPT, as you mentioned, can get confused by the many acronyms and the amount of data thrown at it.

(I made some modifications in my code which I can send as well if interested)
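For readers who want to try the same idea, the server-side step might look roughly like the sketch below. It is not the plugin's actual code: `filter_and_sort` and `ALLOWED_STATS` are hypothetical names, the allowed values mirror the `batting_stat` enum in the openapi.yaml above, and sorting in descending order is an assumption.

```python
from typing import Optional

# Mirrors the batting_stat enum in the openapi.yaml above.
ALLOWED_STATS = {
    "H", "2B", "3B", "HR", "RBI", "BB", "IBB", "SO", "HBP",
    "SH", "SF", "GDP", "SB", "CS", "AVG", "OBP", "SLG", "OPS",
}


def filter_and_sort(
    rows: list[dict],
    batting_stat: Optional[str] = None,
    team_abbreviation: Optional[str] = None,
) -> list[dict]:
    """Trim each team row to one stat and sort from highest to lowest."""
    if team_abbreviation is not None:
        rows = [row for row in rows if row["Team"] == team_abbreviation]
    if batting_stat is None:
        return rows  # No filter requested: return the full rows unchanged.
    if batting_stat not in ALLOWED_STATS:
        raise ValueError(f"Unsupported batting_stat: {batting_stat!r}")
    slim = [{"Team": row["Team"], batting_stat: row[batting_stat]} for row in rows]
    return sorted(slim, key=lambda row: row[batting_stat], reverse=True)


# A few rows with values taken from the thread; the real rows have more fields.
rows = [
    {"Team": "MIA", "AVG": 0.265, "HR": 0},
    {"Team": "TEX", "AVG": 0.274, "HR": 0},
    {"Team": "ATL", "AVG": 0.271, "HR": 169},
]
print(filter_and_sort(rows, batting_stat="AVG"))
```

Returning only the team and the requested stat keeps the response small, so the model has far fewer similar-looking numbers to confuse.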

Inaccurate example output before the changes (data isn’t current with 2023 season or is just made up…not sure):

After the changes (all data is accurate):

This is what the JSON data now being retrieved from the plugin looks like:

[
  {
    "Team": "TEX",
    "AVG": 0.274
  },
  {
    "Team": "ATL",
    "AVG": 0.271
  },
  {
    "Team": "MIA",
    "AVG": 0.265
  },
  {
    "Team": "BOS",
    "AVG": 0.264
  },
  {
    "Team": "WSN",
    "AVG": 0.261
  },
  {
    "Team": "TOR",
    "AVG": 0.259
  },
  {
    "Team": "TBR",
    "AVG": 0.259
  },
  {
    "Team": "ARI",
    "AVG": 0.258
  },
  {
    "Team": "PHI",
    "AVG": 0.258
  },
  {
    "Team": "CIN",
    "AVG": 0.257
  },
  {
    "Team": "LAA",
    "AVG": 0.256
  },
  {
    "Team": "COL",
    "AVG": 0.255
  }

Sorting seems to help the model understand the data better too.
