Categorize Geographical Locations Under Common Types

I’m making an app where a user can enter different types of locations and specify what kind of location it is (mountain, forest, city, farm, etc). Also, the user can use GPT to have it generate locations for them, which will return any random kind of location. There’s a page in the app where the user can filter on the different locations, and right now I’m just listing every different kind as its own filter.

If the user or GPT creates a bunch of locations with different types, there will be a ton of filters. Moreover, GPT sometimes creates similar locations with different types (mountain vs mountain range). I want to prompt GPT to take all the current location types and group them together under similar categories. If I do this through ChatGPT, I get pretty good results (see below example). However, if I do it through my app with the API, I get very varied and unreliable results. Both are using GPT-4o, so I’m not sure what I need to do to make it work. See below for an examples of what the API is returning.

Prompt

Given the following list of specific locations, return a mapping of common locations to them. For instance, if the list contains both "mountain" and "mountain range", put those both under "mountain". However, don't make the mapping too generalized. For instance, don't put "ocean" and "river" under a single "water" category. Return just mapping as a JSON array in the following format: 

{ 
    "broad_type": [ "specific_type", ... ],
     ...
}

List of specific locations: Canyon, Castle, Cave, Cave System, City, Desert, Farm, Forest, Island, Lake, Mountain, Ocean, River, Swamp, Temple, Tower, Town, Valley, Village, Volcano

ChatGPT Response

{
    "canyon": ["Canyon"],
    "castle": ["Castle"],
    "cave": ["Cave", "Cave System"],
    "city": ["City"],
    "desert": ["Desert"],
    "farm": ["Farm"],
    "forest": ["Forest"],
    "island": ["Island"],
    "lake": ["Lake"],
    "mountain": ["Mountain"],
    "ocean": ["Ocean"],
    "river": ["River"],
    "swamp": ["Swamp"],
    "temple": ["Temple"],
    "tower": ["Tower"],
    "town": ["Town"],
    "valley": ["Valley"],
    "village": ["Village"],
    "volcano": ["Volcano"]
}

API Response 1 (no good)

{
  "natural feature": [
    "Canyon",
    "Cave",
    "Cave System",
    "Desert",
    "Forest",
    "Island",
    "Lake",
    "Mountain",
    "Ocean",
    "River",
    "Swamp",
    "Valley",
    "Volcano"
  ],
  "man-made structure": [
    "Castle",
    "Farm",
    "Temple",
    "Tower"
  ],
  "settlement": [
    "City",
    "Town",
    "Village"
  ]
}

Response 2 (acceptable)

{
  "canyon": ["Canyon"],
  "castle": ["Castle"],
  "cave": ["Cave", "Cave System"],
  "city": ["City"],
  "desert": ["Desert"],
  "farm": ["Farm"],
  "forest": ["Forest"],
  "island": ["Island"],
  "lake": ["Lake"],
  "mountain": ["Mountain"],
  "ocean": ["Ocean"],
  "river": ["River"],
  "swamp": ["Swamp"],
  "temple": ["Temple"],
  "tower": ["Tower"],
  "town": ["Town"],
  "valley": ["Valley"],
  "village": ["Village"],
  "volcano": ["Volcano"]
}

How can I modify the prompt to give me consistently reliable results like the second response?