API / Calling Function - Same output as input

Hello everyone,

I’m currently working on a project where I use the OpenAI API to call specific functions. However, I’m running into a problem: Even though my Python code doesn’t show any errors, I always get back the same values from the API that I gave as input. I’m not sure what I’m doing wrong and would appreciate any help or insight.


This is my Python code; it is really quite simple:

import openai
import json

openai.api_key = "xx"


job_title_A = "Praktikum Finance - Corporate Controlling (m/w/d)"
job_title_B = "(Senior) Medical Director, TB & Infectious Diseases (Berlin)"


job_title_function = [
    {
        'name': 'extract_information_from_job_title',
        'description': 'You are an expert at analyzing job titles. Your job is to filter out the pure job title. Make sure that no information such as location, gender, type of employment, seniority level or other additional information is included when returning the job title.',
        'parameters': {
            'type': 'object',
            'properties': {
                'job_title': {
                    'type': 'string',
                    'description': 'You are an expert at analyzing job titles. Your job is to filter out the pure job title. Make sure that no information such as location, gender, type of employment, seniority level or other additional information is included when returning the job title.'
                }                
            }
        }
    }
]


job_titles = [job_title_A,job_title_B]
for sample in job_titles:
    response = openai.ChatCompletion.create(
        model = 'gpt-3.5-turbo',
        messages = [{'role': 'user', 'content': sample}],
        functions = job_title_function,
        function_call = 'auto'
    )

    # Parse the function-call arguments (a JSON string) into a dict
    json_response = json.loads(response['choices'][0]['message']['function_call']['arguments'])
    print(json_response)

json_response - output:

{'job_title': 'Praktikum Finance - Corporate Controlling (m/w/d)'}
{'job_title': '(Senior) Medical Director, TB & Infectious Diseases (Berlin)'}

As you can see, the output values are the same as the input values. I’ve tried a lot and just can’t find the error.


Here is the entire response:

print(response)

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700082538,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "extract_information_from_job_title",
          "arguments": "{\n  \"job_title\": \"Praktikum Finance - Corporate Controlling (m/w/d)\"\n}"
        }
      },
      "finish_reason": "function_call"
    }
  ],
  "usage": {
    "prompt_tokens": 158,
    "completion_tokens": 31,
    "total_tokens": 189
  }
}
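Editorial sketch: in the response above, "arguments" is a JSON-encoded string, and with function_call = 'auto' the model is free to answer in plain text, in which case the "function_call" key is missing. A minimal, defensive way to read it could look like the following. The helper name extract_function_arguments is hypothetical, and sample_response is a trimmed copy of the response shown above, not a new API result.

```python
import json
from typing import Optional


def extract_function_arguments(response: dict) -> Optional[dict]:
    """Return the parsed function-call arguments, or None if there is no function call."""
    message = response["choices"][0]["message"]
    function_call = message.get("function_call")
    if function_call is None:
        # With function_call='auto' the model may reply with plain text instead.
        return None
    # "arguments" is a JSON-encoded string, not a dict, so it has to be parsed.
    return json.loads(function_call["arguments"])


# Trimmed copy of the response printed above (hand-copied, no API call is made).
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "function_call": {
                    "name": "extract_information_from_job_title",
                    "arguments": '{\n  "job_title": "Praktikum Finance - Corporate Controlling (m/w/d)"\n}',
                },
            },
            "finish_reason": "function_call",
        }
    ]
}

print(extract_function_arguments(sample_response))
```

This only makes the parsing step robust; it does not change what the model puts into job_title.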

Thanks for any help!


UPDATE:
Goal:
I would like to clean up job titles so that only the pure title / description is returned.

Here are examples:
Job title unadjusted:

job_title_1 = "(Senior) HR Partner / (Senior) Personalreferent (m/w/d)"
job_title_2 = "Business Development Manager, OEM (San Francisco Bay Area, Pacific Northwest)"
job_title_3 = "3) Field Service Specialist (Maryland)"
job_title_4 = "Director, Head of R&D Gdansk (Enzymes)"
job_title_5 = "Praktikum Finance - Corporate Controlling (m/w/d)"
job_title_6 = "Senior Medical Director, TB & Infectious Diseases"

Job title cleaned up:

job_title_1 = "HR Partner / Personalreferent"
job_title_2 = "Business Development Manager, OEM"
job_title_3 = "Field Service Specialist"
job_title_4 = "Head of R&D Gdansk (Enzymes)"
job_title_5 = "Finance - Corporate Controlling"
job_title_6 = "Medical Director, TB & Infectious Diseases"

This means that I want to filter out all information that is not directly related to the job description. This could be, for example, the following:

  • Location [berlin, munich…]
  • Seniority [senior, mid…]
  • Gender [(m/f/d)…]
  • Employment type [internship, full-time, part-time…]
  • Special characters / numbers [3,–, #…]
  • and so forth…

If I do this in the chat on the website, it works to some extent. Terms like “internship” and “senior” are still included, but other parts are cleaned up correctly.
(It is not a great prompt either, but it works far better than the API call in my script.)

I may be off track here, but even as a human it is difficult to tell exactly what you want stripped out of those specific examples (Berlin is the obvious exception), so it will be a challenge for the AI.

Two suggestions:

  1. Run your prompt in the Playground and see what results you get. If they are good, copy the code from the Playground.

  2. Use few-shot learning in your prompt: give it a few examples of what you want so it can follow the pattern.
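Editorial sketch of the few-shot suggestion: the example pairs are placed in the message list as alternating user / assistant turns, followed by the new title. The names SYSTEM_PROMPT, FEW_SHOT_EXAMPLES and build_few_shot_messages are hypothetical; the example pairs are taken from the raw and cleaned titles listed in this thread, and the system prompt is a shortened form of the poster's own description.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: a shortened version of the instruction used in this thread.
SYSTEM_PROMPT = (
    "You are an expert at analyzing job titles. Return only the pure job title, "
    "without location, gender, type of employment or seniority level."
)

# (raw title, cleaned title) pairs taken from the examples in this thread.
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("(Senior) HR Partner / (Senior) Personalreferent (m/w/d)", "HR Partner / Personalreferent"),
    ("3) Field Service Specialist (Maryland)", "Field Service Specialist"),
]


def build_few_shot_messages(
    job_title: str,
    examples: Sequence[Tuple[str, str]] = FEW_SHOT_EXAMPLES,
) -> List[Dict[str, str]]:
    """Build a chat message list: system prompt, example turns, then the new title."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for raw, cleaned in examples:
        messages.append({"role": "user", "content": raw})
        messages.append({"role": "assistant", "content": cleaned})
    messages.append({"role": "user", "content": job_title})
    return messages


messages = build_few_shot_messages("Praktikum Finance - Corporate Controlling (m/w/d)")
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the messages argument of a chat completion request; no request is sent in this sketch.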

Hope this helps :)

I tried your sample texts and had a hard time getting the AI to extract just the job title, so I ended up adding gender and location as parameters. This way, the AI fills in the other fields and leaves only the title in job_title. Here is my modified function:

{
    "name": "extract_job_title",
    "description": "Extract job title and other information from given text",
    "parameters": {
        "type": "object",
        "properties": {
            "job_title": {
                "type": "string",
                "description": "Job title without gender or location information"
            },
            "gender": {
                "type": "string",
                "description": "Gender part of the job title. Leave blank if not specified."
            },
            "location": {
                "type": "string",
                "description": "Location part of the job title. Leave blank if not specified."
            }
        },
        "required": ["job_title", "gender", "location"]
    }
}

If I run this in the Playground, then titles are cleaned up properly.
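Editorial sketch: one way to carry this function definition over into a script is to add a system message and to force the call with function_call={"name": ...} instead of 'auto', so the model cannot answer in plain text. The helper build_request, the constant name and the shortened system prompt are hypothetical, and whether this alone reproduces the Playground result is an assumption, not something verified here. The dict is only built and printed; no request is sent.

```python
import json

# The function definition from the post above, as a Python dict.
EXTRACT_JOB_TITLE_FUNCTION = {
    "name": "extract_job_title",
    "description": "Extract job title and other information from given text",
    "parameters": {
        "type": "object",
        "properties": {
            "job_title": {
                "type": "string",
                "description": "Job title without gender or location information",
            },
            "gender": {
                "type": "string",
                "description": "Gender part of the job title. Leave blank if not specified.",
            },
            "location": {
                "type": "string",
                "description": "Location part of the job title. Leave blank if not specified.",
            },
        },
        "required": ["job_title", "gender", "location"],
    },
}


def build_request(job_title: str) -> dict:
    """Build keyword arguments for a chat completion request that forces the function call."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            # Assumption: a short system prompt; the thread uses a longer one.
            {"role": "system", "content": "You are an expert at analyzing job titles."},
            {"role": "user", "content": job_title},
        ],
        "functions": [EXTRACT_JOB_TITLE_FUNCTION],
        # Naming the function forces the model to call it rather than reply in text.
        "function_call": {"name": "extract_job_title"},
    }


request = build_request("(Senior) Medical Director, TB & Infectious Diseases (Berlin)")
print(json.dumps(request, indent=2, ensure_ascii=False))
# To send it with the pre-v1 package used earlier in this thread:
#   response = openai.ChatCompletion.create(**request)
```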


This is the code used in the Playground.

# This code is for v1 of the openai package: pypi.org/project/openai
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {
      "role": "system",
      "content": "You are an expert at analyzing job titles. Your job is to filter out the pure job title. Make sure that no information such as location, gender, type of employment, seniority level or other additional information is included when returning the job title."
    },
    {
      "role": "user",
      "content": "job_title_1 = \"(Senior) HR Partner / (Senior) Personalreferent (m/w/d)\""
    },
    {
      "role": "assistant",
      "content": "HR Partner / Personalreferent"
    },
    {
      "role": "user",
      "content": "job_title_B = \"(Senior) Medical Director, TB & Infectious Diseases (Berlin)\""
    },
    {
      "role": "assistant",
      "content": "Medical Director, TB & Infectious Diseases"
    },
    {
      "role": "user",
      "content": "job_title_2 = \"Business Development Manager, OEM (San Francisco Bay Area, Pacific Northwest)\""
    },
    {
      "role": "assistant",
      "content": "Business Development Manager, OEM"
    },
    {
      "role": "user",
      "content": "job_title_3 = \"3) Field Service Specialist (Maryland)\""
    },
    {
      "role": "assistant",
      "content": "Field Service Specialist"
    }
  ],
  temperature=1,
  max_tokens=256,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0
)

If I add the parameters into my code, nothing happens and I still get the input values as output values.


I’m adjusting my post to describe what I want in more detail. Thank you in advance.

Hi, I haven’t used the JSON option yet, so that might be where your issue arises, but I am not sure. I got it to work in my Colab notebook without JSON. Here is the code:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

job_titles = [
    "(Senior) HR Partner / (Senior) Personalreferent (m/w/d)",
    "Business Development Manager, OEM (San Francisco Bay Area, Pacific Northwest)",
    "3) Field Service Specialist (Maryland)",
    "Director, Head of R&D Gdansk (Enzymes)",
    "Praktikum Finance - Corporate Controlling (m/w/d)",
    "Senior Medical Director, TB & Infectious Diseases"
]

for job in job_titles:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "You are an expert at analyzing job titles. Your job is to filter out the pure job title. Make sure that no information such as location, gender, type of employment, seniority level or other additional information is included when returning the job title."
            },
            {
                "role": "user",
                "content": job
            }
        ],
        temperature=1,
        max_tokens=256,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )
    print(response.choices[0].message.content)

I hope this helps :)
Mim

Many thanks! This is how it works when I only want to extract “one” piece of information.


How would you write the code if you want to extract multiple pieces of information?

job_titles = [
    "(Senior) HR Partner / (Senior) Personalreferent (Teilzeit)",
    "Praktikum Finance - Corporate Controlling (Berlin)",
]

{
    "job_title_clean": 
    [
        {
            "job_title": "HR Partner / Personalreferent",
            "location": "",
            "employment_type": "teilzeit"
        },
        {
            "job_title": "Finance - Corporate Controlling",
            "location": "Berlin",
            "employment_type": "praktikum"
        }
    ]
}

You would have to use function calling with JSON output, or send the content back to the API once for every piece of information you want. Or am I wrong here?
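Editorial sketch of the function-calling route for several fields at once: the schema declares an array of objects, so a single call can return one entry per input title. The function name extract_job_info, the helper names and the field descriptions are hypothetical, and simulated_arguments is hand-written to mirror the desired output, not a real model response.

```python
import json
from typing import Dict, List

# Hypothetical function definition: one array entry per input job title.
EXTRACT_JOB_INFO_FUNCTION = {
    "name": "extract_job_info",
    "description": "Split each job title into the pure title, location and employment type.",
    "parameters": {
        "type": "object",
        "properties": {
            "job_title_clean": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "job_title": {"type": "string", "description": "Pure job title only"},
                        "location": {"type": "string", "description": "Location, blank if absent"},
                        "employment_type": {"type": "string", "description": "Employment type, blank if absent"},
                    },
                    "required": ["job_title", "location", "employment_type"],
                },
            }
        },
        "required": ["job_title_clean"],
    },
}


def build_user_message(job_titles: List[str]) -> str:
    """Put one raw job title per line into a single user message."""
    return "\n".join(job_titles)


def parse_job_info(arguments_json: str) -> List[Dict[str, str]]:
    """Parse the function-call arguments and fill missing fields with empty strings."""
    entries = json.loads(arguments_json)["job_title_clean"]
    fields = ("job_title", "location", "employment_type")
    return [{field: entry.get(field, "") for field in fields} for entry in entries]


# Hand-written stand-in for response['choices'][0]['message']['function_call']['arguments'].
simulated_arguments = json.dumps({
    "job_title_clean": [
        {"job_title": "HR Partner / Personalreferent", "location": "", "employment_type": "teilzeit"},
        {"job_title": "Finance - Corporate Controlling", "location": "Berlin"},
    ]
})

for row in parse_job_info(simulated_arguments):
    print(row)
```

Batching several titles into one call is a design choice; sending one title per request with the same item schema is the simpler alternative if the model mixes up entries.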

It’s not clear to me what you are trying to get it to do here. What do you mean by ‘I only want … one piece of info’?

If you have 100 or so examples (1,000 would be better), I would suggest you look into fine-tuning a model, as it will learn more easily what you are trying to do and give you more consistent answers, possibly with a single call (depending on exactly what you’re trying to do).
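Editorial sketch of what preparing such examples could look like: chat-model fine-tuning data is a JSONL file with one {"messages": [...]} object per line. The helper names and the shortened system prompt are hypothetical; the three pairs are taken from the raw and cleaned titles earlier in this thread and stand in for the 100 or more examples suggested above.

```python
import json
from typing import Dict, List, Tuple

# Assumption: a shortened version of the instruction used in this thread.
SYSTEM_PROMPT = "Return only the pure job title, without location, gender or seniority."

# (raw title, cleaned title) pairs from this thread; a real dataset needs far more.
TRAINING_PAIRS: List[Tuple[str, str]] = [
    ("(Senior) HR Partner / (Senior) Personalreferent (m/w/d)", "HR Partner / Personalreferent"),
    ("3) Field Service Specialist (Maryland)", "Field Service Specialist"),
    ("Senior Medical Director, TB & Infectious Diseases", "Medical Director, TB & Infectious Diseases"),
]


def to_finetune_record(raw: str, cleaned: str) -> Dict[str, list]:
    """Wrap one example as a system / user / assistant conversation."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": raw},
            {"role": "assistant", "content": cleaned},
        ]
    }


def to_jsonl(pairs: List[Tuple[str, str]]) -> str:
    """Serialize all examples as JSONL: one JSON object per line."""
    return "\n".join(
        json.dumps(to_finetune_record(raw, cleaned), ensure_ascii=False)
        for raw, cleaned in pairs
    )


jsonl_text = to_jsonl(TRAINING_PAIRS)
print(jsonl_text)
# To save it for upload (the file name is a placeholder):
#   with open("job_titles.jsonl", "w", encoding="utf-8") as f:
#       f.write(jsonl_text + "\n")
```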