Transitioning from Codex to GPT-3.5

I just got started using the API with code-davinci-002 and have made a few prompts that give decent results. In light of the discontinuation of Codex, and while awaiting a documentation update from OpenAI, I would like to know some (best?) practices for code completion using the API and gpt-3.5-turbo.

A perhaps overly naive example prompt using the Codex API:

'''
Add two numbers
'''

# Python function
def add_numbers(a: int, b: int) -> int:
    ''' A Python function to add two numbers

    Parameters
    ----------
    a: int
    b: int

    ...etc...
    '''

# Python function

How would this prompt be formatted using the gpt-3.5-turbo model?

Here is an example just for you @good_bot
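
Roughly, the Codex-style prompt becomes a list of chat messages. A minimal sketch, assuming the Python openai package (the exact message wording here is just my guess, adjust to taste):

import openai

# The free-text "instruction" part of the Codex prompt moves into the system message
messages = [
    {"role": "system",
     "content": "You are a Python code generator. Reply with code only, no explanations."},
    {"role": "user",
     "content": "Write a Python function add_numbers(a: int, b: int) -> int "
                "that adds two numbers, with a docstring describing the parameters."},
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0,  # keep code output as deterministic as possible
)

print(response["choices"][0]["message"]["content"])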

Hope this helps you in some small way.

:slight_smile:

Appendix: Example 2

3 Likes

Thanks a lot for the quick reply! In your examples, where would you fit a code example as an “instruction”, as we do in Codex? Also, using the API, how would we state a role for code completions?

1 Like

My guess is that OpenAI will use the “system” role to replace “instruction”, but I am not very good at guessing or speculating, to be honest.

If you have a specific example you want me to run, I’m happy to do it for you.

:slight_smile:

The chat completion API has three roles: “system”, “user” and “assistant”. A conversation roughly goes “system instruction”, “user prompt”, “reply from model”, but API users can also set the “assistant” role to influence the completion.
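
For example, you can seed the conversation with a worked “assistant” turn as a one-shot example (an illustrative sketch; the message contents are made up):

messages = [
    {"role": "system", "content": "You translate natural language to Python. Output code only."},
    # The user/assistant pair below is a worked example showing the expected output format
    {"role": "user", "content": "Add two numbers"},
    {"role": "assistant", "content": "def add_numbers(a: int, b: int) -> int:\n    return a + b"},
    # The actual request comes last
    {"role": "user", "content": "Multiply two numbers"},
]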

HTH

2 Likes

We wrote an article about switching from Codex to ChatGPT for code completion, with some code samples:

2 Likes

I’m also looking into transitioning from Codex to GPT-3/3.5, and the token limit really is a huge downside.

It also seems like GPT-3/3.5 tokenizes code very differently compared to Codex.
Whitespace seems to get tokenized as well, which adds up.

When trying the OpenAI Tokenizer I can see a huge difference when using the same chunk of code - in some cases GPT-3 uses almost double the number of tokens for the same piece of code.

Does anyone have any experience with stripping out leading whitespace? Does it still work as well?
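
You can at least measure the impact locally with OpenAI’s tiktoken package and compare counts before and after dedenting (a quick sketch; the code snippet is arbitrary):

import textwrap
import tiktoken

code = """
    def add_numbers(a: int, b: int) -> int:
        return a + b
"""

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

# Token counts with and without the common leading indentation
print(len(enc.encode(code)))
print(len(enc.encode(textwrap.dedent(code))))

If the two counts come out close for your code, stripping the whitespace may not be worth it.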

@dliden thanks, excellent timing for that article! I will read it thoroughly and incorporate your findings into my attempts, for sure.

Re-reading the email from OpenAI regarding the model switch, they actually link to the Chat completions guide. That just makes the points in your article stronger.

1 Like

I got this prompt to work quite consistently on an example from your article:

Changing the table name in the instruction is picked up by the model:

1 Like

Nice! Keep me updated about your experiments. We’re definitely interested in zeroing in on the shortest prompt templates that still consistently result in code output without extra explanatory text.

EDIT: a tad more complex query:

1 Like

One interesting case we’re finding is that ChatGPT is sometimes a little more… literal?… than Codex. It’ll go out of its way to give exactly what was requested, even if a more concise query would have been good enough.

e.g.

-- Language PostgreSQL
-- Table = "penguins", columns = [species text, island text, bill_length_mm double precision, bill_depth_mm double precision, flipper_length_mm bigint, body_mass_g bigint, sex text, year bigint]
You are a SQL code translator. Your role is to translate natural language to PostgreSQL. Your only output should be SQL code. Do not include any other text. Only SQL code.
Translate "What is the most common species on each island?" to a syntactically-correct PostgreSQL query.

yields

SELECT island, species
FROM penguins
GROUP BY island, species
HAVING COUNT(*) = (
  SELECT MAX(count)
  FROM (
    SELECT island, species, COUNT(*) AS count
    FROM penguins
    GROUP BY island, species
  ) AS counts
  WHERE counts.island = penguins.island
  GROUP BY island
)

with ChatGPT. With Codex,

-- Language PostgreSQL
-- Table = "penguins", columns = [species text, island text, bill_length_mm double precision, bill_depth_mm double precision, flipper_length_mm bigint, body_mass_g bigint, sex text, year bigint]
-- A PostgreSQL query to return 1 and a PostgreSQL query for What is the most common species on each island?
SELECT 1;

returns

SELECT island, species, COUNT(*) AS count
FROM penguins
GROUP BY island, species
ORDER BY island, count DESC

The former more specifically addresses the prompt and doesn’t include extraneous island/penguin combos.

I’m not sure how I feel about this. I would like to find a way to calibrate any tradeoff between specificity and concision.
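
One thing I plan to try is stating that preference directly in the system message (an untested sketch):

messages = [
    {"role": "system",
     "content": "You are a SQL code translator. Translate natural language to "
                "PostgreSQL. Prefer the simplest query that answers the question. "
                "Your only output should be SQL code."},
    {"role": "user",
     "content": 'Translate "What is the most common species on each island?" '
                'to a syntactically-correct PostgreSQL query.'},
]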

When using gpt-3.5-turbo, did you try my system prompt, as shown in my latest screenshots?

Not yet, mostly because the docs currently say that:

gpt-3.5-turbo-0301 does not always pay strong attention to system messages. Future models will be trained to pay stronger attention to system messages.

The system prompt the above was generated with was:

you are a text-to-SQL translator. You write PostgreSQL code based on plain-language prompts.
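
Given that caveat, one workaround (if I’m reading the guide right) is to put important instructions in a user message instead. Something like this (sketch, untested):

messages = [
    {"role": "user",
     "content": "You are a text-to-SQL translator. You write PostgreSQL code "
                "based on plain-language prompts.\n\n"
                "What is the most common species on each island?"},
]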

For anyone interested, the following prompt and API call should work as-is when copy-pasted. I’m not sure that the query is correct, but at least it returns a code-only response.

import openai

penguin_messages = [
    {"role": "system",
     "content": "You are a strict assistant, translating natural language to PostgreSQL queries. Please do not explain, just write code."
    },
    {"role": "user",
     "content": """Translate the following sentence to PostgreSQL:\n\nWhat is the most common species on each island?\n\n-- Language PostgreSQL\n-- Table = "penguins", columns = [species text, island text, bill_length_mm double precision, bill_depth_mm double precision, flipper_length_mm bigint, body_mass_g bigint, sex text, year bigint]"""
    },
]

b = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=penguin_messages,
    max_tokens=1000,
    temperature=0.1,  # low temperature for more deterministic code output
)

# The generated SQL is in the first choice's message content
print(b["choices"][0]["message"]["content"])

returning

SELECT DISTINCT ON (island) island, species
FROM penguins
GROUP BY island, species
ORDER BY island, COUNT(*) DESC;
1 Like