One interesting case we’re finding is that ChatGPT is sometimes a little more…literal?..than Codex. It’ll go out of its way to give exactly what was requested, even if there was a more concise prompt that would have been good enough.
e.g.
-- Language PostgreSQL
-- Table = "penguins", columns = [species text, island text, bill_length_mm double precision, bill_depth_mm double precision, flipper_length_mm bigint, body_mass_g bigint, sex text, year bigint]
You are a SQL code translator. Your role is to translate natural language to PostgreSQL. Your only output should be SQL code. Do not include any other text. Only SQL code.
Translate "What is the most common species on each island?" to a syntactically-correct PostgreSQL query.
yields
SELECT island, species
FROM penguins
GROUP BY island, species
HAVING COUNT(*) = (
SELECT MAX(count)
FROM (
SELECT island, species, COUNT(*) AS count
FROM penguins
GROUP BY island, species
) AS counts
WHERE counts.island = penguins.island
GROUP BY island
)
with ChatGPT. With Codex,
-- Language PostgreSQL
-- Table = "penguins", columns = [species text, island text, bill_length_mm double precision, bill_depth_mm double precision, flipper_length_mm bigint, body_mass_g bigint, sex text, year bigint]
-- A PostgreSQL query to return 1 and a PostgreSQL query for What is the most common species on each island?
SELECT 1;
returns
SELECT island, species, COUNT(*) AS count
FROM penguins
GROUP BY island, species
ORDER BY island, count DESC
The former more specifically addresses the prompt and doesn’t include extraneous island/penguin combos.
I’m not sure how I feel about this. I would like to find a way to calibrate any tradeoff between specificity and concision.