Who is still using Codex?

Hi all!

I was wondering who is still using Codex, and why?
I feel unmotivated to keep experimenting with it, as OpenAI insists on keeping radio silence about Codex. For me it's an abandoned product.
It seems nowadays the new chat models are quite good at code, probably much better than codex.

So, is there a reason to continue using this?
help me understand :slight_smile:
thanks

1 Like

Yes, I am still using Codex.
Because it’s free and the max token limit is 8001.
Also, the response generated by Codex is enough for my scenario.

As I recall, the GitHub Copilot VS Code extension uses Codex, so that is a huge user base for the Codex model (millions of users).

:slight_smile:

See also:

How many users does [Copilot] even have though? The Visual Studio Code extension has been installed nearly 3.7 million times, while the Visual Studio tool has been installed nearly 154,000 times. According to GitHub, more than 1.2 million developers used Copilot’s technical preview in the past 12 months as of September, 2022. I wonder what the numbers are at now?

2 Likes

I mostly use Codex. It’s the least “creative” of the models, which is actually very handy for when you just want it to output something specific.

GPT-4 might be better, but since Codex is free it’s also infinitely more expensive. ChatGPT 3.5 has a higher hallucination rate. The others are not as good at this specifically.

1 Like

thanks for your replies, good to see codex still has fans :slight_smile:

I really want to keep using it and build things with it, but I have little confidence in doing so due to a lack of communication by OpenAI. For all we know, tomorrow they can pull the plug on Codex and tell you to use chatgpt because it understands your instructions better…

No.

There is close to zero chance OpenAI will pull the plug on their very successful Codex model, which drives one of their most successful products, GitHub Copilot, owned by Microsoft, with at least 2M users a month and growing :rocket:

:slight_smile:

1 Like

“pull the plug” can take other forms… like making it a paid product at GPT-4 cost… (we don’t expect it to be free forever, right?)

We’re still using it because it’s free, though some of our recent testing has shown that using the ChatGPT API with some careful prompting can give results that are as accurate as, or better than, Codex for text-to-SQL translation.

thanks for sharing this, quite insightful. Yes, I would guess ChatGPT would work best, as it is based on the instruct series and text-to-SQL is kind of like an instruction. IIRC, the “text to code” sample in the playground also uses davinci-003 instead of Codex, for that reason.

But I’m curious if we could get better results by changing the prompt a bit. Can you share the original full prompt?

1 Like

Sure! Here’s the prompt we’ve used the most with Codex:

-- language PostgreSQL
-- schema:
{schema}
-- be sure to properly format and quote identifiers.
-- A postgreSQL query to SELECT 1 and 
-- a syntactically-correct PostgreSQL query to {user_prompt}
SELECT 1;

and with ChatGPT

You are a SQL code translator. Your role is to translate
natural language to PostgreSQL. Your only output should be SQL code.
Do not include any other text. Only SQL code.

Use the following PostgreSQL database schema:

{schema}

Convert the following to syntactically-correct PostgreSQL query: {user_prompt}.

Where, in both cases, {user_prompt} is filled in with the user’s question or instruction for the database, and {schema} is filled in with some high-level details about the database (schema names, table names, column names, column types)
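For anyone who wants to reproduce the setup, here is a minimal sketch of filling the two templates in Python. The template text is copied from the post; the helper names `format_schema` and `fill`, and the dict layout used for the tables, are assumptions made for illustration, not the poster's actual code.

```python
# Templates copied from the post above; the helper names are hypothetical.
CODEX_TEMPLATE = """\
-- language PostgreSQL
-- schema:
{schema}
-- be sure to properly format and quote identifiers.
-- A postgreSQL query to SELECT 1 and
-- a syntactically-correct PostgreSQL query to {user_prompt}
SELECT 1;
"""

CHATGPT_TEMPLATE = """\
You are a SQL code translator. Your role is to translate
natural language to PostgreSQL. Your only output should be SQL code.
Do not include any other text. Only SQL code.

Use the following PostgreSQL database schema:

{schema}

Convert the following to syntactically-correct PostgreSQL query: {user_prompt}.
"""


def format_schema(tables):
    """Render {table: [(column, type), ...]} as one SQL comment line per table."""
    lines = []
    for table, columns in tables.items():
        cols = ", ".join(f"{name} {col_type}" for name, col_type in columns)
        lines.append(f'-- Table = "{table}", columns = [{cols}]')
    return "\n".join(lines)


def fill(template, schema, user_prompt):
    """Substitute the schema text and the user's question into a template."""
    return template.format(schema=schema, user_prompt=user_prompt)
```

Usage would look like `fill(CODEX_TEMPLATE, format_schema(tables), "What are the names of all the action films?")`. Whether the ChatGPT variant should receive the schema with or without the `-- ` comment prefix is not stated in the post, so treat that as an open choice.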

would you be able to give me the full filled-out prompt for the case where codex failed? I’d like to give it a go to get it working :slight_smile: (private msg if necessary)

1 Like

Oh, understood!

-- Language PostgreSQL
-- schema: 
-- Table = "actor", columns = [actor_id integer, first_name character varying, last_name character varying, last_update timestamp without time zone]
-- Table = "address", columns = [address_id integer, address character varying, address2 character varying, district character varying, city_id smallint, postal_code character varying, phone character varying, last_update timestamp without time zone]
-- Table = "category", columns = [category_id integer, name character varying, last_update timestamp without time zone]
-- Table = "city", columns = [city_id integer, city character varying, country_id smallint, last_update timestamp without time zone]
-- Table = "country", columns = [country_id integer, country character varying, last_update timestamp without time zone]
-- Table = "customer", columns = [customer_id integer, store_id smallint, first_name character varying, last_name character varying, email character varying, address_id smallint, activebool boolean, create_date date, last_update timestamp without time zone, active integer]
-- Table = "film", columns = [film_id integer, title character varying, description text, release_year integer, language_id smallint, rental_duration smallint, rental_rate numeric, length smallint, replacement_cost numeric, rating USER-DEFINED, last_update timestamp without time zone, special_features ARRAY, fulltext tsvector]
-- Table = "film_actor", columns = [actor_id smallint, film_id smallint, last_update timestamp without time zone]
-- Table = "film_category", columns = [film_id smallint, category_id smallint, last_update timestamp without time zone]
-- Table = "inventory", columns = [inventory_id integer, film_id smallint, store_id smallint, last_update timestamp without time zone]
-- Table = "language", columns = [language_id integer, name character, last_update timestamp without time zone]
-- Table = "payment", columns = [payment_id integer, customer_id smallint, staff_id smallint, rental_id integer, amount numeric, payment_date timestamp without time zone]
-- Table = "rental", columns = [rental_id integer, rental_date timestamp without time zone, inventory_id integer, customer_id smallint, return_date timestamp without time zone, staff_id smallint, last_update timestamp without time zone]
-- Table = "staff", columns = [staff_id integer, first_name character varying, last_name character varying, address_id smallint, email character varying, store_id smallint, active boolean, username character varying, password character varying, last_update timestamp without time zone, picture bytea]
-- Table = "store", columns = [store_id integer, manager_staff_id smallint, address_id smallint, last_update timestamp without time zone]
-- be sure to properly format and quote identifiers.
-- A postgreSQL query to SELECT 1 and 
-- a syntactically-correct PostgreSQL query to What are the names of all the action films?
SELECT 1;

This template worked for 9/10 prompts in our (very small) test suite but failed for this one, where it returned

SELECT title FROM film WHERE category_id = 1

The codex model in particular is quite sensitive to changes in the prompt. It’s fairly easy to evoke the “right answer” for this prompt, but doing so can mess up some of the others (e.g. removing the admonition to correctly quote identifiers results in two or three of the tests failing).
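Note that the returned query is wrong against the schema itself: `film` has no `category_id` column, since the category link lives in `film_category`. A rough sketch of catching this kind of failure automatically is below. It uses an in-memory SQLite database as a stand-in for PostgreSQL, which is an assumption made to keep the example self-contained; the column types are simplified, only the three relevant tables are created, and the helper names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in with only the tables this test touches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE film (film_id INTEGER, title TEXT);
        CREATE TABLE category (category_id INTEGER, name TEXT);
        CREATE TABLE film_category (film_id INTEGER, category_id INTEGER);
        """
    )
    return conn


def query_error(conn, sql):
    """Return None if the query compiles against the schema, else the error text."""
    try:
        # EXPLAIN makes SQLite resolve tables and columns without returning result rows.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)


conn = make_db()
bad = "SELECT title FROM film WHERE category_id = 1"
good = (
    "SELECT title FROM film WHERE film_id IN "
    "(SELECT film_id FROM film_category WHERE category_id = "
    "(SELECT category_id FROM category WHERE name = 'Action'))"
)
print(query_error(conn, bad))   # an error naming the missing column
print(query_error(conn, good))  # None
```

A check like this only tells you the query references real tables and columns, not that it answers the question, and SQLite's dialect differs from PostgreSQL's, so a real test suite would run against an actual PostgreSQL instance.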

1 Like

well I could get it to work using a one-shot example in the prompt, and making it break down the task into two steps :slight_smile:

I’ve gotten it to work with a few variations, but a lot of those variations break other tests in the test suite. My experience overall has been that it’s a little less forgiving/a little more variable than the ChatGPT API.

Can you share the prompt that you ultimately got to work? I’d be interested to see if it works well with some of the other test cases. Thanks!

...
-- Table = "store", columns = [store_id integer, manager_staff_id smallint, address_id smallint, last_update timestamp without time zone]
-- be sure to properly format and quote identifiers.
###
-- Instruction: list all spanish cities
-- Query: select all cities that are from the country spain
SELECT city FROM city WHERE country_id = (SELECT country_id FROM country WHERE country = 'Spain');
###
-- Instruction:: What are the names of all the action films
-- Query: select all films that are in the action category
SELECT title FROM film WHERE film_id IN (SELECT film_id FROM film_category WHERE category_id = (SELECT category_id FROM category WHERE name = 'Action'));

my prompt ended after the second “Query:”, so I let it work out the better English description before the actual query.
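In case it helps with running this against the other tests, here is a minimal sketch of that template in Python. The one-shot example is copied from the prompt above; the function names are hypothetical, and the assumption that the completion comes back as one description line followed by the SQL is mine, based on how the example is laid out.

```python
# One-shot example copied from the prompt above.
ONE_SHOT_EXAMPLE = """\
###
-- Instruction: list all spanish cities
-- Query: select all cities that are from the country spain
SELECT city FROM city WHERE country_id = (SELECT country_id FROM country WHERE country = 'Spain');
###"""


def build_one_shot_prompt(schema_comments, instruction):
    """Build a prompt that stops right after the second 'Query:' marker."""
    return "\n".join(
        [
            schema_comments,
            "-- be sure to properly format and quote identifiers.",
            ONE_SHOT_EXAMPLE,
            f"-- Instruction: {instruction}",
            "-- Query:",
        ]
    )


def split_completion(completion):
    """Split a completion into (restated description, SQL text).

    Assumes the first line of the completion is the restated description
    and everything after it is the query.
    """
    description, _, sql = completion.strip().partition("\n")
    return description.strip(), sql.strip()
```

Because the prompt ends at `-- Query:`, the model first restates the request in its own words and only then writes the SQL, which is the two-step breakdown described above.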

let me know how it works out for the other tests :slight_smile:

1 Like

Thanks! I haven’t tried that kind of template before. I’ll give it a try and report back.

“For all we know, tomorrow they can pull the plug on Codex and tell you to use chatgpt because it understands your instructions better…”

this thread aged well

5 Likes

Just got an email confirming that Codex is being discontinued as well.

2 Likes

Yeah, looks like Copilot will transition to GPT-4?

This is something which has caught me by surprise.

:slight_smile:

2 Likes

I received an email from OpenAI saying that they are discontinuing support for Codex. Is that correct?

1 Like