Side effects of training on open source code (A loveable Tutor)

magicpixie · September 16, 2021, 2:21am

I always start my prompts with a multi-line comment in Python, followed by whatever libraries I want to use. When I say I “ask” Codex for something, I technically don’t ask; I tell it what the “library” does, similar to how you’d see it described in a codebase. I’ve often had prompts trail off into massive single-line comments. That’s not necessarily a bad thing, and I’m sure it could be filtered using logprobs, but I’m just starting to learn how those work. But when I “asked” Codex to write code that finds the similarity between two sentences, I got several results that wrote some of the code and then commented on exactly what else needed to be done and why.
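For anyone unsure what that prompt shape looks like, here is a minimal sketch. It is not one of the prompts from the gist: the `build_prompt` helper, the task wording, and the `difflib` description are all made up for illustration.

```python
def build_prompt(task: str, libraries: dict[str, str]) -> str:
    """Build a prompt: a multi-line comment, then described imports.

    Hypothetical helper for illustration only, not taken from the gist.
    """
    # The multi-line comment states the task as if it were a module docstring.
    lines = ['"""', task, '"""']
    for name, description in libraries.items():
        # Tell the model what each library does, the way a codebase comment would.
        lines.append(f"# {name}: {description}")
        lines.append(f"import {name}")
    # End with a blank line so the model continues with code, not more header.
    return "\n".join(lines) + "\n\n"


prompt = build_prompt(
    "Find the similarity between two sentences.",
    {"difflib": "helpers for comparing sequences, including strings"},
)
print(prompt)
```

The resulting string is what would be sent as the prompt; everything after the trailing blank line is left for the model to complete.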

I posted three responses and the prompts I used as a gist; here’s the link.
Edit: fixed the link

It made me think that fine-tuning on open source repos could have given it a friendlier, more helpful tone, since it would mimic the voices of comments in the repos and, by extension, the open source community as a whole.

veered · September 29, 2021, 4:59pm

I’m also experiencing the massive, nonsensical single-line comments. They end up derailing the whole completion. Any luck working around that? In the playground, I’ve been pausing the completion and manually removing the comment, but that isn’t feasible for general use.

Edit: I’m seeing this happen with both Python and JavaScript.
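One possible workaround outside the playground is to post-process the completion text. The sketch below cuts the completion at the first single-line comment that exceeds a length threshold. The function name and the 120-character cutoff are assumptions picked for illustration, and it only looks at lines that start with `#` or `//`, so trailing comments after code are not caught.

```python
COMMENT_PREFIXES = ("#", "//")  # Python and JavaScript single-line comments


def truncate_at_runaway_comment(completion: str, max_comment_len: int = 120) -> str:
    """Return the completion up to the first overly long single-line comment.

    Hypothetical helper; the threshold is a guess and should be tuned.
    """
    kept = []
    for line in completion.splitlines():
        stripped = line.strip()
        # A comment line longer than the threshold is treated as the point
        # where the completion derails, so everything from there is dropped.
        if stripped.startswith(COMMENT_PREFIXES) and len(stripped) > max_comment_len:
            break
        kept.append(line)
    return "\n".join(kept)


sample = "def f():\n    return 1\n# " + "blah " * 50 + "\nprint('derailed')"
cleaned = truncate_at_runaway_comment(sample)
print(cleaned)
```

Short comments pass through untouched, so ordinary explanatory comments in the completion are kept.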

pappachuck · September 29, 2021, 6:22pm

Welcome to the club.
Sometimes it works, sometimes it makes your life difficult.

I have obtained both good and bad results with the same input.
