Side effects of training on open source code (A loveable Tutor)

I always start my prompts with a multi-line comment in Python, followed by whatever libraries I want to use. I say “ask,” but technically I don’t ask; I tell it what the “library” does, similar to how you’d see it described in a codebase. I’ve often had prompts trail off into massive single-line comments. That’s not necessarily a bad thing, and I’m sure they could be filtered using logprobs, but I’m just starting to learn how those work. But when I “asked” Codex to write code that finds the similarity between two sentences, I got several results that wrote some of the code and then commented on exactly what else needed to be done and why.
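For anyone who wants to try the same prompt shape, here is a rough sketch of how such a prompt could be assembled. The helper name `build_prompt`, the task wording, and the choice of `difflib` are all assumptions for illustration; they are not the prompts from the gist.

```python
from typing import Sequence


def build_prompt(task_description: str, libraries: Sequence[str]) -> str:
    """Assemble a prompt: a multi-line comment describing the "library",
    followed by one import line per library the completion should use.

    Hypothetical helper, not part of any Codex tooling.
    """
    # The description is written as a statement, not a request,
    # the way a module docstring would read in a real codebase.
    header = f'"""\n{task_description.strip()}\n"""'
    imports = "\n".join(f"import {name}" for name in libraries)
    return f"{header}\n{imports}\n"


# Example wording is made up for illustration.
prompt = build_prompt(
    "This module compares two sentences and returns a similarity "
    "score between 0 and 1.",
    ["difflib"],
)
print(prompt)
```

The resulting string is what would be sent as the prompt; the model then continues from the line after the imports.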

I posted three responses and the prompts I used as a gist; here’s the link.
Edit: fixed the link

It made me think that fine-tuning on open source repos could have given it a friendlier, more helpful tone, since it would mimic the voices of comments in the repos and, by extension, the open source community as a whole.

I’m also experiencing the massive, nonsensical single-line comments. They end up derailing the whole completion. Any luck working around that? In the playground, I’ve been pausing the completion and manually removing the comment, but that isn’t feasible for general use.

Edit: I’m seeing this happen with both Python and JavaScript.
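One possible workaround, sketched here under assumptions rather than tested against Codex itself: post-process the completion and cut it off at the first whole-line comment that runs past some length. The function name `truncate_at_runaway_comment` and the 200-character threshold are made up for illustration, and the check is only a heuristic (it looks at lines that begin with `#` or `//`, not at inline comments).

```python
# Comment prefixes for Python and JavaScript whole-line comments.
COMMENT_PREFIXES = ("#", "//")


def truncate_at_runaway_comment(completion: str, max_comment_chars: int = 200) -> str:
    """Return the completion up to, but not including, the first
    single-line comment longer than max_comment_chars.

    Hypothetical helper; the threshold is an arbitrary assumption.
    """
    kept = []
    for line in completion.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(COMMENT_PREFIXES)
        if is_comment and len(stripped) > max_comment_chars:
            # Treat everything from the runaway comment onward as derailed.
            break
        kept.append(line)
    return "\n".join(kept)


sample = "import difflib\n# short note\nx = 1\n# " + "blah " * 100 + "\ny = 2"
print(truncate_at_runaway_comment(sample))
```

This drops everything after the long comment, which matches the manual fix of pausing and deleting in the playground, but it will also discard any good code that happens to follow.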

Welcome to the club.
Sometimes it works, sometimes it makes your life difficult.

I have obtained good and bad results with the same input.