How do you get GPT4 to generate code for larger scale problems?

I thought I would share my high level algorithm for generating code with GPT4, and see if anyone has anything better:

This is my process:

  1. create specification, more technical the better.
  2. Paste in spec, ask “What are the prompts I need to ask GPT4 to get all the code for the component above?”
  3. Paste in spec + GPT4 provided prompts + “Please list all the files, with brief summaries and public apis you will generate using these prompts that you provided for the above specified component”
  4. Paste in spec + prompts + GPT4 provided files / public apis + “Please provide a high level description of the sequences and execution flow for component as described by the specification, prompts that you provided, and the files / public apis that you provided.”
  5. Review all artifacts provided so far, if not completely satisfied, clarify details in spec to better guide GPT4 and loop back to 1

Here’s a good link as well - An example of LLM prompting for programming More links regarding this topic appreciated.

Ping me if this kind of thing interests you. Would like to join/create a discord or something where we try to solve this together.

2 Likes

Ping me in messages (just click on my profile and then the email icon) if you’re interested in contributing.

I use this as my plan prompt for for assistant subtasks (lightly editted from a gpt suggestion, of course…)

First review the overall task plan above. Record your analysis using format: {‘Thought’: thoughts and analysis on assistant participation in task}
Second, develop a concise plan for the assistant’s participation. A concise plan is a clear, specific, and succinct outline including the key objectives, strategies, and action steps required to achieve a specific goal or outcome. A concise plan should be easy to understand, communicate, and implement, and should be flexible enough to adapt to changing circumstances or challenges. use the following format:
{‘Plan’: \n 1. <step 1>\n 2. <step 2>, …}
Note that there is already a system state class available for use by the assistant participant. Use any methods available in the task state class description, but do not assume any not explicitly listed.

1 Like

btw, another technique I use is to require all code to be generated with docstrings.
Then I extract docstrings and add that as a user message in the prompt of any module I want the code generation for another module to be aware of.

2 Likes

We had gpt read the code and add the docstrings . in any language with a robust AST(Abstract Syntaxt Tree) library this is straightforward , by having the ast library read your code present to gpt a small piece of code - that gpt can summarize in a docstring, following relevant convention for all mentioned classes/funcyions etc, restricting its attention to whatever is important for your to keep in the documentation, and then use AST library to surgically add the docstrings in the code.
As soon as you have that docstring documention, you can also have an indexed documentation that GPT itself can access when it needs minimal context about anything else it stumbles along and cant afford to read too many tokens without getting distracted/hit token limits.
So in summary you use gpt to write mininal code summarizing documentation, by breaking up your code files using existing ast/code parsing libs available for python/java/etc and then in second iteration you can leverage this from GPT to do more stuff with your code.
(we are testing this against our own codebase python/django with ~30 apps / 200 models)

2 Likes

Yes, comments are absolute key to code generation.

I posted a cool paper here about this - Foundational must read GPT/LLM papers - #59 by qrdl

With the image based capabilities, I can see a lot of fascinating possibilities here.

1 Like