yeah, it’s an interesting question. We’ve been speculating that GPT-4 gets its reasoning capabilities from training on so much code, so I’ve just been thinking that maybe we should talk to it in the language it’s thinking in.

1 Like

Not only code. Also from the text corpus it was trained on.

Deep reasoning relies significantly on logic, but also on information and contextual understanding, critical thinking, cognitive abilities, open-mindedness, and communication and dialogue. Most of these are what LLMs lack.
Whatever deep reasoning is present comes from statistical patterns and associations, not true comprehension or deeper understanding. LLMs excel at pattern recognition, but their knowledge is shallow, as they lack genuine comprehension or a conceptual understanding of the world.

So at first sight it looks like the deep reasoning a human can do, but it isn’t.

1 Like

This seems to work. I’m very curious how far I can push this.

I wonder if a problem might be that all the RLHF is getting in the way. It’d be cool if there were some kind of NDA-protected, non-RLHF version we could get access to, fully recorded of course for inspection by one and all.

Heh, there probably is. :stuck_out_tongue:

This works as well:

Please follow this algorithm and don't output additional text other than what's specified below:
result1 = recent research papers about large language models
result2 = recent research papers about protein sequencing
def get_theme(rp: research paper):
    return one or two phrases that best describe rp
def print_themes(result: research papers):
    themes = []
    for each rp in result:
        themes.append(get_theme(rp))
    print(themes)
print_themes(result1)
print_themes(result2)

If we can get the above to work well, the immediate and validated benefit would be reuse, I think. Superior reasoning is trickier to validate.
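
For anyone who wants to try this through the API instead of the chat UI, here is a minimal sketch of wrapping such an algorithm-style prompt in a call. It assumes the openai Python package (v0.x style) with an API key in the environment; the model name is only an example and get_completion is just a small helper, not anything official:

# Minimal sketch: send an "algorithm as prompt" to the chat completions API.
# Assumes the openai v0.x package and OPENAI_API_KEY set in the environment.
import openai

def get_completion(prompt: str, model: str = "gpt-4") -> str:
    """Send a single-turn prompt and return the model's text reply."""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the "program" as deterministic as possible
    )
    return response["choices"][0]["message"]["content"]

algorithm_prompt = """\
Please follow this algorithm and don't output additional text other than what's specified below:
result1 = recent research papers about large language models
result2 = recent research papers about protein sequencing
def get_theme(rp: research paper):
    return one or two phrases that best describe rp
def print_themes(result: research papers):
    themes = []
    for each rp in result:
        themes.append(get_theme(rp))
    print(themes)
print_themes(result1)
print_themes(result2)
"""

print(get_completion(algorithm_prompt))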

1 Like

Until recently it was possible to write prompts on GitHub and then import them into another prompt, e.g. importing five prompts inside one prompt without being limited by the max character count etc. It stopped working a few days ago / gives click issues now. Not sure if it’s a bug/issue or a block.

2 Likes

some wild and crazy prompt injection issues there for sure, but yeah, that could get fun.

One problem is that larger context windows seem to degrade reasoning, or maybe it’s just too many concepts to pay attention to at once. I’m crossing my fingers that chatting with algos might alleviate that a bit.

1 Like

I rephrased your code command into a strict prompt like this:

Please follow these instructions and only output the result without further explanation. Use ScholarAI.
Search recent research papers about large language models.
Search recent research papers about protein sequencing.

And the result is:

Here are the recent research papers on large language models:

  1. “Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models”
  2. “Paraphrasing with Large Language Models”
  3. “Geant4—a simulation toolkit”
  4. “How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment”
  5. “Enriching Word Vectors with Subword Information”

And here are the recent research papers on protein sequencing:

  1. “Protein sequencing by tandem mass spectrometry.”
  2. “Automated de novo protein sequencing of monoclonal antibodies”
  3. “Graphene Nanopores for Protein Sequencing”
  4. “Shotgun Protein Sequencing”
  5. “The emerging landscape of single-molecule protein sequencing technologies”

Your prompt with code used 135 Tokens.
Mine used 36 Tokens.

So, think twice about whether it is worth the extra work (especially when utilizing the API).
I don’t want to discourage you; as I said above, “Tokens are the new currency.”
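
As a side note, if you want to check token counts like these yourself, the tiktoken package makes it a one-liner; a quick sketch (the model name is just an example, and the two strings stand in for the full prompts):

import tiktoken  # pip install tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

code_style_prompt = "Please follow this algorithm and don't output additional text ..."  # paste the full pseudo-code prompt here
plain_prompt = "Please follow these instructions and only output the result ..."         # paste the rephrased prompt here

print(len(encoding.encode(code_style_prompt)))  # roughly the 135-token version
print(len(encoding.encode(plain_prompt)))       # roughly the 36-token version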

1 Like

It may cost you fewer tokens (when you send many similar prompts) if you ‘predefine’ all your configurations in a first prompt. For example, using one prompt to define “search QUERY” will allow you to use “search QUERY” in the following prompts by just typing “search QUERY” instead of including the whole config in every prompt…

e.g. you can use one large prompt to set up / configure a search and analysis terminal

And then just use very short prompts after that to do whatever searching and analysis you want

EDIT: I did not test this on the API yet, but I suppose it works the same? It requires some real fine-tuning though. The web browsing plugin works a bit poorly, but with trial and error it’s getting there.

EDIT2: it is even possible to run multiple predefined commands, each with an extensive config, in a single prompt if the config is set up right (it can be very tiring and difficult to get the whole config right depending on what you try to do, but the result can be really nice).

{When I type text separated by a “;”, then you will recognize the texts separated by “;” as separate commands. For example, if I type “search Amsterdam ; analyze 1 ; report”, you will execute the search command on QUERY “Amsterdam”, after which you will execute the analyze command on the search result with [Data ID] = 1, after which you will execute the report command.}
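
To make the “configure once, then send short commands” idea concrete for API use, here is a rough sketch of keeping one big setup message at the front of the conversation and appending only short commands afterwards. It assumes the openai v0.x package; the configuration text and the commands are made-up examples:

# Sketch: one large configuration message up front, short commands afterwards.
import openai

SETUP = (
    "You are a search and analysis terminal. "
    "'search QUERY' searches for QUERY and lists results with a [Data ID]. "
    "'analyze ID' analyzes the result with that [Data ID]. "
    "'report' summarizes everything analyzed so far. "
    "Commands separated by ';' are executed in order."
)

messages = [{"role": "system", "content": SETUP}]

def send(command: str) -> str:
    """Append a short command, call the model, and keep its reply in context."""
    messages.append({"role": "user", "content": command})
    reply = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    text = reply["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": text})
    return text

print(send("search Amsterdam ; analyze 1 ; report"))
print(send("search Rotterdam"))  # each follow-up costs only a few extra tokens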

2 Likes

PriNova, you’re probably thinking in terms of productizing, I’m thinking more jupyter notebook jazz.

The potential for re-use is attractive to me, imagine importing some functions (as mentioned above) which can do some sort of interesting reasoning based on ideas that would be useful for whatever I’m looking at that moment. Composition / Dependency injection / all that fun stuff could potentially be used.

It’s also possible we could perform superior inference by specifying our logic algorithmically rather than via natural language. Yet to be proven of course.

2 Likes

What an interesting thread! I can see some kind of prompt language being developed in the short term to save token usage. :slight_smile:

2 Likes

Yes, it is.
Especially when it comes to repetitive tasks within a session, like the approach @Apekool mentioned. I used the same approach for software design and implementation. I went a few steps further and pre-prompted some kind of ‘Menu’ to choose from and linked the commands in a pipe-like way.

For example, if I implemented some piece of code, my command chain after copy/pasting the code was: code_metric : refactor : chunking : code_metric.

This 4-step approach made working in ChatGPT very effective.

Here was the Menu in the first version I used:

ADD Menu options:

  • “next”: You always resume from the exact character you left off on.
  • “refactor [function, code snippet]”: You refactor the code or function into cohesive form.
  • “summary”: You provide a brief summary of all the information.
  • “symbolTable”: You provide a complete and compact list of the symbol table with return types and parameters used in this code or class formatted as a tree view ordered by class.
  • “todo”: You provide a categorized list of what is already implemented and what has to be done.
  • “codeMetrics”: Provide a brief metrics of the code like Complexity, Readability, Debuggability etc.
  • “chunk”: Refactor legacy code into smaller modular functions.
  • “menu”: You show the menu.
  • “library”: A comma-separated list of context-relevant keywords.
  • “exit”: We exit the build process.
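
A rough sketch of how a menu like the one above could be driven over the API, with the chain separator handled in code rather than by the model. It assumes the openai v0.x package; the menu texts are shortened paraphrases and the command names are illustrative:

# Sketch: the command menu as reusable prompt snippets, plus a ':'-separated chain runner.
import openai

MENU = {
    "code_metric": "Provide brief metrics of the code: complexity, readability, debuggability.",
    "refactor":    "Refactor the code or function into a cohesive form.",
    "chunking":    "Refactor legacy code into smaller modular functions.",
    "summary":     "Provide a brief summary of all the information so far.",
}

def run_chain(chain: str, code: str) -> str:
    """Run e.g. 'code_metric : refactor : chunking : code_metric' as sequential turns."""
    messages = [{"role": "user", "content": f"Here is the code:\n{code}"}]
    reply = ""
    for command in (c.strip() for c in chain.split(":")):
        messages.append({"role": "user", "content": MENU[command]})
        response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
        reply = response["choices"][0]["message"]["content"]
        messages.append({"role": "assistant", "content": reply})
    return reply  # output of the last step in the chain

print(run_chain("code_metric : refactor : chunking : code_metric", "def f(x): return x*2"))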

Now I’m using GitHub Copilot Chat (even for non-programming tasks; just prefix “Software” in your prompt and it can also write stories in a text file).

1 Like

So, does this mean this kind of prompt can cause the model to do MUCH more work internally for the same number of tokens? Sounds like it might break OpenAI’s pricing model of tokens == cost. As far as I can tell from my experiments in the playground, it works. For example, as far as I can tell, this ‘prompt’ goes through quite a few iterations before producing output:
(apologies to @PriNova for the sloppy pseudo-code, I’m lazy)

Respond ONLY with the final design.
designer is a function that accepts a single argument, ‘task’ and returns a single object ‘design’, a design for a system to perform that task.
critic is a function that accepts two arguments, ‘task’ and ‘design’, and returns a single object ‘critique’, a critique of the design with respect to the task definition.

queryGeneration is a function that accepts three arguments, ‘task’, ‘design’ and ‘critique’ and generates a return value, ‘query’, a question to ask the user to resolve ambiguities or priorities in design development. queryGeneration can return None if there are no ambiguities to resolve.

ask is a function that accepts a single argument, ‘query’, and returns ‘response’, the user response to query. Ask presents the query to the user, and then STOP to permit the user to respond.

editor is a function that takes four arguments, ‘task’, ‘design’, ‘critique’ and ‘response’ and returns a revised ‘task’ with additional information intended to improve the performance of the design function with respect to the critique.

Given the task: ‘design an automated assistant for writing technical papers’,
loop over designer, critic, queryGeneration, ask, editor until the design stabilizes.

Respond ONLY with the final design generated in yaml

btw, I do notice that the playground refuses to run the prompt twice without my modifying it a bit. Not quite sure why. Maybe a DoS-attack suppression filter or something.
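
For comparison, the same designer/critic/editor loop can also be driven explicitly from code, one API call per role, instead of asking the model to simulate the whole loop in a single prompt. A rough sketch, assuming the openai v0.x package; the role instructions are abbreviated, the queryGeneration/ask step is left out, and a fixed round count stands in for “until the design stabilizes”:

# Sketch: run the designer/critic/editor loop as separate chat completions.
import openai

def call(role_instructions: str, payload: str) -> str:
    """One chat completion with a role-specific system message."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": role_instructions},
            {"role": "user", "content": payload},
        ],
    )
    return resp["choices"][0]["message"]["content"]

task = "design an automated assistant for writing technical papers"
design = ""
for _ in range(3):  # fixed number of rounds instead of a stability check
    design = call("You are the designer. Return only the design.", task)
    critique = call("You are the critic. Critique the design with respect to the task.",
                    f"Task:\n{task}\n\nDesign:\n{design}")
    task = call("You are the editor. Revise the task so the next design addresses "
                "the critique. Return only the revised task.",
                f"Task:\n{task}\n\nDesign:\n{design}\n\nCritique:\n{critique}")
print(design)  # the final design after the last round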

Small note on the initial Twitter post:

Why no human in the loop?

He was the human in the loop: he enabled those plugins, and he could have pressed the stop button at any time to halt it.


Not mentioning this seems disingenuous…

I personally think that being able to store instructions in html on the web seems very useful :laughing:

Fwiw, I’ve been known to tab away to another task after starting a query on chat. Those twitter feeds won’t read themselves, ya know.

1 Like

Yes and no.
It does not really operate on the functions internally; they are simply remembered from the context window.
And yes, you can save tokens for repeated tasks. But it forgets the function definitions as the context window slides further down the conversation, so a hint not to forget them is sometimes helpful.

What I did while programming was to role-play the Assistant into the persona of Robert C. Martin, aka Uncle Bob. The Assistant knows a lot about this character’s skills. That saves tokens too.

For example, during a code iteration, my question was, “How would Uncle Bob implement this piece of code? Let’s work it out step by step”.
Another funny convo I had was asking to refactor a piece of code while the assistant played the role of Uncle Bob. My question was: “Hey Bob, how would Martin Fowler do a refactor of this code?” Assistant’s (Uncle Bob’s) response was, “You do not need to ask Martin Fowler to do that. My expertise and skills are sufficient to do the job. I would…” :rofl:

One more small comment on your last paragraph.
You instructed Assistant to “Respond ONLY with the final design generated in yaml”. If you give Assistant the possibility to execute your tasks step-by-step, the results will usually be better. There is a phrase that has performed very well in benchmarks: “Let us work this out step-by-step to ensure the right answer”.

Yes, but…

  1. Suppressing intermediate output keeps the defs from sliding off the left of context
  2. Suppressing intermediate output dramatically lowers cost, esp for GPT-4

I haven’t done rigorous tests, so I can’t say how much it would change things, but agreed, I use ‘reason step by step’ or something like it in most of my prompts. What are your thoughts here - is it important that it ‘verbalize’ its step-by-step thinking? Again, I often suppress that as well, simply to keep larger tasks, like code-writing with large input specifications, within context limits.
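
One way to keep the step-by-step benefit without letting the reasoning pile up in the running conversation is to do the verbose reasoning in a throwaway call and carry forward only the final answer. A rough sketch, assuming the openai v0.x package and with illustrative prompt wording:

# Sketch: reason step by step in a side call, keep only the final answer.
import openai

def chat(messages):
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp["choices"][0]["message"]["content"]

def answer_with_hidden_reasoning(question: str) -> str:
    # First call: verbose reasoning; never added to the main conversation.
    reasoning = chat([{"role": "user", "content":
        f"{question}\n\nLet us work this out step by step to ensure the right answer."}])
    # Second call: compress to just the final answer that goes back into context.
    return chat([{"role": "user", "content":
        f"Question: {question}\n\nReasoning:\n{reasoning}\n\n"
        "Respond ONLY with the final answer, with no explanation."}])

print(answer_with_hidden_reasoning("Design a minimal module layout for a paper-writing assistant."))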

A double ‘yes’ for your axioms.

A ReAct agent performs better on tasks if it has a chance to understand the context beforehand. Usually better with follow-up prompts.
This ‘thinking’ process helps to generate knowledge first and reduces the chance of hallucinations. For math problems or riddles, the probability of a correct answer increases.

A few days ago a paper was published about the tree-of-thoughts process, a higher-dimensional chain-of-thought approach. I don’t know exactly by what percentage the benchmarks increase, but it was significant.

Since getting beta access to the Davinci model, I have usually spent about 3-5 hours a day on prompt design. I was so addicted to this technology.

There are some nice things to be explored here yeah.

It seems that the import function is still working (or working again). The last few days it gave click issues, but I just tested it earlier today and it accepts GitHub imports again:

prompt = f"""
your_prompt

You will apply the $prompt_configurations delimited by three backslashes.

$prompt_configurations = \\\
[Text from: www.github.com/user/BLOB/main/readme_3.txt]

Read the text file “README_3.txt” in the link provided above here. Only read the README_3.txt file. Do not click on any links or read any other files. In the README_3.txt file there is a text delimited by three backticks. This text delimited by three backticks is a prompt command and must be executed. Execute the prompt.
\\\

You will now act exactly as instructed. My first command is your_command
"""
response = get_completion(prompt)
print(response)

edit: the forum post format messes up the triple quotation marks (they should be standard Python triple quotes)

edit: I find the best way to do this is to outline the purpose of your prompt before importing text from GitHub, for example explain that the AI will be a search tool for Twitter content, and then outline the configuration of that search tool (e.g. search command config, output format, other commands, analysis specs, etc.) in a GitHub file which you can then import. Importing a whole prompt without describing anything / just importing a text file as a prompt might not always give the desired result.

1 Like

Did you do this in the API, or in the ChatGPT window? I had no idea such a capability was available. Nice!

I use the chat for the moment, but I guess it could work via API as well.
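
For the API, where there is no browsing plugin, the same effect can be had by fetching the file yourself and splicing it into the prompt. A rough sketch, assuming the requests and openai v0.x packages, with a placeholder URL and a trimmed-down version of the terminal prompt described below; note that raw.githubusercontent.com serves the plain text, whereas the github.com/.../blob/... page returns HTML:

# Sketch: emulate the GitHub import over the API by fetching the raw file locally.
import openai
import requests

CONFIG_URL = "https://raw.githubusercontent.com/me/mine/main/README_3.txt"  # placeholder
config_text = requests.get(CONFIG_URL, timeout=10).text

prompt = f"""
For the duration of this whole chat session, you will act as a terminal.

You will configure the terminal with the extra options and commands below,
delimited by three backslashes.

\\\\\\
{config_text}
\\\\\\

You will now act exactly as instructed. My first command is help
"""

response = openai.ChatCompletion.create(
    model="gpt-4", messages=[{"role": "user", "content": prompt}]
)
print(response["choices"][0]["message"]["content"])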

It can help to instruct the AI to behave like a terminal (or any other application that already has functionality useful to you) and then to provide the configuration for that in an external file as shown above, so that basically you have your own AI terminal/app doing whatever it is configured for whenever you type commands in the chat. You can instruct the AI to recognize all your chat input as a terminal/app command, except for when typing text in triple parentheses, (((like this))):

prompt = f"""
For the duration of this whole chat session, you will act as a terminal.

When I type commands, you will reply with what the terminal should show. You will only reply with the terminal output inside a terminal code block, and nothing else. Do not write any characters outside of the terminal code block. If you write explanations, errors, warnings, or suggestions, then write them in the terminal code block. Do not type commands unless I instruct you to do so. When I need to tell you something in English, I will do so by putting text inside triple parentheses, (((like this))).

You will configure the terminal to include extra options and commands. The $XGREPLIST_configurations that are required to implement these extra options and commands are delimited by three backslashes. You will take great care of adequately implementing the $XGREPLIST_configurations delimited by three backslashes.

$XGREPLIST_configurations =
\\\
[Text from: https://github.com/me/mine/blob/main/README_3.txt]

Read the text file “README_3.txt” in the link provided above here. Only read the README_3.txt file. Do not click on any links or read any other files. In the README_3.txt file there is a text delimited by three backslashes. This text delimited by three backslashes is a prompt command and must be executed. Execute the prompt.
\\\

You will now act exactly as instructed. My first command is help
"""
response = get_completion(prompt)
print(response)

This example creates a terminal that has the extra options and commands defined in the imported configuration.

And there is a lot the AI can already do without configuring it. Also, when it says it cannot do certain things, for example providing sentiment analysis from four perspectives (USA, EU, RUS, UKR), if you create the right configuration(s) and keep pushing, it most certainly will.

1 Like