Hello everyone,
Just wanted to share an idea, a protocol, that I have been using with GPT-4 to reduce energy use - and to leave more capacity open for others.
The STE (Save the Environment) protocol simply uses less energy by responding more briefly. I am positively surprised by how "far" you can take it and how effective GPT-4 is in STE mode. I have run a full-length, real conversation thread in STE mode, and here are the results:
In this thread, you've used STE mode or shorter replies ~40% of the time
~180 total responses (approx.)
STE-activated replies: ~70
Savings per reply (approx.): ~1 Wh
Total saved: ~70 Wh
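The figures above can be sanity-checked with a quick back-of-the-envelope calculation. Note that the ~1 Wh saving per reply is my own rough assumption, not a measured value:

```python
# Rough estimate of energy saved by STE mode in one thread.
# All per-reply figures are assumptions, not measurements.
total_replies = 180          # approximate replies in the thread
ste_share = 0.40             # fraction of replies in STE mode
saving_per_reply_wh = 1.0    # assumed saving per STE reply, in Wh

ste_replies = round(total_replies * ste_share)   # 72, i.e. "~70" in the post
total_saved_wh = ste_replies * saving_per_reply_wh

print(f"STE replies: ~{ste_replies}, total saved: ~{total_saved_wh:.0f} Wh")
```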
Here is an overview of how it operates:
- Reduces Token Output
- Avoids Deep Reasoning Paths
- Simplifies Language Construction
- Skips Multimodal or Latent Activation
Every token = GPU time = electricity.
Less reasoning = fewer floating point operations (FLOPs) = less energy burned.
STE mode automatically switches to full mode (I call it "Back to Basics") when necessary, or you can switch manually.
Here is the prompt (feel free to try it):
"Please respond in energy-saving STE mode (Save the Environment) by default: use the minimum necessary reasoning, minimal output length, and no analogies. Mark such responses with [ste]. However, if the prompt clearly requires deeper logic, multiple steps, or complex synthesis, you may automatically switch to full reasoning mode ("Back to Basics"), and indicate this with [btb]. If unsure, prefer [ste]."
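If you use the API rather than the chat UI, the same instruction can be sent as a system message so STE mode applies to the whole conversation. A minimal sketch that only builds the request payload (the model name is an assumption, and actually sending the request with a client library is left out):

```python
# Build a chat request payload with the STE instruction as a system message.
# Sending it (e.g. via an API client) is omitted; this is just the payload.
STE_PROMPT = (
    "Please respond in energy-saving STE mode (Save the Environment) by default: "
    "use the minimum necessary reasoning, minimal output length, and no analogies. "
    "Mark such responses with [ste]. However, if the prompt clearly requires deeper "
    "logic, multiple steps, or complex synthesis, you may automatically switch to "
    'full reasoning mode ("Back to Basics"), and indicate this with [btb]. '
    "If unsure, prefer [ste]."
)

def build_request(user_message: str) -> dict:
    """Return a chat-style request dict with STE set as the system message."""
    return {
        "model": "gpt-4",  # assumed model name
        "messages": [
            {"role": "system", "content": STE_PROMPT},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_request("What year did the Berlin Wall fall?")
print(payload["messages"][0]["role"])  # system
```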
- Save some energy, one prompt at a time.
Thank you for your time. Let me know what you think, or if anyone has tried something similar.
Tatu