Any luck getting specific word count?

I’m trying to get ChatGPT to summarize an article for me in 500 words. It keeps producing around 120 instead. If anyone has had luck with this, please let me know.

If you give ChatGPT Plus a URL to browse, the synopsis will be quite short by design, with a link to read more.

You can paste the document in directly to get a fuller summary.

Prompt: “Give a long summary of this article” (then paste the article below the instruction), sent to GPT-3.5. Result: 417 words.

The article discusses the ongoing innovations in AI processing, focusing on the necessity for full-stack advancements in both hardware and software to meet the increasing computational demands of neural networks. A significant strategy to enhance efficiency involves the adoption of lower precision number formats, such as 8-bit floating point precision, which can lead to improvements in computational efficiency, reduced memory usage, and optimized interconnect bandwidth.

The industry has progressed from 32-bit to 16-bit and now to 8-bit precision formats, with Transformer networks particularly benefiting from 8-bit floating point precision. The implementation of a common interchange format is highlighted as crucial for rapid advancements and interoperability between hardware and software platforms.

NVIDIA, Arm, and Intel collaborated on a whitepaper introducing an 8-bit floating point (FP8) specification, aiming to accelerate AI development by optimizing memory usage and catering to both AI training and inference. This FP8 specification offers two variants, E5M2 and E4M3, and is natively integrated into the NVIDIA Hopper architecture, demonstrating promising results in initial testing.

FP8 minimizes deviations from existing IEEE 754 floating point formats to strike a balance between hardware and software, facilitating adoption and improving developer productivity. The E5M2 variant utilizes five bits for the exponent and two bits for the mantissa, resembling a truncated IEEE FP16 format. Meanwhile, the E4M3 variant pairs a four-bit exponent with a three-bit mantissa, trading numerical range for additional precision, and recovers part of that range by reclaiming bit patterns that IEEE formats reserve for special values.
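As a sanity check on those bit layouts (this sketch is not from the article): the largest finite value of each variant follows from its exponent and mantissa widths. The `ieee_inf` flag is an assumption based on the published FP8 specification, in which E5M2 keeps IEEE-style Inf/NaN encodings while E4M3 reclaims most of them for normal values.

```python
def fp8_max(exp_bits, man_bits, ieee_inf=True):
    """Largest finite value of a sign/exponent/mantissa format.

    With ieee_inf=True the all-ones exponent is reserved for Inf/NaN
    (IEEE-style, as in E5M2). With ieee_inf=False only the all-ones
    exponent plus all-ones mantissa pattern is NaN, so the format
    trades Inf for one extra binade (as in E4M3).
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_inf:
        max_exp = (2 ** exp_bits - 2) - bias   # all-ones exponent reserved
        max_man = 2 - 2 ** -man_bits           # mantissa 1.11...1
    else:
        max_exp = (2 ** exp_bits - 1) - bias   # all-ones exponent usable
        max_man = 2 - 2 ** -(man_bits - 1)     # 1.11...10 (all-ones is NaN)
    return max_man * 2 ** max_exp

print(fp8_max(5, 2, ieee_inf=True))    # E5M2 → 57344.0
print(fp8_max(4, 3, ieee_inf=False))   # E4M3 → 448.0
```

These maxima (57344 for E5M2, 448 for E4M3) match the values published in the FP8 specification, which is a useful check that the bit-width description above is consistent.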

Notably, the FP8 format utilizes only eight bits, saving computational cycles and eliminating the need for re-casting between precisions during AI training and inference. Testing has shown comparable accuracy to 16-bit precisions across various use cases, architectures, and networks, with significant speedups observed in transformers, computer vision, and GAN networks.

The article includes charts illustrating the accuracy performance of AI training and inference using 16-bit and FP8 formats across different network types, demonstrating comparable results between the two precisions. In MLPerf Inference v2.1, NVIDIA Hopper achieved a 4.5x speedup on the BERT high-accuracy model using the new FP8 format, enhancing throughput without sacrificing accuracy.

To encourage widespread adoption, NVIDIA, Arm, and Intel have released the FP8 specification in an open, license-free format and plan to submit it to IEEE for standardization. By embracing an interchangeable format that maintains accuracy, AI models can operate consistently across diverse hardware platforms, fostering advancements in the field. The article concludes by urging standards bodies and the industry to develop platforms capable of efficiently adopting the new standard to expedite AI development and deployment.



Two options:

  1. Include an example article and an example ‘perfect’ summary in your prompt, where the example summary (of the example article) is in the style, format, and word length that you want.


  2. Create a separate function that counts the number of characters or words in the LLM response. If the length is less than you want, automatically append another user message saying “Your summary is only X characters long, when it should be Y characters long.” Wrap this in a while loop in your code logic so it keeps appending that message until you get the desired length.
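Option 2 can be sketched in a few lines of Python. Here `ask_llm` is a placeholder for whatever function you use to call the model (e.g. a wrapper around your chat-completion call); it takes the running message list and returns the assistant's reply text. The `max_rounds` cap is an addition not in the post above, to avoid looping forever if the model never complies.

```python
def expand_until_long_enough(ask_llm, article, min_words=500, max_rounds=5):
    """Keep asking for a longer summary until it reaches min_words.

    ask_llm: placeholder callable, messages list -> assistant reply text.
    max_rounds: safety cap so a stubborn model can't loop forever.
    """
    messages = [{"role": "user",
                 "content": f"Give a long summary of this article:\n\n{article}"}]
    summary = ask_llm(messages)
    rounds = 0
    while len(summary.split()) < min_words and rounds < max_rounds:
        # Append the short reply, then complain about its length, as option 2 suggests.
        messages.append({"role": "assistant", "content": summary})
        messages.append({"role": "user",
                         "content": (f"Your summary is only {len(summary.split())} "
                                     f"words long, when it should be at least "
                                     f"{min_words} words long. Please expand it.")})
        summary = ask_llm(messages)
        rounds += 1
    return summary
```

Counting words with `split()` is crude but matches what the model tends to optimize for; count characters instead if that is your target.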

Option 1 is by far the best and most straightforward option. Just make sure the example summary you include is in the same format and style you want the LLM to generate. If your example has grammatical errors, etc., then the LLM is going to give you an unimpressive output!

People still don’t realise that EXAMPLES are much, much more important than INSTRUCTIONS in the prompt. As we have all seen, you can give all the instructions in the world, repeat them at the beginning and end of the prompt, and so on, and it still often won’t listen. However, you can include only examples and NO instructions, and if those examples share the style and format you want the response to have, the model will reproduce it.
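A few-shot prompt for option 1 might be assembled like this. The message structure follows the standard chat-completion format; `EXAMPLE_ARTICLE` and `EXAMPLE_SUMMARY` are placeholders you would replace with a real article and a hand-written ~500-word summary in exactly the style you want back.

```python
# Placeholders -- substitute your own worked example here.
EXAMPLE_ARTICLE = "...a short sample article..."
EXAMPLE_SUMMARY = "...a hand-written ~500-word summary of it..."

def build_fewshot_messages(article):
    """One worked (article, summary) pair, then the real request.

    The assistant turn in the middle is the 'perfect' example summary;
    the model imitates its length, style, and format.
    """
    return [
        {"role": "user",
         "content": f"Give a long summary of this article:\n\n{EXAMPLE_ARTICLE}"},
        {"role": "assistant", "content": EXAMPLE_SUMMARY},
        {"role": "user",
         "content": f"Give a long summary of this article:\n\n{article}"},
    ]
```

The key detail is that the example summary is presented as an assistant turn, so the model treats it as its own prior output rather than as an instruction to interpret.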
