Consistently Precise (and Improved?) GPT4 Word Count Constrained Output Using a 'Self-Tagging' Technique

I spent the weekend experimenting with GPT-4 to better understand its limitations in mathematical reasoning and its ability to adhere to strict output guidelines. I have some understanding of how tokenization works and the problems it causes for the model’s ability to count words. However, I had consistent success getting GPT-4 to perform long multiplication, in part through grouping and labeling, as seen here [ youtu.be/KqjNCuzNRH8 ], so I decided to approach the problem of constraining output word count the same way, by labeling the model’s output. Specifically, I instructed the model to wrap each word and pair it with a sequential number, its tag: (1- Like) (2- so). The goal was to increase attention bias toward the number and to have the model recognize each word as a whole rather than as individual tokens.
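A minimal sketch of how output in this tag format could be checked and cleaned up afterward. It is not the code used in the experiment; the regular expression, the function names, and the assumption that every word arrives as `(N- word)` with no parentheses inside the word are all illustrative.

```python
import re

# Assumed tag shape: "(12- word)". Punctuation stays attached to the word.
TAG_PATTERN = re.compile(r"\((\d+)-\s*([^()]+?)\s*\)")


def parse_tagged(text):
    """Return (number, word) pairs found in self-tagged output."""
    return [(int(num), word) for num, word in TAG_PATTERN.findall(text)]


def is_sequential(pairs):
    """True if the tags run 1, 2, 3, ... with no gaps or repeats."""
    return [num for num, _ in pairs] == list(range(1, len(pairs) + 1))


def strip_tags(text):
    """Rebuild plain prose from tagged output (paragraph breaks are lost)."""
    return " ".join(word for _, word in parse_tagged(text))


sample = "(1- The) (2- desert) (3- cactus) (4- stands) (5- tall.)"
pairs = parse_tagged(sample)
print(len(pairs), is_sequential(pairs))
print(strip_tags(sample))
```

Checking the tag sequence separately from the word count helps catch cases where the model skips or repeats a number even though the final tag looks right.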

After some trial and error via API chat completions, I’m receiving remarkably accurate output using this self-tag technique. What caught me completely off guard, however, was the output quality, which seemed improved in some ways while still adhering strictly to the constraints. One would expect the extra compute and the interruption of natural language to have the opposite effect. I’m not sure what to make of this yet.

Here’s an example chat completion, an exact 300-word essay, as it happened. I’ll include the actual output with the tagging removed at the end of the post:

youtu.be/NFzGApBw8P8

It appears that GPT-4 can accurately conform to a specified word count when a self-tagging approach is used to focus its attention on each word. Additionally, the model’s ability to structure content as if it had foresight, especially in the context of a fixed word limit, suggests a level of sophistication beyond basic next-token prediction. What mechanisms within GPT-4 might facilitate this apparent planning? What other areas can be improved with this concept of having the model essentially label its own data at the output level?

Does anyone know of any research papers on these topics? My aim is to explore these concepts with a larger data set and sample size; however, I need to address API cost to do so.
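As a rough sketch of what a larger evaluation might measure, the snippet below summarizes how close a batch of already-collected, tag-free outputs comes to a target word count. It makes no API calls. The function names and the choice of whitespace splitting as the definition of a word are assumptions.

```python
def word_count(text):
    """Count words as whitespace-separated chunks (an assumed definition)."""
    return len(text.split())


def summarize(outputs, target):
    """Summarize how far a batch of outputs lands from the target count."""
    if not outputs:
        raise ValueError("need at least one output to summarize")
    errors = [abs(word_count(o) - target) for o in outputs]
    return {
        "samples": len(errors),
        "exact_rate": sum(e == 0 for e in errors) / len(errors),
        "mean_abs_error": sum(errors) / len(errors),
        "max_abs_error": max(errors),
    }


# Toy batch with a target of 3 words; real runs would use model outputs.
batch = ["one two three", "one two", "a b c d"]
stats = summarize(batch, target=3)
print(stats)
```

Running the same summary on tagged and untagged prompts for the same set of topics would give a direct comparison between the two approaches.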

Full output below:

User: ‘Write me a 300 word essay on the ecology of the desert cactus’

Assistant:

"The desert cactus stands as a unique beacon of resilience in the harshest of environments. Its ecology is a complex web of interactions with various abiotic and biotic factors. Survival in the desert calls for exceptional adaptations, which cacti have in spades. Firstly, their thick skin reduces water loss. Significantly, they possess spines instead of leaves, minimizing the surface area for transpiration. The roots of desert cacti spread wide to capture sparse rainfall, while their stems store water for dry periods.

In terms of reproduction, the vibrant flowers attract pollinators, ensuring genetic diversity. Moreover, they offer shelter and sustenance to desert wildlife, thus playing a pivotal role in the desert ecosystem. Their presence assists in soil stability and prevents erosion by anchoring the sand.

Cacti are also a crucial component in the desert food web, serving as a resource for birds, insects, and mammals. Some species even rely on the cactus exclusively for water and nourishment during droughts. The relationship between cacti and desert inhabitants is symbiotic as some animals help in seed dispersal, thereby aiding cacti in propagation.

However, human activity and climate change are threatening these ecosystems. Habitat destruction, illegal collection, and the introduction of non-native species have the potential to disrupt the delicate balance. Desert cacti are adapted to survive extreme heat and aridity, but they are not invincible.

Conservation efforts must be enacted to protect these resilient plants and their roles within the ecosystem. Cacti epitomize the beauty and complexity of desert ecology, a testament to life’s perseverance. Even in adversity, they thrive and contribute to the biodiversity of their arid habitats. It is essential that we recognize and maintain the significance of desert cacti, not only for their intrinsic value but also for the survival of desert ecosystems and the myriad forms of life therein." (300 words)

Hey, this is really cool! Welcome to the forums. Thanks for the detailed explanation; I’d like to know more about the topic. Care to elaborate on your tagging method in more detail?

Appreciated, I’ve just updated the main post.

The original discussion was on r/OpenAI - Watch GPT4 Create A Precision 300 Word Essay By Utilizing ‘Self-Tagging’

Hi, I have spent quite some time on this topic, for over a year already. Here is an OpenAI response from over a year ago:

Unfortunately, this is not a feature that currently exists in our API. However, we recognize the value and demand for this functionality and we will log this as a feature request internally. We cannot guarantee when or if this will be implemented, but we will keep you posted if we have any updates.

It worked better with InstructGPT and the text-davinci-003 model, and seems to have gotten less accurate with the Chat models. GPT4 did not add much.

The sample from @NeuralNexus works well, but the more context and complexity you give it in the prompt, the bigger the chance that it will drift.

These solutions do add quite a bit of time to the writing; for around 100 words, it adds up to a minute or so.