Issue with AI-generated text: Concatenated words without spaces

I am using openAI 3.5 to generate content. Lately I am getting outputs with a lot of concatenated words with no spaces. Literally some full sentences/paragraphs with all the words concatenated, with no spaces. Plus some grammar problems as well. Does anyone have a solution to similar problems? Thank you in advance.

it’s been happening in the last few weeks consistently. It started happening maybe 1 out of 10 times, and now it happens pretty much every single time.

Could you post a sample of what caused it to generate this content? I haven’t seen it do this, but often things like this are related to the conversation.

Does it still happen if you start a new conversation as opposed to continuing an old one?

Sorry it’s not about chatgpt, which I have not seen this happen neither. The problem I am having is with the content generated using OpenAI API calls. The prompt used is quite simple, for example: You are a professional content writer, write an article about Summer. Use 6 paragraphs with 150 words in each.

Kindly ensure that you pay close attention to the category and tags you have selected.

The category Prompting leads us to understand you are talking about a ChatGPT prompt.
The tag chatgpt further enforces this idea.

I would suggest considering making adjustments to both of them in order to better align with your chosen topic.

The solution is likely to specify and reduce the temperature parameter. Temperature can cause non-ideal token generation. Start with 0.0 and work up from there to 0.6 if you need varied outputs to the same inputs. Reduced quality of generation (quantization?) already now gives you a simulation of higher temperature.

A few examples in earlier context is all it takes to cause unbreakable “training” with the way gpt-3.5 is now paying more attention to prior biases.

More training on code and function calling may have also increased the weight given to the next token possibility being a word without the preceding space included in its token, such as functionName that includes camelCase.

On top of the most recent reply about temperature, you may also want to reduce top_p - if it is 1 it can lead to the consideration of highly improbable next tokens in outputs

Yes, I should have used the API Tag here, but the category is correct, as it is related to prompt engineering.

Thank you for the info, I have tried with different temp, I do notice that with lower temperature it happens less often, but it still happens. When I started the project with GPT 3.5 (back in April-May) it rarely happened, now it is just constantly happening. However, using the same prompts and keeping all the other parameters constant - changing only the gpt model to gpt-4, this problem disappears… feels like I am forced to upgrade the model now.

Thank you for the info, I’ll try it :slight_smile: