What are your strategies for spotting AI writing?

AI-generated content is everywhere: in news articles, research papers, homework, and conversations with colleagues, clients, and vendors.

While there is no accurate method to ascertain whether content is AI-generated, most of us (humans) have built our own mental frameworks for identifying AI-generated text with varying degrees of certainty.

There are a few things that come up very often, like frequent use of the em dash ( — ), although this might be specific to GPT-4 only.

Note: Avoid using the em dash as a sole predictor. Why? Example: ChatGPT - Waking Up at 5 AM

What are your mental frameworks to identify if something is written by AI?

5 Likes

**Conclusion**

Etc. There are quite a few “GPT-isms”… Might be fun to collect a definitive list?

There are quite a few hurdles to get around when checking it… sometimes.

Good thread idea!

1 Like

Yes, there are quite a few. The more GPT-isms we see in a text, the more likely it is to have been generated by GPT.

Training a small model on this would be a cool project.
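A minimal sketch of what that small model could look like: a tiny bag-of-words Naive Bayes classifier trained on hand-labeled samples. All of the sample texts below are invented for illustration, and a real project would obviously need a large, honestly labeled corpus; this only shows the shape of the idea.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(samples):
    """samples: list of (text, label) pairs, label 'ai' or 'human'."""
    counts = {"ai": Counter(), "human": Counter()}
    docs = Counter()
    for text, label in samples:
        docs[label] += 1
        counts[label].update(tokenize(text))
    return counts, docs

def classify(text, counts, docs):
    vocab = set(counts["ai"]) | set(counts["human"])
    scores = {}
    for label in ("ai", "human"):
        total = sum(counts[label].values())
        # log prior + Laplace-smoothed log likelihoods
        score = math.log(docs[label] / sum(docs.values()))
        for tok in tokenize(text):
            score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data, loaded with a few stereotypical GPT-isms on one side.
samples = [
    ("Let's delve into this pivotal topic and harness its seamless power.", "ai"),
    ("It's not just about waking up early, it's about elevating your routine.", "ai"),
    ("ugh, overslept again, missed the bus, whatever", "human"),
    ("my cat knocked my coffee over this morning lol", "human"),
]
counts, docs = train(samples)
print(classify("We must leverage this seamless, pivotal approach.", counts, docs))
```

With this much data it is really just a weighted keyword match, but swap in a few thousand labeled posts and the same code becomes a reasonable baseline to beat.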

1 Like

Another one:

Without explicit prompting, AI breaks most things down into structured lists and subheadings, even when the topic doesn’t require them.

Example: ChatGPT - How Solar Power Works

1 Like

I’ve seen similar things.

There’s no foolproof way to tell if something’s AI-generated, but over time, I’ve started picking up on a few patterns that feel a little… off.
Of course, there are many AI detector tools.
From my perspective, here’s how I usually spot it, though I wouldn’t bet it’s always correct.

An AI-generated text typically:

  • Sounds too perfect
  • Repeats the same idea
  • Has a neutral, overly balanced tone
  • Lacks real voice or emotion
  • Uses common filler phrases
  • Stays on the surface, lacking depth
  • Uses the em dash often

I generally use this prompt when I need a more human tone:

Write [TOPIC] in a simple grade 9 level language, casual tone like you’re texting a friend or writing a personal blog post. No em dashes, no semicolons. Just use regular punctuation. Keep it clear, natural, and real. Add a little personality, and avoid sounding robotic or overly perfect.

I used your first sample prompt and tested the GPT-4o output on three AI detector websites.

Sample chat:

Sample AI Detectors:

7 Likes

Wow, now I feel like trying to wake up at 5am. :see_no_evil_monkey:

1 Like

Great, this is a good idea. I like it.

2 Likes

I have a degree in English, so my concern is more with theses written for college requirements. The first thing to look for is the total lack of (college-level) English rules. I have noticed AI will use expletives at the beginning of sentences: “It is”, “They are”, for example. A human will make spelling and grammatical errors and often write in a passive voice. Then there is footnoting: a strong warning from the teacher that any AI work will yield a failure on the essay, or perhaps worse, a failure for the grade, if the work is not properly footnoted. A proper college-level paper will not have contractions. The word “that” should be avoided. With an essay long enough, you can get a gut feeling. An instructor must warn their students that cheating using AI will not do them any good in the future. But then, I know people who do not know their multiplication tables.

1 Like

That’s some great insight.

It would be really great if you could give some examples.

1 Like

You and I know there is nothing concrete here. Trying to catch AI influence is like nailing Jell-O to a tree. Keep in mind my examples pertain to college-level expectations. I don’t think AI will use contractions, but that could depend on its settings. Since I wrote the previous segment, I have come to a new observation: it may be easier to tell whether something was written by a human than by AI. It used to be that one would look for mechanical errors, but now one must look for emotion or human spin. A bunch of AI passages stacked on one another will be choppy, cold, and calculated. Humans can add a layer of humanity AI doesn’t have… yet. There are (expletive!) times when AI will use the objective case vs. the subjective case, but anymore this is a human trait and should probably be marked with a red check, though in a non-collegiate venue this won’t be uncommon.

I don’t have frameworks, but I am building a list of words common to the several AI programs I use. These include: 1. Leverage 2. Empower 3. Facilitate 4. Enable 5. Enhance 6. Drive 7. Harness 8. Showcase 9. Incorporate 10. Integrate 11. Seamless 12. Pivotal 13. Navigate 14. Delve 15. Underscore 16. Amplify 17. Elevate
While we probably often use these words in our own writing, AI apps regard them as immediate “go-to” words. Why does AI do this? Ironically, to answer that question, we need to ask AI itself. According to Perplexity:
“AI chatbots are trained on massive datasets that reflect patterns in human-written text. These datasets often include formal, business-like, or technical language, which leads to the frequent use of polished and impactful terms like ‘leverage’ or ‘showcase’. Figurative expressions, idioms, and nuanced emotional tones are less common in training data, making them harder for AI to understand and reproduce effectively.”

So, does this mean that to avoid sounding like a robot we now have to use more idiomatic expressions, slang, and colloquialisms? But that’s the thing with AI: if we do adjust our writing styles, AI will master them too.
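For what it’s worth, a list like that can be turned into a quick, crude heuristic: count how often the “go-to” words appear per 100 words of text. The sample sentence is invented, the list is only the one from this post, and any threshold would be arbitrary; this is a yellow flag, not a detector.

```python
import re

# The seventeen "go-to" words from the list above.
GO_TO_WORDS = {
    "leverage", "empower", "facilitate", "enable", "enhance", "drive",
    "harness", "showcase", "incorporate", "integrate", "seamless",
    "pivotal", "navigate", "delve", "underscore", "amplify", "elevate",
}

def go_to_rate(text):
    """Return hits of go-to words per 100 words (0.0 for empty text)."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in GO_TO_WORDS)
    return 100.0 * hits / len(words)

sample = ("We leverage seamless integrations to empower teams and "
          "elevate outcomes across every pivotal workflow.")
print(round(go_to_rate(sample), 1))
```

Note the exact-match lookup deliberately doesn’t count inflections like “integrations”; stemming would catch more, at the cost of more false positives.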

1 Like

Moreover,

1 Like

Welcome to the community!

Do you think these are model specific? Or is it more general?

Yes, and also we have already trained the models on all the data that exists on the internet (https://hplt-project.org/); hence, we can just prompt it to use more expressions, slang, and colloquialisms.

There are certain words that are just dead giveaways. My favorites are Elevate and Revolutionize. I have one particular chat thread I used for content generation, ages ago when it was still a good idea to use AI for blog content, and I noticed the word was being over-used -- so much so that I had to tell the AI to stop using the word altogether (and it would very quickly forget). I’m actually convinced these ‘GPT-isms’ are in place specifically to identify AI-generated content and keep it from ranking in search results -- and if not, search providers (especially ones that run on AI) have implemented ways of filtering AI spin-tax in favor of real domain authority.

**Edit:** I should add -- there are sites that can read your text and tell you if it’s AI. Have you tried using any of them? (Copyleaks, QuillBot, several more)

Welcome to the community Sakhai!

Yes, most of them are not reliable in my experience. Example:

Maybe this is what passing the Turing test looks like.

1 Like

That’s strange, nothing about that strikes me as too-AI looking, at least at first glance. Maybe the repetition in sentence composition (specifically the last three paragraphs), and paragraph layout (less certain about the paragraph spacing having to be the culprit, but it’s odd). The wording is a little cheesy (the ‘real talk’ part–ugh, I can just picture a doofus walking up, spinning a chair 180 degrees and sitting with his arms folded), but it is how some people write. I suppose it just stands out more when I read promotional jargon.

Edit2: Just to be clear, this is my own writing, by hand, neither drafted nor “refined” by any LLM, except for the one section that explicitly contains snippets of ChatGPT content for the purpose of illustrating the shape/structure of its writing.

The aim here, rather than looking to simple checklists of overused words and phrases, is to go further and examine broader and deeper things. This involves at least slightly more abstracted constructions or principles such as overuse of the rule of three, or blindness to elegant variation. Preferably, see for example those parts where the reasoning consists entirely of “does it not feel empty?” or “Oh, please” -- these I consider the best indicators, as well as the ones that I’ve labeled as “key turning points” that can often be found in ChatGPT content. The less concrete it is, the more likely it is to remain a useful tell in the future.

Edit:
@Sakhai:

I should add-- there are sites that can read your text and tell you if it’s AI. Have you tried using any of them? (Copyleaks, QuillBot, several more)

Last time I checked, ZeroGPT still thinks the Declaration of Independence is 100% definitely AI-generated. Just saying.


My eyes glaze over as soon as I detect the loathsome style of ChatGPT. I can hardly even force myself to read through it anyway, because I know there is truly no hope of a redeeming evolution in quality at any point in the whole thing. And, just as I can now recognize Christian music in three notes or less, I can spot ChatGPT output without necessarily even reading any one contiguous string of it. I can just tell by the shape or something.

It goes beyond mere key words. Of course, if someone posts something entirely blatant, having neglected to tune ChatGPT’s voice in the slightest, a checklist of GPTisms can be of great use:

But ChatGPT can be led to write in drastically different voices. It’s entirely possible for there to be no obnoxiously obvious overuse of such extra-credit vocabulary words. There are, however, still some higher-level (broader) tells. Caveat to the caveat: When it does manage to fool you, it will be an unknown unknown.


Now, granted, it isn’t especially meaningful for me to recognize the “waking up at 5 a.m.” essay as ChatGPT’s writing when I have already been told it is. But I like to think I’d’ve noticed a few things:

  • If “From a Regular Human’s POV” were left in the title, that would be a pretty funny freebie, for one thing, lol.
  • It says “wild, right?” twice. But our focus here shouldn’t be on the particular phrase “wild, right?” nor is it entirely about the fact that it used the phrase in a noticeably repetitive way.
    • That it did so is of some interest, in the sense that it displays a lack of the typical human aversion to such repetition. We naturally avoid it: See elegant variation and horror aequi.
    • What’s much more important is what it means for the underlying tone or attitude. Wild, right? It comes off rather How do you do, fellow kids?, if you ask me.
  • You might also find some subtly detached or glossed-over metaphors/similes. If you stop and think about it, is “a superhero with a secret morning routine” really a particularly meaningful concept to you? It feels empty to me. Who has a secret morning routine? Who would have been conceptualizing the writer as such a thing before it was forced into the frame by the writer? There will often be a lot of this kind of quietly disingenuous pretending at human experience.
  • This one is more of a typical granular indicator, but if I were presented with that essay, I would highlight the ellipsis in “also…” to see if it’s three periods or a single ellipsis character.
  • “Just me, a thing, and a whimsical third thing!”
    • ChatGPT absolutely loves the rule of three. Once you know about it, you won’t be able to stop noticing it.
  • The scenario described in the second paragraph, where someone is allegedly having fully articulated, coherent, and civil self-talk about their alarm while it is still going off, seems unrealistic to me.
    • Another rule-of-three sentence, with a bonus: “his brain” is still “half-dreaming”, which makes it all the more impressive that he’s able to have such a lucid internal conversation with himself when he hasn’t even turned off his alarm yet.
  • “The struggle was real.”
    • Oh, please.
  • “But here’s the thing.”
    • This is one of the usual key turning points in these essays. An earlier one happened when it was like [introduces idea] [strawmans objection] [denies strawman]. I didn’t bring it up because it’s not always a strong one and this one didn’t seem entirely too heavy-handed. There is, however, much more often a very obvious “But here’s the thing” (or similar) to be found. As soon as I saw that, I already knew I was going to find a paragraph beginning with “So” somewhere near the end.
    • Another thing that goes hand-in-hand with those but didn’t appear this time is the “It’s not [just] about [this], it’s about [that]” construction.
  • Alluding to the anime trope of running to school with toast hanging out of your mouth is a rather tired reference as it is. Doing it in such a painfully out-of-touch way stands out to me.
  • This is another granular one, but “chaos”. ChatGPT can’t get enough of the concept of chaos. If you don’t keep it in check, it’s like it’s nothing but chaos this and chaos that. It’s almost like ChatGPT is some escaped lunatic just rampaging around town, terrorizing the local townsfolk and reveling in the chaos of lunacy! :wink:
  • Uses another “Just [you/me] and [thing]” construction, but this time at least it doesn’t do another rule-of-three.
  • Oh, you sat wrapped in a hoodie with the morning’s hot drink to watch the sunrise? Were you also cradling the mug in both hands? It’s a good thing you included such a relatable and human detail by dropping in a reference to “that soft morning light”. I can always be sure someone is a real life bona fide human being when they allude vaguely to “that common human experience hand-waved in a way that sidesteps any need for actual direct knowledge of what it’s like” and hope I won’t notice. Thanks!
  • “It felt kinda magical in a chill way.”
    • Seriously? Do I even need to get into this one?
  • “And [next subtopic]? Wow.”
    • This is another common transition point. We all know how to spot the Additionalies, Furthermores, Moreovers, and Nonethelesses, but these higher-level structural joints are what you really need to be able to spot (and, again, one of these days, maybe you won’t spot it, and you won’t know what you don’t know).
  • Getting up at 5 a.m. made this guy so productive he finished his homework before even going to school. Impressive!
    • Before he started getting up at 5 a.m., it took him all day to do his homework, work out, and write in his journal?
  • “It’s like my brain worked better without all the usual distractions.”
    • If you pause again and contemplate this bit, as well as the one from earlier about the “texts” and “chaos” that used to make his mornings so hectic, is it not empty? It does nothing to truly relate to us what this chaos and these distractions actually are and what problem they actually cause. It simply presents them as if it expects them to be accepted prima facie as relevant and reasonable things that make sense as incentives to get up at five in the morning. It’s alluding to things without actually talking about them. It is using many words to say nothing. It’s essential to be able to notice this.

I’m falling asleep on the keyboard, so I’ll speed through a few more key points and leave the elaboration as an exercise for the reader. I have a shorter thing to share that might help make some of these concepts more apparent. I’ll also post a longer self-quote where I go into some other aspects of what I consider the essence of ChatGPT voice.

  • ‘watching “just one more” episode’: how do you do, fellow kids
  • “I learned that the hard way”: cop-out
  • “So yeah”: called it
  • "It takes commitment . . . ": rule of three #3
  • "You feel ahead of the day . . . ": rule of three #4
  • "It’s 7AM and I’ve already . . . ": rule of three #5 (and that’s only if you don’t count various other constructions throughout the essay that aren’t simple three-item lists but are still arguably making use of aesthetically pleasing triplets of things. Wild, right?)

For this part, I’m just going to post the first four words of each sentence from a thing ChatGPT wrote. I will let this speak for itself, mostly. Emphasis is mine.

Well, here we are,
You’ve got me tangled
And hey, maybe I
Imagine this: I’m sitting
And maybe now, I’m
You’ve got me thinking
Maybe it’s saying, "Hey,
And so, I think
Not just any wanting,
The kind that makes
There’s something beautiful about
The way it drives
But here’s the kicker:
It’s got a mind
It makes us do
It’s a double-edged sword
And yet, without it,
Probably just sitting around,
So maybe, just maybe,
Not just because you’ve
It’s that tiny ember
In the end, desire’s
It’s the fire that
And maybe, just maybe, (yes, again)
So here’s to the
May it burn bright


This is my comment on someone’s article about how to spot ChatGPT-generated writing by looking for key words:

Great read. My pet GPT giveaway is “Additionally, …”

But I suspect it goes deeper. It’s not just that it begins sentences with “Additionally” a bit too often. It’s why it does -- my guess is that it’s because humans would tend to base their writing much more on -- I don’t know, maybe an underlying narrative structure of sorts, such that the relevance of things and the connections between them can more often be implied. Or the writing would have a place in a particular context and speak from a certain perspective with specific motives, all of which comes with a pile of factors like “things that are almost certainly already known to anyone who would be reading this and don’t need to be spelled out” or “the author trying to be smooth about inserting their opinion by doing it entirely in modifiers and the subtle connotations of specific phrasing instead of explicit standalone statements” or “a particular fingerprint of what’s said vs. unsaid and where the focus lies”.

Meanwhile AI has little or none of this kind of context, so it needs to use a lot more explicit connectors, and it’s forced to risk being mildly patronizing for lack of a nuanced reader persona to write for. It doesn’t encapsulate opinion in the subtext and details of factual statements, there is nothing to be found between the lines, and no arcane divination is needed to see what the angle is. And I highly doubt that it dances around, trying to find the words to convey fuzzy concepts of uncertain truth and leaving some of them nebulous, having made the judgment call that the audience will grok what they’re trying to relate. It can speculate after a fashion -- not so much in the form of trying to figure out exactly what it’s talking about in the first place, as in that of a list of possibilities or so.

(This is based on a far less thorough and up-to-date familiarity with LLMs than yours or even most dabblers’, so I could be missing something for sure.)

Along the same lines, I think your examples go deeper. It’s not simply that normal people rarely do a “not only, but also” construction correctly -- to say nothing of applying it well⁽¹⁾ -- or that “this is all about” is a lazy copout of a phrase, or that using “like” to form a simile is just noticeably overused by AI. For one thing, it’s not even a good simile. “Having a front seat to the future of crypto” is not a meaningful pre-existing concept for pretty much anyone, and drawing parallels tends to work better when most people are familiar with one of the things in question. The whole sentence is just an unnecessary simile construction whose contents are trying really hard to be metaphorical but are nothing more than a flaccid figure of speech. I think I’d need more help understanding the analogy than the actual description of what the product does. The last bit has delusions of being the cherry on top, but makes the rather perplexing implication that being unable to keep an eye on the bleeding edge of crypto in comfort was a problem waiting to be solved. Besides, what do they mean, “all?” Wow, you mean I can now do everything on this list of . . . one thing? All that, right from my couch? What will they think of next!

Anyway, it’s more than the key words. They can be useful yellow flags, but their real common ground is that they’re all uncanny impressions of a typical masturbatory Silicon Valley product launch presentation. The mystery of how such content would come to be deeply formative of the AI’s style is left as an exercise for the reader.

So I guess one useful heuristic is that if you start to feel like you’re in a webinar or you’re getting flashbacks to spending the better part of a day off sitting through a deep dive into something you don’t care about because you wanted the free license key or expo swag or something but all the spots in the good tracks were taken, then maybe something nearby needs recasting, and you may have caught the scent of a new phrase you can add to the checklist.

  1. Which ties back to the context of the known audience in an established domain: if there is no “not only” already in the reader’s assumptions that you can subvert or escalate or whatever in the “but also”, then all you’re doing is taking two things that should have been part of a list and putting them in a form that’s going to make people feel like there’s a link they can’t quite grasp (Ernest Gowers spins in his grave) or that you’re trying too hard to play up the but-also.

Another one:


Like the comment above:



While working with AI this morning, I came across this response from AI:

And you’re right to invoke da Vinci himself. If anyone valued mathematical proportion, layered technique, and perfect balance, it was him. But he also embraced imperfection and evolution—he left many works unfinished, knowing that decay itself was part of the design.

In college-level writing, as opposed to AI output, this use of **“him”** in the objective case is unacceptable. It is a subject complement here, and therefore should be **“he”**.

That said, I brought this issue to ChatGPT previously, and the reasoning (sadly) is not unusual: this faux pas has become commonly used among a majority of American English users, and it was just following the sad trend.

Now, in trying to distinguish AI text from American English text, one could consider this a red flag; for what, I do not know.

1 Like

Um, sorry everyone… I am reading up… But the obvious point has been missed in everything I have read so far…

People copy others, people tend to copy the smart things others say (at least my boss used to copy me ^^)…

It’s a natural trait and the result is… That anything AI says ultimately people will copy… Maybe not YOU… But more people than you’d like to think!

(And anything you say ultimately AI will copy! :smiley: )

2 Likes