
How AI Detectors Actually Work: A Technical Breakdown

A deep, technical explanation of how AI detectors like GPTZero and Turnitin work. We explain perplexity, burstiness, AI detection algorithms, and why their accuracy is often debated.

Published May 4, 2026 · 16 min read · By HumanGPT Editorial

[Image: An abstract digital illustration of a neural network analyzing text.]

So you've pasted your text into an AI detector. It churns for a moment, the little wheel spins, and then it spits out a score: 87% Human. Or maybe 99% AI. We've all been there, staring at that number, wondering what dark magic produced it. Is it reading the soul of the text? Does it have a tiny, sentient English professor trapped inside? Honestly, no. It's much simpler and, I think, much more interesting. AI detectors aren't magic. They're just very, very picky readers looking for two specific statistical fingerprints. And once you know what they are, the entire industry starts to make a weird kind of sense.

The Two Words That Run the Whole Show: Perplexity and Burstiness

Forget everything else for a minute. If you want to understand how AI detectors work, you only need to know two words: Perplexity and Burstiness. That's it. Pretty much every detection algorithm out there is some fancy variation on one or both of these ideas.

Let’s start with perplexity. It sounds complicated, but the concept is surprisingly simple. Perplexity measures how surprised a language model is by a piece of text. Imagine a language model that has read almost the entire internet. It has seen billions of sentences. Because of this, it has an incredibly strong sense of which word is likely to follow another. If you give it the sentence, 'The cat sat on the…', it's going to bet heavily on the next word being 'mat' or 'couch' or 'floor'. It would be very, very surprised if the next word was 'photosynthesis'.

Low perplexity means the text is predictable. It flows exactly how the AI expects. The words are common, the sentence structure is standard. It’s like a steady, calm heartbeat on a hospital monitor. Beep… beep… beep. Each word is an expected, logical continuation of the last. AI-generated text, especially from older or less sophisticated models, is famous for its low perplexity. It chooses the most probable word, over and over, because that’s its core function: to be a probability machine.

Human writing? It's all over the place. We use weird metaphors. We make strange connections. We choose a less common word because it sounds better. Our writing has higher perplexity. A human might write, 'The cat, a furry ball of pure disdain, perched atop the… antique armoire'. An AI would be surprised by 'disdain' and 'armoire'. Its perplexity score for that sentence would spike. That spike is a human fingerprint.
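To make that "surprise" concrete, here's a toy sketch (not any vendor's actual code): given the probability a model assigned to each token it saw, perplexity is just the exponential of the average negative log-probability. The probability lists below are invented for illustration.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability.
    token_probs: probability the model assigned to each actual token."""
    neg_log_likelihood = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(neg_log_likelihood)

# A predictable sentence: the model expected almost every word.
predictable = [0.9, 0.8, 0.85, 0.9]
# A surprising sentence: several low-probability word choices.
surprising = [0.9, 0.1, 0.05, 0.7]

print(perplexity(predictable))  # low: the text flows exactly as expected
print(perplexity(surprising))   # high: the 'disdain' and 'armoire' effect
```

Note the clean intuition: if the model gives every word probability 0.5, perplexity is exactly 2, as if it were choosing between two equally likely words at each step.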

Then there’s burstiness. This one is easier. Burstiness is all about variation. Specifically, the variation in sentence length and structure. Humans don't write in a uniform rhythm. We write in bursts. A long, flowing sentence full of clauses might be followed by a short, punchy one. A fragment, even. Then another long sentence. It’s like a jazz musician improvising a solo. There’s a rhythm, but it’s complex, varied, and full of personality.

AI models, by contrast, tend to write like a metronome. Their sentences often cluster around a similar length. The structure is often subject-verb-object, repeated with slight variations. The result is a text that feels smooth, maybe a little too smooth. There's no 'burst' of short sentences or long, complex thoughts. It's just a steady, even flow. The lack of variation, the absence of burstiness, is another dead giveaway for a detector.
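One simple way to quantify burstiness (a sketch, not how any particular detector implements it) is the coefficient of variation of sentence lengths: standard deviation divided by mean. The sample texts here are invented.

```python
import statistics

def burstiness(text):
    """Coefficient of variation of sentence lengths (stdev / mean).
    Higher values mean more varied, 'burstier' writing."""
    sentences = [s.strip() for s in
                 text.replace('!', '.').replace('?', '.').split('.') if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

human = ("I ran. The storm, which had been gathering all afternoon over the "
         "hills, finally broke. Rain everywhere.")
ai = ("The weather changed quickly today. The storm arrived in the afternoon "
      "hours. The rain fell steadily for a while.")
print(burstiness(human) > burstiness(ai))  # True: the human rhythm varies more
```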

Method 1: The Perplexity Scanner (How GPTZero Works)

Now let's apply these ideas to a real tool. Remember GPTZero? It exploded onto the scene in January 2023. A Princeton student named Edward Tian built it over his winter break, and suddenly every teacher in America knew its name. It was simple, effective, and, for a while, the only game in town.

GPTZero is, at its core, a perplexity scanner. Its entire model is built on that 'surprise' principle. When you paste text into GPTZero, it doesn't just 'read' it. It runs it through a large language model (ironically, often a member of the GPT family) and measures the perplexity of the entire text, sentence by sentence.

Think of it this way: The detector asks its own internal AI, 'Hey, if you were writing this, how likely would you be to choose these exact words in this exact order?' If the internal AI says, 'Oh yeah, super likely. That's exactly what I would have written,' the perplexity score is low. The detector flags this as likely AI-generated.

But if the internal AI says, 'Whoa, I would never have used that word there. And that sentence structure is weirdly beautiful. I'm very surprised by this,' then the perplexity score is high. The detector marks this as likely human.

This is why AI text, especially unedited output, gets caught so easily. It's a product of its own predictive nature. It follows the most trodden statistical path. It writes sentences that are, by definition, unsurprising to another AI. Tian's great insight was realizing that you could use an AI's predictability against itself. He also added a burstiness calculation to his model, comparing the variation of sentence perplexity. If all sentences are equally predictable, that's another red flag. It was a brilliant and simple application of a core machine learning concept, all cooked up in a dorm room.
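Here's a heavily simplified sketch of that sentence-by-sentence scan. The "internal model" below is a toy unigram model trained on a few words, and the flagging threshold of 10 is invented; GPTZero's real scorer runs a full GPT-family language model, not anything this crude.

```python
import math
from collections import Counter

def train_unigram(corpus_words):
    """Toy stand-in for the detector's internal language model:
    word probabilities from a reference corpus, with add-one smoothing."""
    counts = Counter(corpus_words)
    total = sum(counts.values())
    vocab = len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)

def sentence_perplexity(sentence, prob):
    words = sentence.lower().split()
    return math.exp(-sum(math.log(prob(w)) for w in words) / len(words))

# The scan: score each sentence, flag the ones the 'model' finds unsurprising.
corpus = "the cat sat on the mat the dog sat on the rug".split()
prob = train_unigram(corpus)
doc = ["the cat sat on the mat",
       "quantum flamingos juggle dissonant metaphors"]
for sent in doc:
    ppl = sentence_perplexity(sent, prob)
    verdict = "likely AI" if ppl < 10 else "likely human"  # threshold is invented
    print(f"{ppl:6.1f}  {verdict}  {sent}")
```

The first sentence follows the corpus's well-trodden path and scores low; the second is full of words the model has never seen and scores high.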

Method 2: The Classifier Approach (How Turnitin Works)

Turnitin came to the party a bit later, in April 2023, but they came with a different weapon. As a company that has been processing student essays for decades, they have one of the largest datasets of academic writing on the planet. They didn't just build a perplexity scanner; they built a fine-tuned classifier.

Here’s the difference. Instead of just measuring a general 'surprise' score, a classifier is trained like a guard dog. You show it millions of examples of what you want it to find, and millions of examples of what you want it to ignore. In Turnitin's case, they fed their model a massive dataset containing both verified human-written student papers and a ton of AI-generated text.

The model learned to recognize the subtle statistical patterns that distinguish one from the other. It's not just looking at perplexity or burstiness in isolation. It's looking at hundreds of different features at once: the frequency of certain connector words ('Furthermore', 'Moreover'), the distribution of punctuation, the average sentence length, the complexity of the vocabulary, and many other signals that we humans wouldn't even notice.
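As a rough illustration, here's a tiny slice of what such a feature extractor might compute before a classifier ever sees the text. The connector list, feature names, and sample text are mine, not Turnitin's; a real system extracts hundreds of signals.

```python
import statistics
import string

CONNECTORS = {"furthermore", "moreover", "consequently", "therefore", "additionally"}

def features(text):
    """A few hypothetical features a trained classifier might consume."""
    words = text.lower().translate(
        str.maketrans('', '', string.punctuation)).split()
    sentences = [s for s in text.split('.') if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return {
        "connector_rate": sum(w in CONNECTORS for w in words) / len(words),
        "avg_sentence_len": statistics.mean(lengths),
        "sentence_len_stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        "vocab_richness": len(set(words)) / len(words),  # type-token ratio
    }

sample = ("The results were significant. Furthermore, the data supported the "
          "hypothesis. Moreover, the trend continued.")
print(features(sample))
```

The classifier itself is just a model trained to map vectors like this one to a human-vs-AI probability.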

Turnitin's detector breaks the text down into individual sentences and paragraphs and assigns a probability score to each segment. This allows it to flag a document as a mix of human and AI text. They initially claimed a 98% overall accuracy rate, which sounds amazing. But the devil is in the details. They also admitted to a false positive rate of around 1%. That sounds small, but if you're processing 100 million papers, that's a million students getting wrongly accused. It's a different, more powerful approach, but it carries its own set of risks, as we'll see.

Method 3: The Watermark Detector (OpenAI's Abandoned Idea)

For a hot minute, there was another idea floating around, championed by OpenAI themselves: cryptographic watermarking. The idea was clever. As the AI generates text, it would subtly embed a secret, invisible pattern into the word choices. For example, it might be programmed to use a word from a specific 'green list' of words at certain intervals. You wouldn't notice it while reading, but a special detector could scan the text, find the pattern, and say with near-certainty, 'This came from our model.'
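A minimal sketch of the green-list idea, assuming a secret key shared between generator and detector. The key, the hash-based split, and the whole setup here are invented for illustration; OpenAI never published their prototype's internals.

```python
import hashlib

SECRET_KEY = "demo-key"  # hypothetical shared secret

def is_green(word):
    """Deterministically assign each word to the 'green' or 'red' list
    with a keyed hash; roughly half the vocabulary ends up green."""
    digest = hashlib.sha256((SECRET_KEY + word.lower()).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text):
    words = text.split()
    return sum(is_green(w) for w in words) / len(words)

# Unwatermarked text should hover near 50% green words. A watermarking
# generator would nudge its word choices to push that fraction far higher,
# and the detector just checks whether the observed fraction is
# statistically improbable by chance.
print(green_fraction("an ordinary sentence with no watermark at all"))
```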

It sounds foolproof, right? A secret signature. But OpenAI built a prototype and then, quite publicly, killed the project in mid-2023. Why? Because it's actually quite easy to break. All a user has to do is mount a simple paraphrasing attack. Change a few words here and there, swap a sentence around, or run it through another tool like QuillBot, and the watermark is destroyed. The secret pattern is gone.

Plus, it only works for the model that created it. An OpenAI watermark detector couldn't spot text from Google's Gemini or Anthropic's Claude. It created a closed system in an open-source world. So, while the idea was elegant from a technical standpoint, it was just too fragile for real-world use. It remains a fascinating footnote in the history of AI detection.

Method 4: The Hybrid Approach (Originality.ai and Copyleaks)

So if perplexity alone is good but not great, and classifiers are powerful but have false positives, what's next? The hybrid approach. This is what most of the modern, specialized tools use. Think of Originality.ai, founded by SEO pro Jon Gillham in 2022, or the tool from Copyleaks.

These platforms don't rely on a single signal. They are a kitchen sink of detection methods. They combine perplexity scoring, burstiness analysis, and a fine-tuned classifier all into one process. They're not just asking 'Is this text surprising?' They're asking a whole list of questions:

* How predictable is the text (perplexity)?
* How varied is the sentence structure (burstiness)?
* Does it use the vocabulary and grammar patterns our AI classifier has been trained on?
* Does it contain known factual errors common to specific models?
* Does it have the slightly-too-perfect tone of an AI?

By combining the scores from multiple different tests, they create a more nuanced and, theoretically, more accurate final probability. It's a defense-in-depth strategy. One signal might be wrong, but it's less likely that four or five different signals are all wrong in the same direction.
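A sketch of how such signal fusion might look. The signal names, scores, and weights below are all made up; real vendors tune these on large benchmark sets and keep the details proprietary.

```python
def hybrid_score(signals, weights=None):
    """Combine several independent detector signals (each a probability
    that the text is AI-generated) into one weighted estimate."""
    weights = weights or {name: 1.0 for name in signals}
    total = sum(weights[name] for name in signals)
    return sum(signals[name] * weights[name] for name in signals) / total

signals = {
    "perplexity": 0.82,   # text is quite predictable
    "burstiness": 0.74,   # sentence lengths are very uniform
    "classifier": 0.91,   # trained model leans strongly AI
    "tone": 0.60,         # slightly-too-perfect phrasing
}
print(hybrid_score(signals))  # one signal can be off; the blend is steadier
```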

Companies like Copyleaks take this even further, training their models on multilingual data, allowing them to detect AI content that has been translated from one language to another to try and evade detection. The hybrid method is the current state of the art. It's a constant process of adding new signals to the algorithm as AI models evolve and new weaknesses are discovered. It's an arms race, and these guys are building a bigger arsenal.

Why AI Detectors Get It Wrong (The Accuracy Problem)

Okay, so we know how they're supposed to work. But let's be honest, the real question is, do they? The answer is a very messy 'sometimes'. The accuracy claims made by these companies are often based on tests in a perfect lab environment, using clean, unedited AI text versus a specific type of human writing.

The real world is much messier. The biggest problem is the false positive. This is when a detector confidently labels a human's writing as 'AI-generated'. And it happens more than you think. Why?

Because the detectors are looking for statistical patterns, and some humans just happen to write in a way that fits the AI profile. Writers who are not native English speakers (ESL) are often flagged. Their writing can sometimes feature simpler sentence structures and more standard vocabulary, not because they are a robot, but because that's how they learned the language. This puts them at a huge disadvantage.

Highly structured, formal, or technical writing can also trigger a false positive. A scientific paper or a legal document is supposed to have low perplexity. You want it to be clear, logical, and predictable. That's good writing in that context, but an AI detector sees it as robotic.

The consequences are real. In 2023, a professor at Texas A&M-Commerce failed an entire graduating class after pasting their final essays into ChatGPT and claiming it 'confessed' to writing them (which is not how ChatGPT works, by the way). At UC Davis, students were put under investigation based on Turnitin's AI scores, causing immense stress and fear. These tools are being used as judge, jury, and executioner, often without a full understanding of their limitations.

Here’s a look at how marketing claims often stack up against reality:

| Detector | Claimed Accuracy | Independent Test Results (Approx.) |
| --- | --- | --- |
| Turnitin | 98% | 85-90% on clean text, <60% on edited text |
| GPTZero | 99% (human) | ~85% on clean text, highly variable |
| Originality.ai | 99%+ | ~94% on GPT-4, but higher false positives |
| Copyleaks | 99.1% | Good performance, but struggles with humanized text |

*Note: Independent test results vary widely based on the specific AI model, text type, and humanization methods used. These are general estimates.*

Look, the accuracy problem is fundamental. The detectors are making an educated guess based on statistics. And sometimes, that guess is just plain wrong.

What Makes AI Text Look Like AI Text (The 7 Tells)

If you spend enough time reading AI-generated content, you start to get a feel for it. It has a certain… flavor. The detectors are just codifying that gut feeling into math. Here are the seven main tells they're looking for, the patterns that scream 'robot'.

  1. **Uniform Sentence Length:** This is burstiness in action. AI text often has a metronomic quality. Sentences are consistently medium-length. There are few very short, punchy sentences and few long, meandering ones. It's just too even.
  2. **'Safe' and Repetitive Vocabulary:** Language models tend to pick high-probability words. This means they often avoid interesting, specific, or obscure vocabulary in favor of more common synonyms. You'll see words like 'delve', 'explore', 'showcases', and 'testament' used over and over.
  3. **No Hedging or Personality:** Humans are messy. We use phrases like 'I think', 'it seems that', 'perhaps', or 'in my opinion'. We inject our uncertainty and personality into our writing. AI models, trained to be authoritative, rarely do this unless specifically prompted. They state things as facts, giving the text a very confident but sterile tone.
  4. **Excessively Logical Structure:** An AI will often create perfectly structured text. The introduction states what will be discussed. The body paragraphs each discuss one point. The conclusion summarizes what was discussed. It's the five-paragraph essay structure drilled into our heads in high school, executed with machine-like perfection. Human writing often wanders a bit more.
  5. **Connector Word Addiction:** AI loves transition words. 'Furthermore', 'Moreover', 'In addition', 'Consequently', 'Therefore'. It uses them to link every single idea, making the logical flow explicit to a fault. Humans connect ideas more subtly, often letting the context do the work.
  6. **Missing Discourse Markers:** This is a subtle one. Discourse markers are the little conversational fillers we use to guide a reader. Words and phrases like 'Look,', 'Anyway,', 'Well,', 'You know,'. They add a human, conversational layer to text that AI models struggle to replicate naturally.
  7. **Zero Typos. Ever.:** Humans make mistakes. We mistype words. We have grammatical brain farts. AI-generated text is almost always perfectly spelled and grammatically flawless. It's a small thing, but a complete lack of any human error can itself be a statistical anomaly.
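As a rough illustration of how a detector codifies that gut feeling, here's a toy scorer for two of those tells: uniform sentence length and missing hedges. The hedge list, sample texts, and scoring formula are all invented.

```python
import statistics

HEDGES = {"perhaps", "maybe", "seems", "think", "arguably", "honestly"}

def tell_score(text):
    """Score two tells: uniform sentence length and missing hedges.
    0.0 looks human, 1.0 looks AI (toy scale, invented for illustration)."""
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    words = set(text.lower().split())
    uniformity = 1.0
    if len(lengths) > 1 and statistics.mean(lengths) > 0:
        cv = statistics.stdev(lengths) / statistics.mean(lengths)
        uniformity = max(0.0, 1.0 - cv)  # low variation -> high AI score
    no_hedging = 0.0 if words & HEDGES else 1.0
    return (uniformity + no_hedging) / 2

robotic = ("The system processes data efficiently. The model analyzes input "
           "accurately. The output serves users reliably.")
human = ("Honestly? I think it broke. The whole pipeline, which had run "
         "flawlessly for months, just collapsed.")
print(tell_score(robotic) > tell_score(human))  # True
```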

How Humanizers Exploit These Weaknesses

So, if detectors are looking for those seven tells, it was only a matter of time before tools emerged to erase them. Enter AI 'humanizers'. These aren't just simple paraphrasers or spinners. The more advanced ones are specifically designed to reverse-engineer the detection process.

They work like a multi-pass pipeline, attacking each AI tell in sequence. First, they tackle the big ones: perplexity and burstiness. A humanizer will go through the text and perform 'perplexity injection'. It will intentionally swap out common, predictable words for less common, more surprising synonyms. It won't just change 'important' to 'significant'; it might change it to 'pivotal' or 'instrumental', raising the text's surprise factor for a detector.

Next, it attacks burstiness. The tool will break apart long, uniform sentences and combine short ones. It will rewrite sentences to create more structural variety. A string of five medium-length sentences might become one long one, two short ones, and one medium one. The goal is to break the metronomic rhythm and introduce a more chaotic, human-like flow.
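The two passes just described might be sketched like this. The synonym table, the `rate` parameter, and the pair-merging heuristic are invented stand-ins; real humanizers use language models for the rewriting, not lookup tables.

```python
import random

# Hypothetical swap table: common words -> rarer synonyms (perplexity injection).
RARE_SYNONYMS = {"important": "pivotal", "use": "employ", "big": "substantial",
                 "shows": "underscores", "good": "commendable"}

def inject_perplexity(text, rate=0.7, rng=None):
    """Pass 1: swap predictable words for rarer synonyms at a given rate."""
    rng = rng or random.Random(0)
    out = []
    for word in text.split():
        bare = word.lower().strip('.,')
        if bare in RARE_SYNONYMS and rng.random() < rate:
            out.append(word.replace(bare, RARE_SYNONYMS[bare]))
        else:
            out.append(word)
    return ' '.join(out)

def vary_burstiness(text):
    """Pass 2: merge every second sentence pair to break the metronome."""
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    out, i = [], 0
    while i < len(sentences):
        if i + 1 < len(sentences) and i % 2 == 0:
            out.append(sentences[i] + ", and " + sentences[i + 1].lower())
            i += 2
        else:
            out.append(sentences[i])
            i += 1
    return '. '.join(out) + '.'

draft = "The results are important. The model shows good accuracy. We use it daily."
print(vary_burstiness(inject_perplexity(draft, rate=1.0)))
```

After both passes, three uniform sentences become one long sentence and one short one, with rarer vocabulary throughout: higher perplexity, higher burstiness.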

After that, it's about the details. The humanizer will sprinkle in those discourse markers ('You see,', 'Of course,') and hedging language ('it seems likely that'). It might rephrase overly logical transitions. It will analyze the vocabulary and inject more varied and less repetitive word choices.

Good humanizers are also designed for semantic preservation. This is key. They try to make all these stylistic changes without altering the core meaning of the text. It's a difficult balancing act. The end product is a piece of text that has had its statistical AI fingerprints wiped clean. It's been deliberately engineered to have higher perplexity and greater burstiness, making it much harder for a detector to flag.

The Arms Race: Detectors vs Humanizers in 2026

This brings us to the present and the near future. We're in a classic technological arms race. For every improvement in AI detection, there's a corresponding improvement in AI humanization.

When GPT-3 text was easy to spot, detectors had a field day. Then GPT-4 came along, producing text with naturally higher perplexity and burstiness, making the detectors' job harder. In response, detectors got better, training their classifiers on GPT-4 output.

Then humanizers became popular, specifically designed to fool the new detectors. Now, AI detection companies are actively training their models on humanized text, trying to find the new statistical fingerprints left behind by the humanization process itself. It's a cat-and-mouse game that will only accelerate.

So, what does this look like in 2026? I think we'll see detectors move beyond pure text analysis. They might start looking at behavioral biometrics: the timing of keystrokes and mouse movements in a document editor. They might integrate with tools to verify a document's version history. The analysis will get deeper and more invasive.

On the other side, humanizers will become more sophisticated, better at preserving meaning and introducing more subtle human-like flaws. Honestly, no detector will ever be 100% accurate, and no humanizer will ever be 100% foolproof. The war is statistical, and there will always be a margin of error on both sides. Anyone who tells you otherwise is selling something.

Bottom Line

So, how do AI detectors work? They're statistical analysts. They scan text for two key signals: perplexity (predictability) and burstiness (variety). They use these signals, often combined in complex classifier models, to make an educated guess about the text's origin. They are not arbiters of truth; they are probability calculators.

Their biggest weakness is that some human writing looks robotic, and some AI writing (especially after 'humanization') can look human. A high AI score isn't a guilty verdict. It's a data point. A single, often flawed, data point. And I think we all need to remember that before we let an algorithm make a decision that affects a real person's life or career. Use them as a tool, a guide, but never as a replacement for human judgment.

Frequently asked questions

  • What is the main principle behind AI detection?

    The main principle is statistical analysis. AI detectors measure properties like 'perplexity' (how predictable the text is) and 'burstiness' (the variation in sentence length). AI-generated text tends to have low perplexity and low burstiness, making it statistically distinct from typical human writing.

  • How does GPTZero work?

    GPTZero, created by Edward Tian, primarily works by measuring perplexity. It uses a large language model to analyze text and determine how 'surprised' the model is by the word choices. Low surprise (low perplexity) suggests the text was written by an AI, which tends to choose the most predictable words.

  • Is Turnitin's AI detector accurate?

    Turnitin claims a 98% accuracy rate, but this is under ideal conditions. Independent tests and real-world cases show it can be lower, especially with edited or 'humanized' AI text. It also has a known false positive rate, meaning it sometimes incorrectly flags human writing as AI-generated.

  • What is a false positive in AI detection?

    A false positive occurs when an AI detector wrongly identifies text written by a human as being generated by AI. This can happen with writers who are not native English speakers, or with very formal, technical, or structured writing that mimics the statistical properties of AI text.

  • Why did OpenAI abandon its AI watermarking tool?

    OpenAI found that their cryptographic watermarking system was too fragile for real-world use. The invisible signature embedded in the text could be easily destroyed by simple paraphrasing, changing a few words, or running the text through another writing tool. It wasn't a durable solution.

  • Can AI detectors be fooled?

    Yes. Tools known as 'AI humanizers' are specifically designed to fool detectors. They alter the text to increase its perplexity and burstiness, add human-like phrasing, and remove the statistical giveaways that detectors look for, making the text appear human-written.

  • What is the difference between perplexity and burstiness?

    Perplexity measures the predictability of word choice. Low perplexity means the text is very predictable, like an AI would write. Burstiness measures the variation in sentence length and structure. Low burstiness means sentences are all of a similar length, which is another common trait of AI writing.

  • Will AI detectors become 100% accurate by 2026?

    It's highly unlikely. The relationship between AI generation and AI detection is an 'arms race'. As AI models get better at writing like humans, and humanizers get better at evading detection, detectors must constantly adapt. Because detection is based on statistics, there will probably always be a margin of error and the potential for false positives.

  • Do AI detectors check for plagiarism?

    Not necessarily. AI detection and plagiarism detection are two different processes. A tool like Turnitin or Originality.ai might offer both services, but they are separate functions. AI detection checks for the statistical patterns of machine writing, while plagiarism detection checks for copied content from existing sources.

  • Can using an AI assistant for outlining get my work flagged?

    Using an AI for brainstorming, outlining, or research is generally not detectable, as you are still writing the final text yourself. Detectors analyze the final prose. As long as your own writing style, with its natural perplexity and burstiness, is what makes it onto the page, it is unlikely to be flagged as AI-generated.