The two core metrics: perplexity and burstiness
Perplexity
Perplexity measures how surprising or unpredictable the next word in a sentence is. Human writing tends to have higher perplexity because we make creative, unexpected word choices.
AI models optimize for the most statistically likely next token, producing text with low perplexity — predictable, smooth, and almost too polished. When every sentence reads like it was chosen by a probability engine, that is a strong AI signal.
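DetectAI's internal language model isn't public, but the core idea can be sketched with a toy word-bigram model: train on reference text, then measure how "surprised" the model is by a new passage. The `bigram_perplexity` helper below is purely illustrative — real detectors use neural language models, but the interpretation is the same: lower perplexity means more predictable text.

```python
import math
from collections import Counter

def bigram_perplexity(train_text: str, test_text: str) -> float:
    """Toy word-bigram perplexity with add-one smoothing.
    Lower values = more predictable (more AI-like) text."""
    train = train_text.lower().split()
    test = test_text.lower().split()
    bigrams = Counter(zip(train, train[1:]))
    unigrams = Counter(train)
    vocab = len(set(train)) or 1
    log_prob = 0.0
    for prev, word in zip(test, test[1:]):
        # Add-one (Laplace) smoothing so unseen bigrams get nonzero probability
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
        log_prob += math.log(p)
    n = max(len(test) - 1, 1)
    return math.exp(-log_prob / n)
```

A passage the model has effectively "seen before" scores low; the same words in a scrambled, unexpected order score high.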
Burstiness
Burstiness captures how much sentence length and complexity vary throughout a text. Humans write in bursts — short punchy fragments mixed with complex, winding sentences.
AI-generated text has low burstiness. Sentences tend to be uniformly medium-length, with similar structure and rhythm. This monotony is one of the easiest tells for a detection algorithm.
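One simple way to quantify burstiness — a sketch, not DetectAI's exact formula — is the coefficient of variation of sentence lengths: the standard deviation divided by the mean. Uniform, AI-like text scores near zero; human text with mixed fragments and long sentences scores much higher.

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).
    Near 0 = uniform, AI-like; higher = bursty, human-like."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)
```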
The 6 pattern categories DetectAI analyzes
Beyond perplexity and burstiness, DetectAI scores text across six distinct dimensions. Each contributes to the final AI probability score.
Sentence Structure
AI models produce sentences with remarkably uniform length and syntax. DetectAI measures the standard deviation of sentence lengths — the lower the deviation, the stronger the AI signal.
Vocabulary Patterns
LLMs over-rely on certain words: "delve," "crucial," "landscape," "leverage." DetectAI maintains a curated list of AI-favorite terms and scores their frequency against human baselines.
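DetectAI's curated list isn't published, but the scoring idea is straightforward: count how often flagged words appear relative to total word count. The `AI_FAVORITES` set below is a small hypothetical stand-in.

```python
# Hypothetical word list for illustration — not DetectAI's actual curated list.
AI_FAVORITES = {"delve", "crucial", "landscape", "leverage", "nuanced", "tapestry"}

def ai_word_rate(text: str) -> float:
    """Fraction of words drawn from the AI-favorite list.
    A real scorer would compare this rate against human baselines."""
    words = [w.strip(".,;:!?()").lower() for w in text.split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in AI_FAVORITES)
    return hits / len(words)
```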
Transition Words
AI text is addicted to connectors: "Furthermore," "Moreover," "In conclusion." Human writers vary their transitions or skip them entirely. High transition density flags AI.
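Transition density can be sketched as the share of sentences that open with a stock connector. The `TRANSITIONS` set here is illustrative, not DetectAI's actual lexicon.

```python
import re

# Illustrative connector list — not DetectAI's actual lexicon.
TRANSITIONS = {"furthermore", "moreover", "additionally", "however",
               "in conclusion", "therefore", "consequently"}

def transition_density(text: str) -> float:
    """Share of sentences opening with a stock transition word or phrase."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences
               if any(s.lower().startswith(t) for t in TRANSITIONS))
    return hits / len(sentences)
```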
Paragraph Flow
Human paragraphs vary wildly in length and purpose. AI paragraphs tend to follow a rigid pattern: topic sentence, supporting detail, concluding thought. DetectAI detects this structural monotony.
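One plausible way to measure this — again a sketch, since DetectAI's exact method isn't public — is the spread of paragraph lengths: rigid AI text produces paragraphs of similar size, so a low standard deviation suggests structural monotony.

```python
import statistics

def paragraph_variation(text: str) -> float:
    """Standard deviation of paragraph lengths in words.
    Paragraphs are split on blank lines; low values suggest
    the rigid topic-support-conclusion pattern."""
    paras = [p for p in text.split("\n\n") if p.strip()]
    lengths = [len(p.split()) for p in paras]
    return statistics.stdev(lengths) if len(lengths) > 1 else 0.0
```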
Punctuation Diversity
Humans use dashes, semicolons, parentheses, and exclamation marks irregularly. AI sticks to periods and commas with mechanical consistency. Low punctuation variety raises the score.
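A minimal version of this check counts how many distinct punctuation marks appear, normalized by a reference set. A real scorer would likely weight rarer marks (semicolons, parentheses) more heavily; this is just the idea.

```python
def punctuation_diversity(text: str) -> float:
    """Fraction of a reference punctuation set that appears in the text.
    Low values = mechanical periods-and-commas-only writing."""
    marks = set(";:()!?-,.'\"")
    used = {ch for ch in text if ch in marks}
    return len(used) / len(marks)
```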
Lexical Diversity
The type-token ratio — unique words divided by total words — reveals how varied the vocabulary is. AI text often recycles the same phrasing patterns, producing a lower diversity ratio.
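The type-token ratio is easy to compute directly. One caveat worth noting: TTR naturally falls as texts get longer, so a production scorer would compare against length-matched baselines rather than use the raw ratio.

```python
def type_token_ratio(text: str) -> float:
    """Unique words divided by total words.
    Lower values indicate more recycled vocabulary."""
    words = [w.strip(".,;:!?()").lower() for w in text.split()]
    return len(set(words)) / len(words) if words else 0.0
```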
What makes AI text different from human text
| Characteristic | Human Writing | AI Writing |
|---|---|---|
| Sentence length | Highly variable — 3 to 40+ words | Uniformly 15-25 words |
| Word choice | Idiosyncratic, personal vocabulary | "Delve," "crucial," "landscape," "nuanced" |
| Paragraph structure | Varies by mood and intent | Rigid topic-support-conclusion pattern |
| Transitions | Sparse, implicit, or absent | "Furthermore," "Moreover," "Additionally" |
| Errors & quirks | Occasional typos, informal tone shifts | Grammatically perfect, consistently formal |
| Emotional range | Genuine frustration, humor, sarcasm | Measured, balanced, diplomatically neutral |
Why some AI text is harder to detect
Custom prompts and persona instructions
When users instruct AI to "write casually" or "vary sentence length," the output mimics human burstiness more closely. The more specific the prompt, the harder it is to distinguish from human writing.
Human editing after generation
Lightly editing AI output — swapping words, breaking up sentences, adding personal anecdotes — disrupts the statistical patterns detectors rely on. Even small edits can significantly lower the AI score.
Newer, more capable models
Each generation of LLMs produces more natural text. GPT-4, Claude 3.5, and Gemini Ultra generate text with higher perplexity and better burstiness than earlier models, making detection more challenging.
Short text samples
Detection accuracy improves with longer text. Below 50 words, there is not enough statistical signal to reliably identify patterns. For best results, analyze at least 100-200 words.
See it in action — try DetectAI now
Paste any text and watch the 6-category breakdown in real time. 100% free, no signup, instant results.