🎲 Detect whether a GitHub repo's code was likely written by an LLM. Zero dependencies. Scores repos 0-100 using commit velocity, session analysis, burst detection, message patterns, and project-scale plausibility.

🎲 Likelihoodlum


Detect whether a GitHub repository's code was likely written by an LLM.

Likelihoodlum analyzes a repository's commit history and uses timing-based heuristics to estimate the likelihood that the code was generated by a large language model rather than written by a human.

The core idea is simple: humans type slowly, LLMs don't. If someone is pushing hundreds of lines of polished code every few minutes — or shipping an entire app in a week — something's up.


πŸ† The Wall of Truth

Real results from real repos. Every score below was generated by Likelihoodlum with default settings (--max-commits 200).

🤖 Caught Red-Handed

| Repository | ⭐ Stars | Score | Verdict | Daily Output | Velocity | Authors |
|---|---|---|---|---|---|---|
| anthropics/claudes-c-compiler | 2,399 | 81/100 | 🤖 Very likely LLM-generated | 60,638 lines in 1 day | 7.6 l/min median | claude (198 commits) |

Anthropic literally named the author claude. 198 of 200 commits. 60K lines in a single day. The tool didn't even break a sweat on this one.

🤔 Suspicious — You Decide

| Repository | ⭐ Stars | Score | Verdict | Key Signals |
|---|---|---|---|---|
| openai/codex | — | 35/100 | 🤔 Possibly LLM-assisted | 9,562 lines/active day, 17% LLM message patterns |
| jlowin/fastmcp | — | 34/100 | 🤔 Possibly LLM-assisted | 3,046 lines/active day, extreme session productivity |
| rust-lang/rust | — | 32/100 | 🤔 Possibly LLM-assisted | 7,032 lines/active day (merge commits inflate this) |
| Significant-Gravitas/AutoGPT | — | 29/100 | 👀 Likely human-written | 74% LLM message patterns, but 14 authors + low velocity |
| microsoft/vscode | — | 28/100 | 👀 Likely human-written | Large merge commits inflate daily output |
| twitter/the-algorithm | — | 26/100 | 👀 Likely human-written | 13,379 lines/active day, bulk repo dump |

👀 Certified Human

| Repository | ⭐ Stars | Score | Verdict | Median Velocity | Authors | LLM Messages |
|---|---|---|---|---|---|---|
| vuejs/core | — | 20/100 | 👀 Likely human-written | 0.0 l/min | 52 | 66%* |
| pallets/flask | — | 17/100 | 👀 Likely human-written | — | 23 | 0% |
| meshtastic/meshtastic-apple | — | 16/100 | 👀 Likely human-written | 0.3 l/min | 8 | 4% |
| denoland/deno | — | 15/100 | 👀 Likely human-written | 0.1 l/min | 41 | 56%* |
| langchain-ai/langchain | — | 15/100 | 👀 Likely human-written | 0.1 l/min | 44 | 57%* |
| vercel/next.js | — | 14/100 | 👀 Almost certainly human-written | 0.2 l/min | 29 | — |
| facebook/react | — | 10/100 | 👀 Almost certainly human-written | 0.1 l/min | 29 | — |
| pydantic/pydantic | — | 10/100 | 👀 Almost certainly human-written | 0.1 l/min | 47 | 18% |
| stackblitz/bolt.new | — | 2/100 | 👀 Almost certainly human-written | 0.1 l/min | 17 | 4% |
| godotengine/godot | — | 2/100 | 👀 Almost certainly human-written | 0.1 l/min | 51 | — |
| golang/go | — | 0/100 | 👀 Almost certainly human-written | 0.0 l/min | 85 | 0% |
| django/django | — | 0/100 | 👀 Almost certainly human-written | 0.0 l/min | 73 | 0% |
| bitcoin/bitcoin | — | 0/100 | 👀 Almost certainly human-written | 0.1 l/min | 23 | 0% |
| sveltejs/svelte | — | 0/100 | 👀 Almost certainly human-written | 0.1 l/min | 36 | 1% |
| tinygrad/tinygrad | — | 0/100 | 👀 Almost certainly human-written | 0.1 l/min | 15 | — |
| nixos/nixpkgs | — | 0/100 | 👀 Almost certainly human-written | — | 41 | 0% |
| torvalds/linux | — | 41/100† | 🤔 Possibly LLM-assisted | 1.6 l/min | 35 | 0% |

* High message pattern % comes from conventional commit style (feat():, fix():), not actual LLM usage.

† The Linux kernel scores higher than expected because Torvalds merges massive subsystem PRs — each merge looks like thousands of lines appearing instantly. The multi-author discount (−10) keeps it in check.

🔬 The Anthropic Spotlight

Because who better to test an LLM detector on than the company making the LLMs?

| Repository | Score | Verdict | Notes |
|---|---|---|---|
| anthropics/claude-code | 0/100 | 👀 Almost certainly human-written | 20 authors, human velocity, CV=7.62. Real team, real software. |
| anthropics/anthropic-sdk-python | 0/100 | 👀 Almost certainly human-written | 159/200 commits were bots (filtered out). The 41 human commits? Glacial pace. |
| anthropics/claude-code-action | 0/100 | 👀 Almost certainly human-written | 35 authors, every negative signal fired. |
| anthropics/skills | 3/100 | 👀 Almost certainly human-written | 10K lines/active day triggered +20, but −25 in negatives crushed it. |
| modelcontextprotocol/servers | 0/100 | 👀 Almost certainly human-written | 39 authors. Community-driven. |
| anthropics/claudes-c-compiler | 81/100 | 🤖 Very likely LLM-generated | The one that proves the tool works. Author is literally claude. |

The irony: the company building the most capable coding LLM in the world writes their own code by hand. Except when they let Claude write a C compiler for fun — and the tool caught it instantly.


How It Works

Likelihoodlum fetches commit history and repository metadata via the GitHub API, then scores the repo on a 0–100 scale across twelve heuristic signals. Generated and vendored files (lockfiles, protobufs, Xcode project files, build artifacts, etc.) are automatically filtered out so they don't inflate velocity measurements. Bot accounts (e.g. dependabot[bot]) are excluded from author counts and velocity calculations.

Commit details are fetched concurrently (up to 10 parallel requests) for significantly faster analysis, and bot commits are skipped entirely to save API calls.
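A minimal sketch of that concurrent fetch step, assuming a `fetch_one` callable standing in for the GitHub commit-detail request and a simplified bot check (neither is the tool's real interface):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_details(commits, fetch_one, max_workers=10):
    """Fetch per-commit stats in parallel, skipping bot-authored commits.

    `commits` is a list of {"sha", "author"} dicts; `fetch_one` is any
    callable taking a sha (e.g. an HTTP request to the GitHub API).
    """
    humans = [c for c in commits if not c["author"].endswith("[bot]")]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with `humans`
        return list(pool.map(lambda c: fetch_one(c["sha"]), humans))
```

Skipping bot commits before fanning out is what saves API calls: they never hit the network at all.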

Scoring Signals

| # | Signal | Points | What It Measures |
|---|---|---|---|
| 1 | Code Velocity | −10 to +35 | Lines changed per minute between consecutive commits by the same author. When the trimmed mean is significantly higher than the median (heavy tail of fast intervals), the score is boosted further. |
| 2 | Session Productivity | −5 to +20 | Groups commits into coding sessions (>2hr gap = new session) and measures aggregate lines/min. Human-pace sessions actively reduce the score. |
| 3 | Commit Size Uniformity | −5 to +15 | LLM dumps tend to be uniformly large. Human commits vary in size (small fixes, big features, etc.). High variation is rewarded with negative points. |
| 4 | Commit Message Patterns | 0 to +15 | Catches generic messages like "Implement X", "Add Y functionality", "Fix issue with Z", conventional commits with verbose or multiple scopes (feat(a, b):), and other LLM-typical phrasings. If messages look clean but velocity is very high, a small cross-signal bonus is applied. |
| 5 | Burst Detection | 0 to +15 | Flags sessions where >300 authored lines appeared in under 30 minutes (rapid bursts), plus longer sessions with sustained extreme throughput (≥10 lines/min). |
| 6 | Multi-Author Discount | −10 to +5 | Real projects tend to have multiple contributors (score penalty). Solo-author repos get a small bump. Bot accounts are excluded from the count. |
| 7 | Extreme Per-Commit Velocity | 0 to +10 | Counts commit intervals exceeding 50 lines/min (~3,000 lines/hr). Even a small percentage of these is a strong signal. |
| 8 | Commit Time-of-Day | 0 to +5 | Flags repos where >30% of commits happen between midnight and 6am and velocity is suspicious. Humans have circadian rhythms; LLMs don't sleep. |
| 9 | Comment Density | −3 to +5 | LLMs over-explain: they add verbose comments, docstrings, and inline explanations at a much higher rate than most humans. Very low comment density (typical human laziness) earns a negative signal. |
| 10 | Diff Entropy | −3 to +5 | Measures Shannon entropy of diff content. LLM-generated diffs tend to be more repetitive/formulaic (lower entropy). Human diffs are messier and more varied (higher entropy). |
| 11 | Project-Scale Plausibility | −5 to +20 | The big-picture sanity check. Compares total authored output against the repo's true creation date (fetched from GitHub metadata) and active coding days. A senior engineer produces ~200–500 lines of production code per day; 10,000+ lines/day sustained over weeks is implausible without LLM assistance. |
| 12 | Generated File Ratio | Informational | Reports what percentage of line changes are in generated/vendor files (excluded from all calculations above). |

Note: The score uses both positive signals (suspicious patterns push the score up) and negative signals (clearly human patterns actively pull it down). The final score is clamped to 0–100. Patch content from commit diffs is analyzed for comment density and entropy calculations.
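The session grouping behind signal 2 can be sketched in a few lines. This is a simplified illustration (the function name and the epoch-seconds input are assumptions, not the tool's real API):

```python
def split_sessions(timestamps, gap_minutes=120):
    """Group commit timestamps (epoch seconds) into coding sessions.

    Any gap longer than `gap_minutes` opens a new session, mirroring the
    ">2hr gap = new session" rule described for signal 2 above.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap_minutes * 60:
            sessions[-1].append(t)  # close enough: continue current session
        else:
            sessions.append([t])    # gap exceeded: start a new session
    return sessions
```

Per-session productivity then falls out naturally: sum the authored lines in each session and divide by the session's duration in minutes.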

Velocity Thresholds

| Threshold | Lines/Min | Lines/Hr | Interpretation |
|---|---|---|---|
| Clearly Human | < 0.5 | < 30 | Normal productive human |
| Human Upper | < 1.5 | < 90 | Fast human, maybe some copy-paste |
| Suspicious | ≥ 4.0 | ≥ 240 | Quite fast, could be assisted |
| Very Suspicious | ≥ 10.0 | ≥ 600 | Almost certainly not hand-typed |

Daily Output Thresholds

| Lines/Active Day | Interpretation | Points |
|---|---|---|
| < 300 | Normal human pace | −5 (if span ≥ 14 days) |
| 300–799 | Fast but plausible | 0 |
| 800–1,999 | Above average | +5 |
| 2,000–4,999 | Very high, likely assisted | +12 |
| ≥ 5,000 | Implausible for a human | +20 |
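Read as code, the table above is a small step function. A sketch (the function name is illustrative; the real implementation may differ in details):

```python
def daily_output_points(lines_per_active_day, span_days):
    """Score points for daily output, following the thresholds above."""
    if lines_per_active_day >= 5000:
        return 20   # implausible for a human
    if lines_per_active_day >= 2000:
        return 12   # very high, likely assisted
    if lines_per_active_day >= 800:
        return 5    # above average
    if lines_per_active_day >= 300:
        return 0    # fast but plausible
    # normal human pace only counts against the score on longer projects
    return -5 if span_days >= 14 else 0
```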

Comment Density Thresholds

| Comment Ratio | Interpretation | Points |
|---|---|---|
| < 5% | Human laziness | −3 |
| 5–24% | Normal range | 0 |
| 25–34% | Above average | +3 |
| ≥ 35% | LLM over-commenting | +5 |

Diff Entropy Thresholds

| Entropy (bits/char) | Interpretation | Points |
|---|---|---|
| > 5.5 | Varied, chaotic (human) | −3 |
| 4.3–5.5 | Normal range | 0 |
| 4.0–4.3 | Below average | +3 |
| < 4.0 | Repetitive/formulaic (LLM) | +5 |
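Entropy here is plain Shannon entropy over the characters of the diff text. As a self-contained illustration:

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    # H = -sum(p * log2(p)) over the character distribution
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())
```

Repetitive, formulaic diffs concentrate the character distribution and land lower on this scale; messier human edits spread it out.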

Verdicts

| Score | Verdict |
|---|---|
| 75–100 | 🤖 Very likely LLM-generated |
| 50–74 | 🤖 Likely LLM-assisted |
| 30–49 | 🤔 Possibly LLM-assisted |
| 15–29 | 👀 Likely human-written |
| 0–14 | 👀 Almost certainly human-written |
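The bands translate directly into a lookup. A sketch (emoji prefixes omitted; the function name is illustrative):

```python
def verdict(score):
    """Map a likelihood score (clamped to 0-100) to its verdict band."""
    score = max(0, min(100, score))
    if score >= 75:
        return "Very likely LLM-generated"
    if score >= 50:
        return "Likely LLM-assisted"
    if score >= 30:
        return "Possibly LLM-assisted"
    if score >= 15:
        return "Likely human-written"
    return "Almost certainly human-written"
```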

Generated File Filtering

The following file types are automatically excluded from velocity, size, and daily output calculations:

  • Lock files: package-lock.json, yarn.lock, Podfile.lock, Cargo.lock, go.sum, etc.
  • Xcode / Apple: .pbxproj, .xcworkspacedata, .xcscheme
  • Protobuf / codegen: .pb.go, .pb.swift, _pb2.py, .g.dart, .freezed.dart, .generated.*
  • Build artifacts: .min.js, .min.css, .map, dist/, build/, vendor/, node_modules/
  • Data / assets: .json, .svg, .png, .jpg, .ico, fonts

Bot Author Filtering

Authors matching known bot patterns (e.g. dependabot[bot], renovate-bot) are automatically:

  • Excluded from velocity and session calculations
  • Excluded from author counts (so a solo dev + dependabot is correctly identified as a solo author)
  • Skipped during commit detail fetching (saves API calls)

Installation

Zero dependencies — runs on Python 3.10+ with only the standard library.

Option A: pip install (recommended)

pip install git+https://github.com/gotnull/likelihoodlum.git

Then run it from anywhere:

likelihoodlum owner/repo

Option B: Clone and run

git clone https://github.com/gotnull/likelihoodlum.git
cd likelihoodlum
python3 llm_detector.py owner/repo

Optional: .env file support

Install python-dotenv for .env file support (a built-in fallback parser is used if it isn't installed):

pip install python-dotenv

Setup

GitHub Token (Recommended)

Without a token you're limited to 60 API requests/hour. With one, you get 5,000/hr.

Option A: .env file (recommended)

cp .env.example .env
# Edit .env and add your token
GITHUB_TOKEN=ghp_your_token_here

The .env file is gitignored by default — your token stays safe.

Option B: Environment variable

export GITHUB_TOKEN="ghp_your_token_here"

Option C: CLI flag

python3 llm_detector.py owner/repo --token ghp_your_token_here

Generating a Token

  1. Go to GitHub → Settings → Developer settings → Personal access tokens
  2. Generate a new token (classic) with public_repo scope (or repo for private repos)
  3. Copy it — you won't see it again

Usage

# Basic — analyze a public repo
python3 llm_detector.py owner/repo

# Full GitHub URL works too
python3 llm_detector.py https://github.com/owner/repo

# Analyze more commits (default is 200)
python3 llm_detector.py owner/repo --max-commits 1000

# Target a specific branch
python3 llm_detector.py owner/repo --branch develop

# Machine-readable JSON output
python3 llm_detector.py owner/repo --json

# Go big
python3 llm_detector.py owner/repo --max-commits 5000 --json > report.json

CLI Reference

| Flag | Default | Description |
|---|---|---|
| repo (positional) | — | GitHub repo as owner/repo or full URL |
| --token | $GITHUB_TOKEN | GitHub personal access token |
| --branch | repo default | Branch to analyze |
| --max-commits | 200 | Maximum number of commits to fetch |
| --json | off | Output results as JSON |

Example Output

============================================================
  LLM Code Detector Report
  Repository: anthropics/claudes-c-compiler
============================================================

📊 Commits analyzed: 200
📅 Time span: 0 days
👥 Authors: 2
   • claude: 198 commits
   • carlini: 2 commits

πŸ“ Line changes breakdown:
   Total:     60,638
   Authored:  60,638 (used for analysis)
   Generated: 0 (filtered out)

📈 Project-scale output:
   Repo created:         2026-02-04
   Active days:          1 (of 1 calendar days)
   Lines/active day:     60,638
   Lines/calendar day:   60,638
    ⚠️  (>5,000 — implausible for a human)

⚡ Velocity (authored lines/min between commits):
   Median:        7.61  (≈ 457 lines/hr)
   Trimmed mean:  12.14  (≈ 728 lines/hr)
   Max:           630.41
   Intervals above suspicious threshold: 69/101

🔥 Fastest commit intervals:
   12d83516→dc196034  767 lines in 1.2 min = 630.41 l/min ⚠️
   f2ac8159→8fe4994a  163 lines in 1.0 min = 160.33 l/min ⚠️
   734d5fab→e836df40  1029 lines in 7.2 min = 143.25 l/min ⚠️
   ...

πŸ• Coding sessions (gap > 120 min): 2

💬 Commit messages matching LLM patterns: 4/200 (2.0%)
   • "Update x86 assembler README: fix stale line counts"
   • "Add missing ARM64 system registers to assembler"

────────────────────────────────────────────────────────────
  🎯 LLM Likelihood Score: [████████████████████████······] 81/100
  🤖 Very likely LLM-generated
────────────────────────────────────────────────────────────

πŸ“ Reasoning:
   β€’ Median velocity is suspiciously high (7.6 lines/min β‰ˆ 457 lines/hr)
   β€’ Trimmed mean (12.1 l/min) is 1.6Γ— the median β€” heavy tail of fast intervals
   β€’ 44% of intervals show very high velocity
   β€’ Median session productivity is extreme (78.5 lines/min)
   β€’ Commit sizes vary widely β€” typical of human work (CV=8.97) [-5]
   β€’ Commit messages look clean but velocity is high β€” possible curated LLM workflow
   β€’ 12 commit intervals (12%) show extreme velocity (>50 lines/min) [+4]
   β€’ Project-scale output is implausible: 60,638 authored lines over 1 days
     (1 active) = 60,638 lines/active day [+20]

⚠  Disclaimer: This is a heuristic analysis and NOT definitive proof.
   Fast coding can also indicate copy-paste, boilerplate generators,
   IDE scaffolding, or simply an experienced developer.

JSON Output

When using --json, the output includes:

{
  "repository": "owner/repo",
  "commits_analyzed": 200,
  "score": 81,
  "verdict": "πŸ€– Very likely LLM-generated",
  "reasons": ["..."],
  "velocity_stats": {
    "median_lpm": 7.61,
    "trimmed_mean_lpm": 12.14,
    "intervals": 101
  },
  "line_changes": {
    "total": 60638,
    "authored": 60638,
    "generated": 0
  },
  "project_scale": {
    "repo_created_at": "2026-02-04T00:00:00+00:00",
    "calendar_days": 1,
    "active_days": 1,
    "lines_per_active_day": 60638
  },
  "message_analysis": {
    "total": 200,
    "pattern_hits": 4,
    "ratio": 0.02,
    "sample_flagged": ["..."]
  },
  "sessions": 2,
  "authors": 2
}

API Usage

The tool fetches repo metadata (1 call), commit listings (1–N calls depending on page count), and detailed stats per non-bot commit. Bot commits are skipped to save calls. Detail requests are made concurrently (up to 10 in parallel) for faster analysis.

| Auth | Rate Limit | Comfortable Max Commits |
|---|---|---|
| No token | 60/hr | ~50 |
| With token | 5,000/hr | ~4,000 |

Limitations & Disclaimer

This tool uses heuristics, not magic. A high score doesn't prove LLM usage, and a low score doesn't disprove it.

False positives can come from:

  • Copy-pasting code from other projects
  • IDE/framework scaffolding and boilerplate generators
  • Squashed/rebased commits that compress work
  • Merge commits (maintainers merging large PRs)
  • An experienced developer who plans before coding
  • Generated code (protobufs, OpenAPI, etc.)

False negatives can come from:

  • LLM-generated code committed slowly or in small chunks
  • Human-edited LLM output committed as normal work
  • Commits with manual timing that mimics human patterns
  • Repos where --max-commits doesn't capture the full picture

Use responsibly. This is a curiosity tool, not a courtroom exhibit.

License

MIT — do whatever you want with it.

Contributing

Found a new heuristic? PRs welcome. Ideas:

  • File-type breakdown (LLMs love generating configs)
  • Code style consistency metrics
  • Cross-file similarity detection (LLMs repeat patterns)
  • Cross-referencing with known LLM output patterns
  • Language-specific signal tuning
  • Timezone inference from commit patterns

Built with vibes and a healthy suspicion of anyone committing 10,000 lines a day. 🎲
