Add optional Playwright Chrome correctness driver by chenglou · Pull Request #112 · chenglou/pretext

chenglou · 2026-04-07T18:50:18Z

Summary

add an optional Chrome-only Playwright correctness driver behind CHROME_AUTOMATION_DRIVER=playwright
pin headless Chrome to the validated screen environment instead of silently trusting default headless layout
keep benchmark runs on the existing foreground AppleScript path and fail loudly if someone tries to use the Playwright path there

Rationale

We now have a mechanically checked-in Chrome correctness baseline in corpora/chrome-step10.json, so we can evaluate a new browser driver by generating the same machine-readable sweep and diffing it directly.

That made Chrome a good candidate for trying Playwright:

Chrome correctness currently goes through AppleScript, which causes focus blips and is less pleasant for day-to-day work
Firefox was already on a protocol path, so it did not have the same upside
Safari still needs to stay real Safari, so Playwright WebKit is not a drop-in replacement there

The important constraint is that this is only for correctness, not benchmarks.

During the investigation:

plain headless Chrome diverged from the checked-in corpus status
headed Playwright Chrome matched corpus status, but benchmark numbers came back systematically slower than the current benchmark snapshot
pinned headless Chrome with a validated screen environment matched corpus status exactly

So this PR takes the conservative split:

Chrome correctness: optional Playwright path
Chrome benchmarks: unchanged AppleScript foreground path
no silent fallback between the two
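The split above can be sketched as a small selection function. This is a minimal illustration, not the code in this PR: the names `selectChromeDriver`, `DriverRequest`, and `ChromeDriver` are hypothetical, and the error text is a placeholder. Only the `CHROME_AUTOMATION_DRIVER=playwright` switch and the `foreground` option come from the description.

```typescript
// Hypothetical names; only the env var value and the foreground flag
// are taken from the PR description.
type ChromeDriver = "applescript" | "playwright";

interface DriverRequest {
  // Value of CHROME_AUTOMATION_DRIVER, if it is set at all.
  envDriver: string | undefined;
  // Benchmark callers pass foreground: true.
  foreground: boolean;
}

function selectChromeDriver(req: DriverRequest): ChromeDriver {
  // The Playwright path is opt-in: anything else keeps the existing path.
  if (req.envDriver !== "playwright") {
    return "applescript";
  }
  // No silent fallback: a foreground (benchmark) request on the
  // Playwright path is an error rather than a quiet switch to AppleScript.
  if (req.foreground) {
    throw new Error(
      "Playwright Chrome driver is correctness-only; foreground runs are not supported"
    );
  }
  return "playwright";
}
```

Throwing instead of falling back means a benchmark run can never be measured on a different driver than the one the caller believes it is using.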

Implementation notes

the Playwright path is only selected when CHROME_AUTOMATION_DRIVER=playwright
it is Chrome-only and non-foreground only
it uses headless Chrome with --screen-info={3024x1964 devicePixelRatio=2}
it asserts the pinned environment from inside the page before trusting the run
benchmark callers still pass foreground: true, so the Playwright path errors immediately there
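The in-page assertion mentioned above could look roughly like the following. It is a sketch under assumptions: `assertPinnedEnvironment` is a hypothetical helper, and the PR description does not say which page properties are checked or what values they report under `--screen-info`, so the expected values are passed in by the caller rather than hard-coded.

```typescript
// Hypothetical helper: compares the values observed inside the page with
// the values the run was pinned to, and fails loudly on any mismatch.
type EnvironmentReading = Record<string, number>;

function assertPinnedEnvironment(
  actual: EnvironmentReading,
  expected: EnvironmentReading
): void {
  const mismatches: string[] = [];
  for (const key of Object.keys(expected)) {
    if (actual[key] !== expected[key]) {
      mismatches.push(`${key}: expected ${expected[key]}, got ${actual[key]}`);
    }
  }
  if (mismatches.length > 0) {
    // Refuse to trust the run instead of recording results from an
    // unvalidated layout environment.
    throw new Error(`Chrome environment is not pinned: ${mismatches.join("; ")}`);
  }
}

// Example with an assumed reading; a real driver would collect it from the page.
assertPinnedEnvironment({ devicePixelRatio: 2 }, { devicePixelRatio: 2 });
```

With Playwright, the `actual` reading could be gathered through `page.evaluate`, for example by returning `window.devicePixelRatio` and the relevant `screen` fields from the page before the sweep starts.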

Verification

bun run check
CHROME_AUTOMATION_DRIVER=playwright bun run accuracy-check → 7680/7680
CHROME_AUTOMATION_DRIVER=playwright bun run corpus-sweep --all --start=300 --end=900 --step=10 --output=/tmp/pretext-playwright-step10-pr.json
mechanical diff of /tmp/pretext-playwright-step10-pr.json against corpora/chrome-step10.json → 0 diffs
CHROME_AUTOMATION_DRIVER=playwright bun run benchmark-check fails loudly with the expected guardrail message
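For readers who want to reproduce the "0 diffs" step, a mechanical diff of two sweep files can be approximated with a generic structural comparison. The sketch below assumes nothing about the sweep schema, which the description does not show; `diffJson` is a hypothetical helper, not a script from the repository.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Returns one line per differing leaf or missing key, addressed by path.
function diffJson(a: Json | undefined, b: Json | undefined, path = "$"): string[] {
  if (a === b) return [];
  const aIsObject = typeof a === "object" && a !== null;
  const bIsObject = typeof b === "object" && b !== null;
  // Primitives, null, missing values, or array-vs-object mismatches
  // are reported as a single difference at this path.
  if (!aIsObject || !bIsObject || Array.isArray(a) !== Array.isArray(b)) {
    return [`${path}: ${JSON.stringify(a)} != ${JSON.stringify(b)}`];
  }
  const left = a as Record<string, Json | undefined>;
  const right = b as Record<string, Json | undefined>;
  // Union of keys (array indices are string keys too) so that entries
  // missing on either side are reported.
  const keys = Array.from(new Set([...Object.keys(left), ...Object.keys(right)]));
  const diffs: string[] = [];
  for (const key of keys) {
    diffs.push(...diffJson(left[key], right[key], `${path}.${key}`));
  }
  return diffs;
}

// Identical structures produce no differences.
console.log(diffJson({ a: 1, b: [1, 2] }, { a: 1, b: [1, 2] }).length);
```

To compare the two files from the commands above, each one can be loaded with `JSON.parse(readFileSync(path, "utf8"))` from `node:fs` and passed to `diffJson`; an empty result corresponds to a clean diff.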

Follow-up questions

whether to add convenience scripts for the Playwright correctness path
whether this should stay opt-in indefinitely or become the default Chrome correctness driver later

Add optional Playwright Chrome correctness driver

8eeb896

chenglou wants to merge 1 commit into main from codex/chrome-playwright-correctness
