You write Playwright tests on a Friday. They pass locally. You push, and three fail on the checkout flow with two more flaking at random. You open traces, fix locators, push again. Two weeks later you are still at it.
Playwright v1.59, released April 1st, ships a different loop. Three agents cover the test lifecycle: the Planner turns a feature description into a Markdown spec, the Generator turns that spec into TypeScript with role-based locators and fixtures, and the Healer runs failing tests, reads accessibility-tree snapshots, tells a stale locator from a real bug, patches the locator, and confirms the fix passes.
Start the loop
One command wires the agents into your editor:
npx playwright agents --loop=vscodeFeed the Generator a seed spec that shows your locator conventions and fixture style, and it matches your project’s patterns instead of inventing its own. Debbie O’Brien’s walkthrough covers the plan-generate-heal flow end to end.
screencast: the fix comes with a video
The new page.screencast API records a run as video with chapter markers and action annotations. When the Healer repairs a checkout test, that video lands in CI artifacts with a “Checkout” chapter you scrub to. No log archaeology.
test('checkout', async ({ page }) => { const cast = await page.screencast({ chapters: true }); await page.getByRole('button', { name: 'Pay now' }).click(); await cast.stop(); // lands in CI artifacts, annotated});browser.bind: watch the Healer work
browser.bind() exposes a running browser under a named session, so your agent, MCP server, and CLI debugger share one browser. You attach from a second terminal, watch the Healer reason through a failure, step in if it stalls, and detach without killing the run. One engineer who built a similar fix by hand called this the part that finally makes self-healing observable.
A passing suite used to prove your locators held. Now it proves the Healer kept them holding. Your review process should know which it trusts.
The work moves up a level
You stop debugging selectors and start reviewing agent-produced evidence: the video, the patched locator, the Healer’s stale-vs-regression call. Coverage decisions and novel failures become the actual job, which is where a senior tester’s time belongs. The 2026 ecosystem survey and TestDino’s rundown both land here: the agents amplify whatever foundation you give them, so seed them with good conventions and the output stays good.