Tim Stacey — Field notes

Field notes on testing and quality engineering by Tim Stacey. https://tim.sillysamoyed.com/blog 2026-06-09T00:00:00.000Z

Where Test Health Belongs: CI Logs or an Observability Backend https://tim.sillysamoyed.com/blog/test-health-ci-logs-vs-observability 2026-06-09T00:00:00.000Z

Your suite emits pass rate and flake count every run, then buries them in a CI log nobody scrolls; export them over OTLP and a dashboard catches the rot.

Retrying a flaky test deletes the evidence of a real bug https://tim.sillysamoyed.com/blog/retries-hide-real-bugs 2026-06-07T00:00:00.000Z

A bug that fails one run in four passes CI 99.6 percent of the time under three retries. Quarantine the test instead and keep the signal.
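
The 99.6 percent figure is simple arithmetic, sketched here under two assumptions that are not spelled out in the summary: each attempt fails independently at the same rate, and "three retries" means four attempts in total. The function name `pass_probability` is hypothetical, chosen for illustration.

```python
def pass_probability(fail_rate: float, retries: int) -> float:
    """Chance that at least one of (1 + retries) attempts passes.

    Assumes every attempt fails independently with the same probability,
    which is a simplification: real flaky failures are often correlated.
    """
    attempts = 1 + retries
    # CI stays red only if every single attempt fails.
    all_fail = fail_rate ** attempts
    return 1.0 - all_fail


if __name__ == "__main__":
    # A bug that fails one run in four, under zero to three retries.
    for retries in range(4):
        p = pass_probability(0.25, retries)
        print(f"{retries} retries: CI passes {p:.1%} of the time")
```

With a 0.25 failure rate and three retries, all four attempts fail with probability 0.25^4, about 0.4 percent, so the run goes green roughly 99.6 percent of the time while the bug is still there.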

Three caching changes that take 80% off a GitHub Actions build https://tim.sillysamoyed.com/blog/github-actions-cache-strategy 2026-06-04T00:00:00.000Z

A cached ~/.npm drops a cold Node install from four minutes to thirty seconds, and two more cache changes take the rest of the pipeline down with it.

My resume site ships behind 460 tests https://tim.sillysamoyed.com/blog/resume-site-behind-460-tests 2026-06-04T00:00:00.000Z

I set the direction and Claude Code wrote the code and the tests; 247 unit tests and 213 browser tests are how I trust a site I never hand-wrote.

GitHub Actions parallel steps and the matrix jobs you can retire https://tim.sillysamoyed.com/blog/github-actions-parallel-steps 2026-06-02T00:00:00.000Z

Three matrix jobs for lint, type-check, and unit tests pay three runner boots and an artifact handoff for concurrency that parallel steps fold back into one job.

Contract Testing vs End-to-End: Where Integration Bugs Belong https://tim.sillysamoyed.com/blog/contract-testing-vs-e2e 2026-06-01T00:00:00.000Z

A contract test catches a renamed field in seconds; a 20-minute E2E suite catches it after booting six services. Put each test where it earns its minutes.

k6 Script Authoring calibrates load tests to live traffic https://tim.sillysamoyed.com/blog/k6-script-authoring-live-telemetry 2026-05-26T00:00:00.000Z

Grafana Assistant reads your telemetry, finds endpoints by real RPS and p95, and generates a k6 script that inherits that profile.

When AI can write every test, what ships to CI is the job https://tim.sillysamoyed.com/blog/playwright-ai-test-explosion 2026-05-24T00:00:00.000Z

AI-generated Playwright tests flake under 1.5%. The new problem is test explosion, and coverage intent is still yours to define.

One click to fix a failing GitHub Actions run https://tim.sillysamoyed.com/blog/github-copilot-fixes-failing-ci 2026-05-21T00:00:00.000Z

Fix with Copilot puts a cloud agent on the failure: it investigates, pushes a fix, reruns CI, and tags you for review.

90% use AI in the IDE; the pipeline is another story https://tim.sillysamoyed.com/blog/ai-cicd-adoption-gap 2026-05-19T00:00:00.000Z

JetBrains data: daily AI in the editor, almost none in CI/CD. The trust gap closes when AI reduces noise instead of adding it.

Bitbucket Agentic Pipelines automates the chores https://tim.sillysamoyed.com/blog/bitbucket-agentic-pipelines 2026-05-17T00:00:00.000Z

Define an agent block in bitbucket-pipelines.yml, scope it, tie it to an event. It drafts the docs and the coverage gaps; you review.

Playwright 1.59 turns failures into reviewable evidence https://tim.sillysamoyed.com/blog/playwright-1-59-healer-agent-ci 2026-05-14T00:00:00.000Z

The 1.59 agents plus screencast and browser.bind shift your job from chasing selectors to reviewing what the Healer did.

k6 2.0 moves load-test authoring into the CLI https://tim.sillysamoyed.com/blog/k6-2-ai-performance-testing 2026-05-12T00:00:00.000Z

Grafana previewed k6 2.0 at GrafanaCON 2026: AI authoring in the CLI, an MCP server, and a Playwright-to-k6 converter.

The locator tax nobody puts in the budget https://tim.sillysamoyed.com/blog/playwright-mcp-locator-tax 2026-05-12T00:00:00.000Z

Broken-test triage is a staffing decision disguised as a process one. Here is the cost, and where AI self-healing pays it back.

Playwright agents and the new QA skills gap https://tim.sillysamoyed.com/blog/playwright-ai-agents 2026-05-10T00:00:00.000Z

Playwright v1.56 put a Planner, Generator, and Healer in the test runner. The interesting part is what it asks of the engineers who own the suite.