Tim Stacey — Field notes

Field notes on testing and quality engineering by Tim Stacey.
https://tim.sillysamoyed.com/

Where Test Health Belongs: CI Logs or an Observability Backend
https://tim.sillysamoyed.com/blog/test-health-ci-logs-vs-observability
Your suite emits pass rate and flake count every run, then buries them in a CI log nobody scrolls; export them over OTLP and a dashboard catches the rot.
Tue, 09 Jun 2026 00:00:00 GMT
TestAutomation, OpenTelemetry, TestObservability, CICD, DevOps

Retrying a flaky test deletes the evidence of a real bug
https://tim.sillysamoyed.com/blog/retries-hide-real-bugs
A bug that fails one run in four passes CI 99.6 percent of the time under three retries. Quarantine the test instead and keep the signal.
Sun, 07 Jun 2026 00:00:00 GMT
TestAutomation, FlakyTests, CICD, SoftwareTesting, DevOps

Three caching changes that take 80% off a GitHub Actions build
https://tim.sillysamoyed.com/blog/github-actions-cache-strategy
A cached ~/.npm drops a cold Node install from four minutes to thirty seconds, and two more cache changes take the rest of the pipeline down with it.
Thu, 04 Jun 2026 00:00:00 GMT
GitHubActions, CICD, DevOps, SoftwareDevelopment, TestAutomation

My resume site ships behind 460 tests
https://tim.sillysamoyed.com/blog/resume-site-behind-460-tests
I set the direction and Claude Code wrote the code and the tests; 247 unit tests and 213 browser tests are how I trust a site I never hand-wrote.
Thu, 04 Jun 2026 00:00:00 GMT
Astro, StaticSite, Playwright, ContinuousIntegration, TestAutomation

GitHub Actions parallel steps and the matrix jobs you can retire
https://tim.sillysamoyed.com/blog/github-actions-parallel-steps
Three matrix jobs for lint, type-check, and unit tests pay three runner boots and an artifact handoff for concurrency that parallel steps fold back into one job.
Tue, 02 Jun 2026 00:00:00 GMT
GitHubActions, CICD, DevOps, TestAutomation, SoftwareDevelopment

Contract Testing vs End-to-End: Where Integration Bugs Belong
https://tim.sillysamoyed.com/blog/contract-testing-vs-e2e
A contract test catches a renamed field in seconds; a 20-minute E2E suite catches it after booting six services. Put each test where it earns its minutes.
Mon, 01 Jun 2026 00:00:00 GMT
ContractTesting, Microservices, APITesting, TestAutomation, CICD

k6 Script Authoring calibrates load tests to live traffic
https://tim.sillysamoyed.com/blog/k6-script-authoring-live-telemetry
Grafana Assistant reads your telemetry, finds endpoints by real RPS and p95, and generates a k6 script that inherits that profile.
Tue, 26 May 2026 00:00:00 GMT
PerformanceTesting, k6, Grafana, TestAutomation, DevOps

When AI can write every test, what ships to CI is the job
https://tim.sillysamoyed.com/blog/playwright-ai-test-explosion
AI-generated Playwright tests flake under 1.5%. The new problem is test explosion, and coverage intent is still yours to define.
Sun, 24 May 2026 00:00:00 GMT
Playwright, TestAutomation, SoftwareTesting, AI, CICD

One click to fix a failing GitHub Actions run
https://tim.sillysamoyed.com/blog/github-copilot-fixes-failing-ci
Fix with Copilot puts a cloud agent on the failure: it investigates, pushes a fix, reruns CI, and tags you for review.
Thu, 21 May 2026 00:00:00 GMT
GitHubActions, CICD, TestAutomation, DevOps, SoftwareDevelopment, Playwright

90% use AI in the IDE; the pipeline is another story
https://tim.sillysamoyed.com/blog/ai-cicd-adoption-gap
JetBrains data: daily AI in the editor, almost none in CI/CD. The trust gap closes when AI reduces noise instead of adding it.
Tue, 19 May 2026 00:00:00 GMT
CICD, DevOps, TestAutomation, SoftwareDevelopment, AITesting

Bitbucket Agentic Pipelines automates the chores
https://tim.sillysamoyed.com/blog/bitbucket-agentic-pipelines
Define an agent block in bitbucket-pipelines.yml, scope it, tie it to an event. It drafts the docs and the coverage gaps; you review.
Sun, 17 May 2026 00:00:00 GMT
Bitbucket, DevOps, CICD, TestAutomation, SoftwareDevelopment, Playwright

Playwright 1.59 turns failures into reviewable evidence
https://tim.sillysamoyed.com/blog/playwright-1-59-healer-agent-ci
The 1.59 agents plus screencast and browser.bind shift your job from chasing selectors to reviewing what the Healer did.
Thu, 14 May 2026 00:00:00 GMT
Playwright, TestAutomation, AITesting, QA, CI

k6 2.0 moves load-test authoring into the CLI
https://tim.sillysamoyed.com/blog/k6-2-ai-performance-testing
Grafana previewed k6 2.0 at GrafanaCON 2026: AI authoring in the CLI, an MCP server, and a Playwright-to-k6 converter.
Tue, 12 May 2026 00:00:00 GMT
PerformanceTesting, TestAutomation, k6, Grafana, AI

The locator tax nobody puts in the budget
https://tim.sillysamoyed.com/blog/playwright-mcp-locator-tax
Broken-test triage is a staffing decision disguised as a process one. Here is the cost, and where AI self-healing pays it back.
Tue, 12 May 2026 00:00:00 GMT
EngineeringLeadership, SoftwareEngineering, TestAutomation, DevProductivity, QualityAssurance

Playwright agents and the new QA skills gap
https://tim.sillysamoyed.com/blog/playwright-ai-agents
Playwright v1.56 put a Planner, Generator, and Healer in the test runner. The interesting part is what it asks of the engineers who own the suite.
Sun, 10 May 2026 00:00:00 GMT
Playwright, SoftwareTesting, AI, TestAutomation, QualityAssurance