// blog.md

Field notes from someone who tests for a living.

published: 15
cadence: when it matters
written in: markdown + vim

// post 015

featured · latest

Where Test Health Belongs: CI Logs or an Observability Backend

Your suite emits pass rate and flake count every run, then buries them in a CI log nobody scrolls; export them over OTLP and a dashboard catches the rot.

09 Jun 2026 ~5min Strategy

#TestAutomation #OpenTelemetry #TestObservability #CICD #DevOps

▸ cat posts/test-health-ci-logs-vs-observability.md

$ cat test-health-ci-logs-vs-observability.md

# Where test health belongs

Every run, your suite reports its pass rate and flake count. The numbers land in a CI log and nobody scrolls back. Export them over OTLP and pass rate flows into the same Grafana board as prod...
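
To make the export step in the preview concrete, here is a minimal sketch of what one run's numbers could look like as an OTLP/JSON metrics payload. The metric names (`test.pass_rate`, `test.flake_count`), the suite name, the function name, and the run counts are all hypothetical, and the sketch assumes a flaky test that eventually passed is counted among the passes. It only builds the payload; it does not send anything.

```python
import json
import time


def build_test_health_payload(passed, failed, flaky, suite="checkout-e2e", now_ns=None):
    """Build an OTLP/JSON metrics payload with pass rate and flake count.

    All names here are illustrative assumptions, not taken from the post.
    """
    total = passed + failed
    pass_rate = passed / total if total else 0.0
    # OTLP/JSON encodes 64-bit integers (timestamps, asInt) as decimal strings.
    ts = str(now_ns if now_ns is not None else time.time_ns())

    def gauge(name, unit, value_key, value):
        # One gauge metric with a single data point for this run.
        return {
            "name": name,
            "unit": unit,
            "gauge": {"dataPoints": [{"timeUnixNano": ts, value_key: value}]},
        }

    return {
        "resourceMetrics": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": suite}},
            ]},
            "scopeMetrics": [{
                "scope": {"name": "test-health-sketch"},
                "metrics": [
                    gauge("test.pass_rate", "1", "asDouble", pass_rate),
                    gauge("test.flake_count", "{test}", "asInt", str(flaky)),
                ],
            }],
        }]
    }


# Hypothetical run: 388 passed, 12 failed, 5 of the passes needed a retry.
payload = build_test_health_payload(passed=388, failed=12, flaky=5, now_ns=1)
body = json.dumps(payload)
```

A CI step would typically POST `body` with `Content-Type: application/json` to a collector's OTLP/HTTP metrics path (`/v1/metrics`, port 4318 by default); the collector address itself depends on your setup.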

// published

15 posts · newest first

014 Retrying a flaky test deletes the evidence of a real bug
A bug that fails one run in four passes CI 99.6 percent of the time under three retries. Quarantine the test instead and keep the signal.
07 Jun 2026 ~5min Strategy

013 Three caching changes that take 80% off a GitHub Actions build
A cached ~/.npm drops a cold Node install from four minutes to thirty seconds, and two more cache changes take the rest of the pipeline down with it.
04 Jun 2026 ~4min Practice

012 My resume site ships behind 460 tests
I set the direction and Claude Code wrote the code and the tests; 247 unit tests and 213 browser tests are how I trust a site I never hand-wrote.
04 Jun 2026 ~6min Meta

011 GitHub Actions parallel steps and the matrix jobs you can retire
Three matrix jobs for lint, type-check, and unit tests pay three runner boots and an artifact handoff for concurrency that parallel steps fold back into one job.
02 Jun 2026 ~3min Tools

010 Contract Testing vs End-to-End: Where Integration Bugs Belong
A contract test catches a renamed field in seconds; a 20-minute E2E suite catches it after booting six services. Put each test where it earns its minutes.
01 Jun 2026 ~4min Strategy

009 k6 Script Authoring calibrates load tests to live traffic
Grafana Assistant reads your telemetry, finds endpoints by real RPS and p95, and generates a k6 script that inherits that profile.
26 May 2026 ~6min Tools

008 When AI can write every test, what ships to CI is the job
AI-generated Playwright tests flake under 1.5%. The new problem is test explosion, and coverage intent is still yours to define.
24 May 2026 ~6min Strategy

007 One click to fix a failing GitHub Actions run
Fix with Copilot puts a cloud agent on the failure: it investigates, pushes a fix, reruns CI, and tags you for review.
21 May 2026 ~5min Tools

006 90% use AI in the IDE; the pipeline is another story
JetBrains data: daily AI in the editor, almost none in CI/CD. The trust gap closes when AI reduces noise instead of adding it.
19 May 2026 ~6min Strategy

005 Bitbucket Agentic Pipelines automates the chores
Define an agent block in bitbucket-pipelines.yml, scope it, tie it to an event. It drafts the docs and the coverage gaps; you review.
17 May 2026 ~6min Tools