Major Outage Hits Google Cloud and Linked Cloudflare Services, Thousands Affected

On June 12, 2025, concurrent infrastructure failures at Cloudflare and Google caused widespread service disruptions, highlighting vulnerabilities in modern cloud dependencies.

The outages impacted critical services ranging from authentication systems to AI platforms, underscoring the fragility of interconnected internet ecosystems.

Cloudflare Outage:

Cloudflare’s outage began at 17:52 UTC when internal monitoring detected failures in device registrations for its Zero Trust WARP service.

The root cause was traced to a failure in a third-party cloud provider’s storage infrastructure, which supported Cloudflare’s Workers KV service—a distributed key-value store used for configuration, authentication, and asset delivery.

Workers KV experienced a 90.22% request failure rate, triggering cascading failures across dependent services:

  • Workers KV: 90.22% of cold read/write requests failed; the central storage outage disrupted global operations.
  • Zero Trust Access: Authentication flows broke for new sessions; 100% failure rate for device posture checks.
  • Gateway: Proxy, egress, and TLS decryption services failed; policy evaluations halted.
  • WARP: New client registrations were blocked; existing sessions using the Gateway proxy were disrupted.
  • Stream: Video playlists returned 90% errors; live streams experienced 100% failure rates.
  • Workers AI: 100% of inference requests failed due to missing KV-based routing configurations.

Engineers initiated mitigation at 19:32 UTC by shedding non-critical KV load and deploying emergency failovers to Cloudflare’s R2 storage.

Services began recovering by 20:23 UTC, with full restoration achieved by 20:57 UTC.
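
The reported timestamps can be turned into elapsed times with a few lines of Python. This is a minimal sketch that uses only the times quoted in this article; the event labels and helper names are illustrative, not Cloudflare's terminology.

```python
from datetime import datetime, timezone


def utc(hhmm: str) -> datetime:
    """Parse an HH:MM time on June 12, 2025 (UTC)."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2025, 6, 12, hour, minute, tzinfo=timezone.utc)


# Times as reported in the article; the labels are illustrative.
timeline = {
    "detected": utc("17:52"),
    "mitigation_started": utc("19:32"),
    "recovery_began": utc("20:23"),
    "fully_restored": utc("20:57"),
}


def minutes_since_detection(event: str) -> int:
    """Whole minutes between first detection and the given event."""
    delta = timeline[event] - timeline["detected"]
    return int(delta.total_seconds() // 60)


for event in timeline:
    print(f"{event}: +{minutes_since_detection(event)} min")
```

By these figures, mitigation started 100 minutes after detection, and full restoration came 185 minutes after detection, just over three hours.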

Google Cloud Outage:

Concurrently, Google Cloud suffered an outage starting at 10:51 PDT (17:51 UTC) that took roughly 12 hours to clear fully, due to an identity and access management (IAM) subsystem failure.

The failure, attributed to a misconfigured IAM policy update, disrupted authentication tokens and policy evaluations, causing:

  • Gmail, Drive, Calendar, Meet: 100% unavailability for users requiring new authentication.
  • Google Search, Lens, Discover: Partial degradation in serving results.
  • Third-party platforms: Spotify and Discord experienced outages due to Google Cloud dependencies.

Mitigation began at 12:41 PDT with regional failovers, restoring most services by 15:16 PDT, though residual issues persisted in us-central1 until 23:00 PDT.

Technical Analysis and Industry Implications

Both incidents exposed critical risks in vendor lock-in and centralized dependencies:

  • Cloudflare’s architecture: Many services rely heavily on Workers KV, a “coreless” service designed for redundancy, which failed because of an external single point of failure. Engineers are accelerating migration to R2 storage and implementing “graceful degradation” protocols for critical services like Access.
  • Google’s IAM breakdown: A misconfigured IAM policy update bypassed safeguards, causing cascading authentication failures across 60+ regions.
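
To illustrate what “graceful degradation” can mean in practice, the sketch below reads from a primary store, falls back to a secondary store, and finally serves the last known good value. It is a generic pattern under stated assumptions, not Cloudflare's implementation: the `StoreUnavailable` exception, the store callables, and the `policy` key are all hypothetical.

```python
from typing import Callable, Optional


class StoreUnavailable(Exception):
    """Raised by a (hypothetical) backing store that cannot serve a request."""


def read_with_degradation(
    key: str,
    primary: Callable[[str], str],
    secondary: Callable[[str], str],
    stale_cache: dict,
) -> Optional[str]:
    """Try the primary store, then the secondary, then the last good value."""
    for store in (primary, secondary):
        try:
            value = store(key)
        except StoreUnavailable:
            continue  # this backend is down; try the next one
        stale_cache[key] = value  # remember the last good value
        return value
    # Both backends failed: serve stale data if we have it, else None.
    return stale_cache.get(key)


# Hypothetical backends used only for demonstration.
def failing_primary(key: str) -> str:
    raise StoreUnavailable("primary storage outage")


def healthy_secondary(key: str) -> str:
    return {"policy": "allow-known-devices"}[key]


cache: dict = {}
# Primary is down, so the value comes from the secondary store.
print(read_with_degradation("policy", failing_primary, healthy_secondary, cache))
# Both stores are down, so the cached value is served instead.
print(read_with_degradation("policy", failing_primary, failing_primary, cache))
```

Whether stale data is acceptable depends on the service. For an authentication path, a caller might reasonably treat a `None` result, or even a stale one, as a reason to deny access rather than allow it.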

While Cloudflare and Google emphasized these outages were unrelated, their overlap disrupted workflows for millions.

Both companies have committed to detailed postmortems, with Cloudflare pledging infrastructure diversification and Google overhauling IAM deployment safeguards.

These incidents underscore the need for cross-cloud redundancy and real-time dependency monitoring in critical internet infrastructure.
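
One building block of dependency monitoring is knowing which services sit downstream of a given component. The sketch below walks a dependency map to list everything affected when one node fails. The map is a hypothetical simplification loosely based on the services named in this article, not either company's real topology.

```python
from collections import deque

# Hypothetical dependency map: service -> services it depends on.
DEPENDS_ON = {
    "workers-kv": ["third-party-storage"],
    "access": ["workers-kv"],
    "gateway": ["workers-kv"],
    "warp": ["workers-kv", "gateway"],
    "stream": ["workers-kv"],
    "workers-ai": ["workers-kv"],
}


def impacted_by(failed: str, depends_on: dict) -> set:
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the map: dependency -> set of direct dependents.
    dependents: dict = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outward from the failed component.
    impacted: set = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


print(sorted(impacted_by("third-party-storage", DEPENDS_ON)))
```

In this toy map, a failure of the single storage node reaches every listed service, which is the kind of hidden single point of failure such a check is meant to surface.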
