We want to address the intermittent service disruptions you may have experienced with Scorekeeper over the past few weeks, including lag, errors, and temporary outages. We understand that League night is important, and when our product does not perform reliably, it disrupts both your game and your trust. We sincerely apologize for these disruptions and want to be transparent about what happened and, more importantly, what we are doing to reduce the likelihood of them happening again.

Our engineering team has completed detailed investigations into each incident and identified clear root causes.

Summary of Recent Outages

The recent issues were driven by periods of unusually high stress on our core database and services, caused by a combination of independent factors:

  • February 19, General Performance Degradation:
    A poorly performing database query, combined with peak usage, led to elevated error rates, including 500 Internal Server Errors and Gateway Timeouts.
  • March 1, Database Lock Contention:
    A brief outage occurred when many users attempted to write scoring data simultaneously, resulting in database lock contention on critical records.
  • March 9, Runaway Process and “Retry Storm”:
    A background process responsible for maintaining search indexes began duplicating its work and consuming excessive resources. At the same time, the mobile app’s retry logic, designed to improve reliability, caused a surge of repeated requests during the disruption. This created a “retry storm,” roughly doubling normal database load and prolonging the outage.
  • March 16, High-Frequency Requests from App Update:
    An Android app update unintentionally increased how often a specific backend request was made. While the underlying query had historically performed adequately, it was not designed for this sudden spike in frequency, leading to system-wide saturation.

What We Are Doing

These incidents were not caused by a single failure, but by complex interactions within a large-scale system. In response, we are implementing both immediate fixes and long-term architectural improvements to strengthen reliability.

Improving Concurrency and Process Control
We have added safeguards to ensure that critical background processes cannot run multiple times simultaneously, preventing resource conflicts like those seen on March 9.
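
For technically minded readers, one common way to keep a background job from running twice at once is an exclusive lock file. The sketch below is illustrative only and is not Scorekeeper's actual implementation; the names `single_instance` and `AlreadyRunning` and the lock path are hypothetical.

```python
import os
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when another instance already holds the lock (hypothetical name)."""


@contextmanager
def single_instance(lock_path):
    """Allow only one holder of lock_path at a time.

    O_CREAT | O_EXCL makes file creation atomic: if the file already
    exists, the open fails instead of silently succeeding.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(lock_path) from None
    try:
        # Record the owner's PID to help diagnose a stale lock.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A job wrapped in `with single_instance("/tmp/reindex.lock"):` would exit early with `AlreadyRunning` if a previous run were still active. A simple lock file like this is left behind if the process crashes, so a production system would more likely use a lock that expires or is released automatically.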

Strengthening Client-Side Resiliency
We are refining retry behavior in the Scorekeeper app by improving exponential backoff logic and introducing circuit breaker patterns. These industry-standard protections prevent excessive retries from overwhelming the system during partial outages.
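
As a rough illustration of these two patterns, the sketch below shows exponential backoff with random jitter and a small circuit breaker. It is a simplified example under assumed parameters, not the Scorekeeper app's code; `backoff_delay`, `CircuitBreaker`, and every default value are hypothetical.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the random factor
    ("full jitter") spreads clients out so they do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


class CircuitBreaker:
    """Stop sending requests after repeated failures, then probe again later."""

    def __init__(self, failure_threshold=3, reset_after=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (healthy)

    def allow(self):
        """Return True if a request may be sent right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through ("half-open").
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self.opened_at = self.clock()
```

A client would check `allow()` before each request, call `record_success()` or `record_failure()` afterwards, and sleep for `backoff_delay(attempt)` between retries. Together these keep a struggling backend from receiving a flood of immediate, synchronized retries.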

Managing Request Volume More Effectively
We are shifting focus from simply optimizing query speed to controlling how often requests are made. This includes fixing client-side triggers, introducing caching and deduplication, and adding protections around high-cost operations.
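
To show what caching and deduplication can mean in practice, the sketch below collapses repeated identical requests within a short window into a single backend call. It is a minimal example under assumed names and values, not a description of our production design; `TTLCache`, `get_or_compute`, and the time-to-live are hypothetical.

```python
import time


class TTLCache:
    """Reuse a recent result instead of repeating an identical request."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl          # seconds a stored result stays fresh
        self.clock = clock
        self._entries = {}      # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, calling `compute` only if stale."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = compute()
        self._entries[key] = (now, value)
        return value
```

With a cache like this in front of an expensive query, a client that asks for the same data many times per second triggers only one database read per `ttl` window, which addresses request frequency rather than query speed.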

Enhancing Infrastructure Awareness
We are expanding our testing to better simulate real-world usage patterns, including high-frequency access scenarios. We are also implementing more gradual app rollouts with stronger monitoring to catch issues before they scale.

Preventative Measures Now in Place

To deliver the consistent experience you expect, we have already begun implementing the following safeguards:

Monitoring and Alerting

  • Automated alerts for long-running processes and slow database queries
  • Improved visibility into request frequency and per-device behavior

Database Integrity and Stability

  • Database-level constraints to prevent duplicate or inconsistent data
  • Active cleanup of any previously impacted records

Looking Ahead

We recognize the importance of reliability on League night, and we take that responsibility seriously. The changes above represent meaningful improvements to how our systems handle load, recover from issues, and prevent cascading failures.

Based on the fixes already implemented and the safeguards now in place, we are highly confident that tonight, and going forward, Scorekeeper will operate with the smooth, dependable performance you expect from the APA.

We appreciate your patience and continued trust as we strengthen the platform.

 
