Starting at 12:20 PM UTC on June 2, 2025, we noticed an increase in errors and latency affecting the Presence service in multiple regions. Our investigation identified a higher-than-normal traffic pattern as the cause. In response, we scaled up resources to handle the request load, and the Presence service was fully restored in the affected regions by 1:51 PM UTC on June 2, 2025.
To prevent a similar issue from occurring in the future, we are tuning our autoscaling configuration. We are also analyzing the system behavior and bottlenecks observed during the incident window to inform further adjustments.