On October 24, 2024 at 11:15AM UTC, we observed elevated latency and errors in the Presence service across our global points of presence. Affected customers may have experienced a slowdown in Presence request responses and/or failures with 5XX server errors returned. After investigating, we identified the cause of the issue, blocked the source of traffic causing it, and the issue was resolved on October 24, 2024 at 2:00PM UTC. This issue occurred because our services were not auto scaled appropriately in response to a spike in unexpected traffic from non-standard usage of the Presence service.
To prevent a similar issue from occurring in the future, we have addressed the source of the unexpected traffic spike directly, ensuring changes were made to align usage with our prescribed methods for Presence. Additionally, we are working in the coming days to deploy sharding in Presence infrastructure to enhance scalability and better manage traffic surges like this, should they recur.