At 18:09 UTC on September 2, 2025, we observed increased error rates and latency for our Presence service in our San Jose, Virginia, and Tokyo regions. We increased capacity in those regions, and the issue was resolved at 18:16 UTC. The issue occurred because a bug in one of our APIs allowed a request to execute an operation that, in extreme cases, exceeded the limits we had assumed. A large number of such requests were executed, which caused out-of-memory conditions in the Presence service.
To prevent a similar issue from occurring in the future, we have enforced the intended limit on the API in question. We have also added testing and monitoring in this area.
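
As an illustration only, the kind of limit enforcement described above could look something like the following sketch. It is not our actual implementation: the request shape, the names (PresenceRequest, validate_request, handle_request), and the limit value of 100 are all hypothetical and chosen purely for the example. The idea is to reject an oversized request before any work is done on it, so a single request cannot drive unbounded memory use.

```python
from dataclasses import dataclass

# Hypothetical limit; the real value is not stated in this report.
MAX_ITEMS_PER_REQUEST = 100


class LimitExceededError(ValueError):
    """Raised when a request asks for more work than the limit allows."""


@dataclass
class PresenceRequest:
    # Hypothetical request shape: identifiers the caller wants processed.
    item_ids: list[str]


def validate_request(request: PresenceRequest,
                     limit: int = MAX_ITEMS_PER_REQUEST) -> None:
    """Reject an oversized request before doing any work on it."""
    count = len(request.item_ids)
    if count > limit:
        raise LimitExceededError(
            f"request has {count} items; limit is {limit}"
        )


def handle_request(request: PresenceRequest) -> dict[str, str]:
    """Validate first, then run the (placeholder) operation."""
    validate_request(request)
    # Placeholder for the real operation, now bounded by the limit.
    return {item_id: "unknown" for item_id in request.item_ids}
```

In this sketch the check runs at the start of the handler, so the cost of rejecting a request depends only on counting its items, not on executing the operation.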