Server Errors and Elevated Latency

Incident Report for PubNub

Postmortem

Problem Description, Impact, and Resolution

At approximately 00:17 UTC on April 25, 2025, we observed elevated server errors and increased latency impacting multiple API endpoints, most notably the Presence service. Our engineering team immediately began investigating the issue. We traced the root cause to resource contention within a specific component of our backend infrastructure responsible for managing presence state. We implemented targeted configuration changes to better distribute this traffic and alleviate the resource contention. The issue was fully resolved by 02:32 UTC on April 25, 2025.

Mitigation Steps and Recommended Future Preventative Measures

To prevent a similar issue from occurring in the future, we have already implemented specific configuration changes to ensure the responsible backend component can more effectively handle the type of traffic pattern encountered. Furthermore, we are actively working to enhance our monitoring and alerting systems.

Posted Apr 25, 2025 - 19:39 UTC

Resolved

With no further issues observed, the incident has been resolved. We will follow up soon with a root cause analysis.

If you believe you experienced impact related to this incident, please report it to PubNub Support at support@pubnub.com.

Posted Apr 25, 2025 - 02:32 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Apr 25, 2025 - 02:11 UTC

Identified

The issue has been identified and a fix is being implemented.

Posted Apr 25, 2025 - 02:05 UTC

Update

The PubNub Technical Staff continues to investigate. Some Presence customers are experiencing errors and elevated latency. The real-time network services are operational.

Posted Apr 25, 2025 - 01:57 UTC

Update

The PubNub Technical Staff is investigating. More updates will follow once available.

Posted Apr 25, 2025 - 01:22 UTC

Investigating

At about 00:17 UTC, we started to experience elevated latencies and server errors in all PoPs. PubNub Technical Staff is currently investigating, and more updates will follow once available.
If you are experiencing issues and believe them to be related to this incident, please report them to PubNub Support at support@pubnub.com.

Posted Apr 25, 2025 - 00:51 UTC

This incident affected: Realtime Network (Publish/Subscribe Service, Presence Service, Access Manager Service, Mobile Push Gateway) and Points of Presence (North America Points of Presence, Asia Pacific Points of Presence).