Downtime
Elevated Inference Error Rates Due to a DNS Outage
Feb 10 at 04:50pm UTC
Affected services
GPU Cluster (General)
Resolved
Feb 10 at 05:55pm UTC
The issue has been resolved and the system is now stable.
We will continue to monitor the affected services.
Created
Feb 10 at 04:50pm UTC
A DNS outage is causing networking issues across all GPU clusters. As a result, we are seeing elevated error rates for all inference requests.
These networking issues are preventing job requests and job responses from flowing between the API server and the inference servers.