We are experiencing partial outages in region eu-central-2
Updates
Dear customers,
the issue resulting in performance degradations and temporary unavailability of our S3 endpoint in eu-central-2 has been resolved. Service reliability returned to normal levels overnight and has remained stable since. Below you will find the post-mortem summary for this incident.
Post-Mortem Summary
Between 2 May and 5 May, customers using our eu-central-2 S3 API experienced reduced performance and periods of unavailability. During this time, we temporarily applied request limits, returning 503 Slow Down responses to keep the service stable while we investigated. These limits applied to the whole region and affected all customers. We are sorry for the disruption this caused.
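While such limits are in place, clients that retry 503 responses with exponential backoff and jitter recover automatically once capacity is available. The sketch below is a generic illustration of that pattern, not part of any specific SDK; the `request_fn` callable and its `status` attribute are assumptions for the example.

```python
import random
import time

def with_backoff(request_fn, max_attempts=5, base_delay=0.5):
    """Retry a request while the service answers 503 Slow Down.

    request_fn is assumed to return an object exposing a `status`
    attribute; this is an illustrative sketch, not a specific SDK API.
    """
    for attempt in range(max_attempts):
        response = request_fn()
        if response.status != 503:
            return response
        # Exponential backoff with jitter: 0.5s, 1s, 2s, ... plus noise,
        # so many clients do not retry in lockstep.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
        time.sleep(delay)
    return response  # still 503 after max_attempts; caller decides what to do
```

Most S3-compatible SDKs ship a comparable retry policy by default, so in many cases it only needs to be enabled or tuned rather than written by hand.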
The underlying cause was in our metadata database — the system that tracks where every object lives and serves listings, retention information and lock information for your objects. Several things came together at once: natural growth in customer traffic, a planned migration that moved part of the database onto new hardware in one of our Frankfurt data centres, and an internal cleanup job related to that migration that was running inefficiently and adding far more load to the database than usual. The combination produced a workload pattern the database was not tuned for. Internally, the database fell behind on routine housekeeping work, which then slowed down regular requests. That slowdown built up in our S3 service and caused the errors and delays customers saw.
We stopped the inefficient cleanup job on the first night of the incident, but symptoms continued. After further investigation we adjusted two configuration values that govern how aggressively the database protects itself; under the previous settings, that self-protection was rejecting some write operations and thereby increasing SQL latencies. The change took effect immediately: the backlog cleared, error rates returned to normal, and the service has been stable since, even under traffic levels higher than before the incident.
We have already made the configuration change permanent and rewritten the cleanup job so it can no longer overload the database in the same way. We are also working on additional safeguards in our S3 service so that, if the metadata layer ever slows down again, the impact on customer requests is contained earlier and more gracefully.
Thank you for your understanding
Your Engineering Team
Dear Customers,
we have found the root cause of the issue and applied a fix. Service reliability has returned to normal levels. We will be monitoring the system to ensure stability. Some endpoints continue to show an elevated rate of 503 Slow Down errors while the system stabilizes.
Thank you for your patience
Your Engineering Team
Dear Customers,
we are still experiencing performance degradations and intermittent errors. We are actively investigating the issue.
Thank you for your patience
Your Engineering Team
Dear customers,
the issue was found and we have implemented a fix. Service reliability has returned to normal levels. We will continue monitoring the system to ensure stability.
Thank you for your understanding, and we appreciate your continued trust.
Dear customers,
we are currently experiencing partial outages in region eu-central-2 affecting our S3 endpoint. Customers may be unable to access, upload, or retrieve data through the service at this time.
Our engineering team is actively investigating the root cause and working to restore functionality as quickly as possible. This issue is being treated as a critical priority.
We sincerely apologize for the disruption and appreciate your patience while we resolve the issue.
Impossible Cloud Status