Our beloved hosting company, Amazon Web Services (AWS for short), is still crook.
We are seeing constant disk failures across multiple servers. End users are definitely feeling it, because the site is producing a huge number of errors.
There is little we can do at the moment. We are waiting for the AWS postmortem and, possibly, changes to their SLA.
Their service promises highly reliable storage and zone isolation. This recent outage shows that promise did not hold: it spanned the entire region.
Jumping ship and rushing to move elsewhere would be unwise. It’s unlikely that Rackspace or any other provider is any better. What we need to do is add more redundancy and better contingency planning to the service.