2018-10-22T20:48:22.385848+00:00
Incident impacting MONTREAL DATA CENTRE

LiteSpeed is now fully tested & operational, and server-side caching is available. The cluster is fully operational & stable.

An overview of this incident & the resolution will be posted here within 3 days.

Status remaining
Fully Operational
2018-10-24T14:43:48.021577+00:00

The canada300 cluster has been stable today, and sysadmins have been performing additional testing throughout the day.

LiteSpeed will be enabled during a maintenance window on Oct. 24, 4:00-4:30 AM ET.

It is possible that short outages will occur during this time to allow for a restart & testing to ensure there is no cache corruption.

The results of that testing will be posted here tomorrow morning by 10AM ET.

Status remaining
Fully Operational
2018-10-23T21:12:11.955439+00:00

The issue on canada300 is now resolved. However, our sysadmin team will be performing additional testing during a maintenance window between 3-5AM ET. It's possible that short outages will occur during this period. Please note that this will only affect the canada300 cluster.

Status remaining
Degraded Performance
2018-10-23T06:04:32.353942+00:00

In the interest of full transparency: we're seeing blank front page issues with approx. 10 sites, and restoring the public_html is resolving that issue. We have those restores rolling out live now.

Status remaining
Degraded Performance
2018-10-23T01:45:38.360197+00:00

We're now going through the server site by site & will be logging any initial issues we see. Those will be handed over to a sysadmin to help troubleshoot anything additional that is happening with this cluster. This cluster serves 215 sites & approx. 25% of them were affected.

Status remaining
Major Outage
2018-10-22T23:30:23.920034+00:00

Continuing to check for corrupted files.

Status remaining
Major Outage
2018-10-22T23:05:21.971944+00:00

Sites are coming back online. File checks are underway to look for corrupted files.

Status remaining
Major Outage
2018-10-22T22:38:16.126983+00:00

We're seeing sites come back online. We're awaiting feedback from sysadmins on the file system health & next steps so that we can provide you with more details on what to expect.

Status remaining
Major Outage
2018-10-22T22:16:33.716482+00:00

The investigation has shown that the issue could be caused by a file system error, as the cluster is only restarting in read-only mode.

We're now attempting to perform a file system check & repair. That process can take time, and updates will be posted here.

If necessary, we also have plans in place to start failover procedures to a failover server in another data centre.

Status remaining
Major Outage
2018-10-22T21:55:20.638051+00:00

We have data centre staff actively investigating, and we should have an ETA very soon.

Status remaining
Major Outage
2018-10-22T21:30:15.409875+00:00

No ETA at this point; they are still investigating at the data centre.

Status remaining
Major Outage
2018-10-22T21:08:40.950250+00:00

We're currently experiencing what appears to be a network issue on the canada300 cluster. We're investigating, and more details will be posted here shortly.

Status changed to
Major Outage
2018-10-22T20:48:22.385877+00:00