Service Outage Summary

On Wednesday, 05-Aug-2020, central CLASSE-IT services suffered an extended unscheduled outage from approximately 10:30 AM to 4:00 PM. The outage was triggered when a server in the CLASSE infrastructure cluster, which had been shut down for the scheduled power outage on 25-Jul-2020, was repaired and powered on. Such operations are normally routine and transparent, but in this case powering on the server exposed a bug in the underlying operating system that destabilized the cluster to the point of collapse and made a clean recovery impossible.

To reduce the likelihood of similar occurrences in the future, we have identified additional preventative measures to be taken in case of server failure, and we will be scheduling an operating system upgrade to Scientific Linux 7.8, which contains a fix for the bug mentioned above. We apologize for the disruption and loss of productivity caused by this unanticipated outage.

General network and server maintenance occurs every Tuesday from 12:00 noon to 2:00 PM, and we will do our best to contain all network maintenance and planned outages to that window. The CLASSE-IT group will always announce any expected disruptions in our NewsLetter and via CLASSE-IT-NEWS-L, but given the size and complexity of our network, there is always the potential for something to go wrong.

Unless other arrangements have been made, CLASSE-managed Windows systems may be updated and rebooted on Tuesday mornings at 2:00 AM, so please avoid critical or lengthy operations at that time. For more details, please see SystemExpectations.

Questions or problems? Submit a service request.

Topic revision: r4 - 07 Aug 2020, WilliamBrangan