CLASSE Computing Outages: 22-Jul-2020 to 27-Jul-2020

During the week of 20-Jul-2020, a series of scheduled power outages at Wilson Lab will disrupt CLASSE computing services both onsite and offsite. We expect to have most services restored by Monday, 27-Jul-2020, although some recovery efforts might extend further into the week. Here is a summary timeline, along with recommended actions for users:

| Day | Time | Action |
| Wed 22-Jul | 10:00 AM | Compute Farm capacity reduced, batch queues disabled |
| Fri 24-Jul | Before 5:00 PM | Please save your work and log out |
| Fri 24-Jul | 5:00 PM | Most CLASSE services shut down, including Compute Farm |
| Sat 25-Jul | 6:00 AM | CESR and CHESS control systems shut down |
| Sat 25-Jul | Late night | CESR and CHESS control systems restored |
| Mon 27-Jul | Early morning | Most CLASSE services restored; reboot your computer if necessary |

Please see below for more details. Also, you may subscribe to CLASSE-IT-NEWS-L to receive email announcements during these outages.

Impact on Compute Farm

In preparation for the Wilson Lab East Module power outage on Friday, 24-Jul-2020, most CLASSE Compute Farm batch queues will be disabled two days beforehand, starting Wednesday, 22-Jul-2020 at 10:00 AM. Please note that interactive and CHESS queues are excluded from this two-day pre-outage pause; jobs running in these queues will instead be terminated early in the morning on Friday, 24-Jul-2020, when the East Module power is cut.

Any terminated jobs can be resubmitted immediately for continued processing during business hours on Friday, 24-Jul-2020. The Compute Farm will remain operational on Friday thanks to a new farm node that has been deployed in Wilson 221. However, farm capacity will be severely degraded, so you may need to wait for resources to become available before your job is scheduled.

All remaining jobs will be terminated on Friday, 24-Jul-2020 at 5:00 PM, when most CLASSE computing services will be shut down ahead of the Wilson Lab main building power outage on Saturday, 25-Jul-2020. We expect the Compute Farm to be restored to full capacity by Monday, 27-Jul-2020.
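
For anyone deciding whether a resubmitted job is worth starting on Friday, the cutoff above can be sketched as a simple check. This is only an illustration in Python, not a CLASSE-IT tool: the helper name `fits_before_shutdown` and the runtime estimate are hypothetical, times are treated as local Wilson Lab time, and the real start time will depend on how long the job waits in the degraded farm's queue.

```python
from datetime import datetime, timedelta

# Shutdown time taken from the outage timeline (local Wilson Lab time).
FARM_SHUTDOWN = datetime(2020, 7, 24, 17, 0)  # Fri 24-Jul-2020, 5:00 PM


def fits_before_shutdown(start: datetime, est_runtime: timedelta) -> bool:
    """Return True if a job starting at `start` would finish by the shutdown.

    Hypothetical helper: `est_runtime` is the user's own estimate, and
    `start` is when the job begins running, not when it is submitted.
    """
    return start + est_runtime <= FARM_SHUTDOWN


# Example: a job that starts running Friday at 9:00 AM.
friday_start = datetime(2020, 7, 24, 9, 0)
six_hour_job = fits_before_shutdown(friday_start, timedelta(hours=6))
nine_hour_job = fits_before_shutdown(friday_start, timedelta(hours=9))
```

Under these assumptions the six-hour job would finish before the 5:00 PM shutdown, while the nine-hour job would be terminated and is better left until the farm is restored.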

Impact on General Computing

In preparation for the Wilson Lab main building power outage on Saturday, 25-Jul-2020, most CLASSE-IT infrastructure will be shut down the evening before, on Friday, 24-Jul-2020 at 5:00 PM. Please save your work and log out of your computer before this time.

During the Wilson Lab main building power outage, the following CLASSE-IT services will be unreachable from anywhere onsite or offsite:
  • Remote logins via ScreenConnect, X2Go, ssh, etc.
  • CLASSE VPN and other CLASSE networks
  • Central filesystems, including Samba and Globus
  • Web services: wiki, Indico, Timesheet, CLASSE / CHESS / CBB websites
  • Compute Farm
  • Printing
Also, within Wilson Lab, Cornell wi-fi will be unavailable while power is down. CLASSE-IT will begin the recovery process Saturday evening immediately after the power outage, but services will not be fully restored until Monday, 27-Jul-2020. If you have any questions or concerns, please contact us via a ServiceRequest.

CESR and CHESS Control Systems

In order to minimize downtime for the CESR and CHESS control systems, we will not shut them down until right before the Wilson Lab main building power outage, on Saturday, 25-Jul-2020 at 6:00 AM. They will also be the first systems to be restored after power returns Saturday evening.

After the Outage

Although we expect most central CLASSE-IT services to be restored by Monday, 27-Jul-2020, individual end-user computers might need additional attention. If you have problems with your workstation or computing resources on Monday, please:
  • First, try rebooting your computer, if possible.
  • Then, submit a ServiceRequest if the problem remains.

General network and server maintenance occurs every Tuesday from 12:00 noon to 2:00 PM, and we will do our best to contain all network maintenance and planned outages to that window. The CLASSE-IT group will always announce any expected disruptions in our NewsLetter and via CLASSE-IT-NEWS-L, but given the size and complexity of our network, there is always the potential for something to go wrong.

Unless other arrangements have been made, CLASSE-managed Windows systems may be updated and rebooted on Tuesday mornings at 2:00 AM, so please avoid critical or lengthy operations at that time. For more details, please see SystemExpectations.
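
If you schedule long-running work from a script, the weekly Tuesday maintenance window described above can be checked before starting. The following is a minimal sketch in Python under stated assumptions: the function name `in_maintenance_window` is hypothetical, times are local lab time, and the window is treated as starting at 12:00 noon and ending just before 2:00 PM. It does not cover the 2:00 AM Windows reboot.

```python
from datetime import datetime, time

# Weekly maintenance window from this page: Tuesdays, 12:00 noon to 2:00 PM.
WINDOW_START = time(12, 0)
WINDOW_END = time(14, 0)
TUESDAY = 1  # datetime.weekday(): Monday is 0, Tuesday is 1


def in_maintenance_window(when: datetime) -> bool:
    """Return True if `when` falls inside the Tuesday maintenance window.

    Hypothetical helper; the end of the window (2:00 PM) is excluded.
    """
    return when.weekday() == TUESDAY and WINDOW_START <= when.time() < WINDOW_END
```

For example, `in_maintenance_window(datetime(2020, 7, 28, 13, 0))` is true because 28-Jul-2020 is a Tuesday and 1:00 PM lies inside the window, whereas the same time on Monday, 27-Jul-2020 is not.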

Questions or problems? Submit a service request.


Topic revision: r7 - 04 Apr 2022, AdminDevinBougie