Events:

2023-05-26 at 00:08 [dardel]
The PDC login portal (https://loginportal.pdc.kth.se/) is online again.
2023-05-25 at 21:38 [dardel]
There are issues with using the PDC login portal. Consider it offline for the time being.
2023-05-22 at 15:53 [dardel]
All nodes in the GPU partition have received a BIOS flash update addressing a potential escalation-of-privilege issue. Jobs are allowed to run again.
2023-05-19 at 18:10 [dardel]
The internal Ceph storage used by most internal management services has now been restarted. Jobs have been starting for the past half hour. Several jobs that were running before the outage are still running.

Some jobs likely experienced issues during the outage.

(GPUs are still off; the security audit has been postponed.)

2023-05-19 at 11:36 [dardel]
The VM hosting the Slurm master daemon, and other services on other VMs needed to manage the system, still have issues. Troubleshooting is ongoing.
2023-05-19 at 08:24 [dardel]
The VM hosting the Slurm master daemon seems to have been stopped since 05:12 this morning. No job state changes have been possible since then. Investigation and restart are in progress.
2023-05-18 at 23:38 [dardel]
No new job starts are allowed on the GPU partition while a potential escalation-of-privilege issue on the GPUs is investigated.
2023-05-13 at 06:27 [dardel]
The primary login has been redirected to another access node for several hours now. Please report any anomalies.
2023-05-12 at 23:02 [dardel]
The primary login is still having issues. Work is ongoing. Jobs already running or waiting in the queue are so far unaffected.
2023-05-12 at 21:06 [dardel]
The primary login node is unresponsive and is being rebooted.