
Dardel’s Lustre system will be updated during the week of February 19th

Dardel will be down for the entire week

Published Feb 08, 2024

Next Dardel update

Starting at 13.00 on Monday the 19th of February, Dardel will be taken down so that the software on the Lustre disk system can be updated to a new major version. During the same downtime we will also move the disk system to its final position. The downtime is likely to last the entire week.

We have worked hard to minimise the number and length of the downtimes needed to bring Dardel, before the summer, into a state that is more straightforward to maintain and that runs up-to-date software.

This downtime in February should be the only planned downtime required before the “final” update, which will start on the 20th of May and last for a week.

Dardel was updated to a newer software stack which led to severe Lustre problems

Dardel was switched to a new software stack (called Raspberry) on Wednesday the 31st of January. Although PDC had been testing Raspberry on a separate partition for some time, the new stack caused problems with the Lustre disk system. Because of a bug in the Lustre client, I/O-heavy applications can make the Lustre file system extremely slow for some time, or can even crash. The VASP application was especially affected, because it performs I/O in a way that triggers the bug. The fix for VASP was relatively straightforward and is now implemented in most VASP versions. Please read the section below for information on how to use the updated VASP applications. We are also working with HPE on a more general solution to the Lustre bug.

To use the updated Dardel system

  • Log in to the dardel.pdc.kth.se login node in the usual way.
  • PDC has updated some of the application software. Such software can be used by loading the module PDC/23.03:
    ml PDC/23.03
      Also, use PDC/23.03 to compile software.
  • To find the module that provides software XYZ, use the command:
    module spider XYZ
      The software in the module PDC/22.06 should also still work, but it is better to use PDC/23.03 if the package that you want is available there.
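
The preference for PDC/23.03 over PDC/22.06 described above can also be scripted. The following is a minimal sketch only. It assumes that the module system on Dardel is Lmod and that its "module is-avail" subcommand is available there. The function name load_pdc_stack_for is hypothetical and is not a command provided by PDC.

```shell
# Sketch only: load_pdc_stack_for is a hypothetical helper, not a PDC command.
# Assumption: the module system is Lmod and supports "module is-avail".

load_pdc_stack_for() {
    local package="$1"
    local stack
    # Try the recommended stack first, then the older one.
    for stack in PDC/23.03 PDC/22.06; do
        # Load the stack so that its modules become visible.
        module load "$stack" || continue
        # Ask whether the package can be loaded from this stack.
        if module is-avail "$package"; then
            echo "$stack"
            return 0
        fi
    done
    echo "No PDC stack provides $package; try: module spider $package" >&2
    return 1
}
```

If a package is found, the stack that provides it stays loaded and its name is printed. If neither stack provides it, the function returns a non-zero status and suggests module spider.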

To use the GPU nodes

  • Log in as above.
  • The default version of the AMD GPU Stack ROCm is now 5.0.2. Version 5.3.3 has also been installed but is not supported by HPE under Raspberry. Advanced users can try 5.3.3 at their own risk.

To use the updated VASP applications

PDC has updated most of the VASP applications. The updated versions are located in the module PDC/23.03 and can be loaded with the following commands:

ml PDC/23.03            # Load the PDC/23.03 module
module av vasp          # List all (updated) VASP modules in PDC/23.03
ml vasp/6.3.2-vanilla   # Load one of the VASP modules
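
Because older VASP builds can trigger the Lustre bug, it may be worth checking in a job script that the updated stack is really the one in use. The following is a minimal sketch of such a check. It assumes that the module system maintains the LOADEDMODULES environment variable as a colon-separated list of loaded modules. The function name using_updated_vasp is hypothetical and is not provided by PDC.

```shell
# Sketch only: using_updated_vasp is a hypothetical helper, not a PDC command.
# Assumption: LOADEDMODULES holds the loaded modules as a colon-separated
# list, for example "PDC/23.03:vasp/6.3.2-vanilla".

using_updated_vasp() {
    # Require the PDC/23.03 stack to be loaded.
    case ":${LOADEDMODULES:-}:" in
        *:PDC/23.03:*) ;;
        *) return 1 ;;
    esac
    # Require some VASP module to be loaded as well.
    case ":${LOADEDMODULES:-}:" in
        *:vasp/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A job script could call using_updated_vasp after the ml commands above and stop early if it returns a non-zero status. The check only looks at module names, so it cannot confirm that a particular VASP build contains the fix.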