
Nov 13, 2017

IANVS Maintenance 2017-11-15 13:00

Posted by Patrice Peterson

Dear IANVS users,

IANVS will go down for maintenance on Nov 15th, 2017, at 1:00 P.M. Barring unforeseen circumstances, the cluster will be back online by noon, Nov 16th.

We plan to implement the following changes:

/scratch/user Cleanup

The /scratch/user directory will be cleaned up in preparation for per-research-group folders. As a consequence, data residing directly in /scratch/user (that is, data that is not in any of the per-user subdirectories /scratch/user/<user> and does not belong to a project) will be moved to the read-only subdirectory /scratch/user/cleanup. The cleanup directory will then be deleted on January 8th, 2018. If you have any vital data in /scratch/user/cleanup, we ask you to copy it to your home or scratch directory before that date.
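To check whether any of your files will be affected, and to rescue them afterwards, something along these lines should work on a login node (a sketch; "mydata" is a placeholder for one of your directories):

    # List the top-level entries in the cleanup directory that belong to you
    find /scratch/user/cleanup -maxdepth 1 -user "$USER"

    # Copy a directory you still need back into your personal scratch directory,
    # preserving permissions and timestamps ("mydata" is a placeholder)
    cp -a /scratch/user/cleanup/mydata /scratch/user/"$USER"/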

Hyperthreading

In order to give our users more flexibility regarding application performance, we will enable SMT (simultaneous multithreading, or "Hyperthreading") for all nodes in the "small", "large", and "gpu" partitions. This effectively doubles the number of available CPU threads on those nodes. Please note, however, that running an application on multiple SMT threads may or may not result in a net performance increase: some of the resources are still shared between a core's threads.

We have therefore decided to keep our current policy of assigning a single task per core by default. In order to make use of the additional SMT threads, you will have to explicitly opt in by adding the option "--hint=multithread" or "--ntasks-per-core=2" to your sbatch script. We will document the changes accordingly.
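For illustration, a minimal sbatch script that opts in to SMT could look as follows (a sketch; the partition, node count, and program name are placeholders):

    #!/bin/bash
    #SBATCH --partition=small        # any SMT-enabled partition: small, large, or gpu
    #SBATCH --nodes=1
    #SBATCH --hint=multithread       # opt in: allow tasks on both hardware threads of each core
    # Alternatively: #SBATCH --ntasks-per-core=2

    srun ./my_application            # placeholder program

Without either option, the default of one task per physical core continues to apply.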

SLURM 17.02.10

SLURM, the scheduler, will be upgraded to its most recently released version, with the following changes:

  • Improved performance during SLURM traffic bursts
  • Added a missing resource-limit check when submitting to multiple partitions
  • Fixed a job-step termination failure

Device Restrictions

Currently, jobs are resource-limited on the CPU-core axis and the memory axis: a job is not allowed to use any CPU core outside of the allocation it has been granted by the scheduler, and it is not allowed to use more memory than initially requested. However, for the nodes in the "gpu" partition, it was still possible to get access to other jobs' GPU resources. In the future, jobs will only have access to the GPUs they have requested at submission time.
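In practice, this means that GPUs must be requested explicitly at submission time, for example via SLURM's generic-resource syntax (a sketch; the resource name "gpu" and the count are assumptions based on the usual gres configuration):

    #!/bin/bash
    #SBATCH --partition=gpu
    #SBATCH --gres=gpu:1             # request one GPU; the job will only see this device

    srun ./my_gpu_application        # placeholder program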

Checkpointing

The BLCR checkpointing feature is intended for applications that do not provide their own checkpointing facilities. In theory, this should allow for longer overall job runtimes without impacting the scheduler or interfering with maintenance windows, albeit with caveats, which we will document as soon as the feature goes live.
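As a rough preview, SLURM's generic checkpoint options in this release series look as follows; whether IANVS will expose exactly this interface is not yet final, so treat it as a sketch ("jdoe" and the program name are placeholders):

    #!/bin/bash
    #SBATCH --checkpoint=60                            # create a checkpoint every 60 minutes
    #SBATCH --checkpoint-dir=/scratch/user/jdoe/ckpt   # where checkpoint images are written

    srun ./my_application                              # a program without built-in checkpointing

A checkpoint can also be triggered manually with "scontrol checkpoint create <jobid>".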

Domain Name Change

The IANVS nodes will move from the domain "hpc-service.itz.xd.uni-halle.de" to "ianvs.xd.uni-halle.de". The names "ianvs.itz", "ianvs1.itz", and "ianvs2.itz" will be updated accordingly and will point to the new hostnames.
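If the old names appear in your SSH configuration, updating them might look like this (a sketch; the login name "jdoe" is a placeholder, and we assume "ianvs.xd.uni-halle.de" itself resolves to a login node):

    # ~/.ssh/config
    Host ianvs
        HostName ianvs.xd.uni-halle.de    # was: hpc-service.itz.xd.uni-halle.de
        User jdoe                         # placeholder login name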

Please contact us if you have any questions, either via e-mail or by phone: (0345) 55-21864 or (0345) 55-21861.

Best regards,
ITZ HPC team
