AIBN Welcomes New High-throughput Computing Cluster

7 April 2020

            

UQ’s Research Computing Centre (RCC) has, through some clever repurposing of decommissioned cloud computer parts, created a powerful new computer called Delena, which is housed at the Australian Institute for Bioengineering and Nanotechnology (AIBN).

RCC has also developed a high-throughput computing and workflow system called Nimrod Portal to make it easier for UQ researchers to create and launch large-scale parameter sweep experiments using Nimrod, and perform simple data management tasks.

Because Nimrod can co-schedule across multiple clusters, Delena adds a dedicated resource alongside RCC’s other HPC clusters (Tinaroo, FlashLite, Wiener and Awoonga), all of which Nimrod can access.

“By moving these nodes to St Lucia to form a new computing resource for research, RCC is taking advantage of UQ’s solar farm and green power initiative and reusing infrastructure that would otherwise have been sent to eWaste,” said RCC Chief Technology Officer Jake Carroll.

Delena has been aptly named after a genus of South Pacific huntsman spiders. These spiders are highly unusual for their ‘cluster’ behaviour: they are a social species and even share prey.

The new computer was created by reconfiguring decommissioned nodes from QCIF’s cloud computer, QRIScloud. It has more than 1,000 CPU cores, with a memory-to-core ratio of about 4 GB per core, and is best suited to users who can run a single job inside an 8-core, 32 GB RAM footprint.

Delena will be launched soon, and the Nimrod Portal is already operational. RCC is calling for HPC users to test the portal: you can log in and try it now. Please note that you need an Awoonga account to use it, and that RCC is still working on documentation and fixing a few bugs.

The portal was developed to lower the barrier to entry-level HPC use and large-scale parameter sweep experiments. It will also put less pressure on HPC schedulers than HPC job array tasks do**.

There are two main benefits of using Nimrod over job arrays:

  1. Nimrod can distribute jobs amongst HPC clusters (Tinaroo, FlashLite, Awoonga, Wiener and soon, Delena);
  2. Nimrod has built-in mechanisms to resume an experiment from where it was when a system crashed.
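
To illustrate the second point, here is a minimal Python sketch of the general idea of resuming a set of jobs after a crash. It is not Nimrod's actual mechanism, which the article does not describe. The function `run_resumable`, the JSON checkpoint file and the job names are all assumptions made for illustration.

```python
import json
from pathlib import Path


def run_resumable(jobs, run_job, checkpoint_path):
    """Run each job once, recording results so a rerun skips finished jobs.

    jobs: dict mapping a job name (string) to that job's parameters.
    run_job: function that takes the parameters and returns a result.
    checkpoint_path: hypothetical file where finished results are stored.
    """
    path = Path(checkpoint_path)
    # Load results from a previous (possibly interrupted) run, if any.
    done = json.loads(path.read_text()) if path.exists() else {}
    for job_id, params in jobs.items():
        if job_id in done:
            continue  # finished before the interruption, so skip it
        done[job_id] = run_job(params)
        # Save after every job so at most one job is lost in a crash.
        path.write_text(json.dumps(done))
    return done
```

Calling `run_resumable` a second time with the same checkpoint file returns the stored results without rerunning any job that had already finished.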

Nimrod itself is a specialised parametric modelling system. Parametric computational experiments are becoming increasingly important in science and engineering as a means of exploring the behaviour of complex systems. For example, an engineer may explore the behaviour of a wing by running a computational model of the airfoil multiple times, while varying key parameters such as angle of attack and air speed. The results of these multiple experiments yield a picture of how the wing will behave in different parts of parametric space. The same process can be applied in other experiments that involve parametric modelling. 
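
The wing example can be sketched in a few lines of Python. This is only an illustration of what a parameter sweep is, not how Nimrod expresses one. The function `toy_lift_per_area` is a hypothetical stand-in for a real airfoil model (it uses the textbook thin-airfoil approximation), and the parameter values are made up.

```python
import itertools
import math


def toy_lift_per_area(angle_deg, speed_ms, air_density=1.225):
    """Toy stand-in for a real airfoil model (assumption, not from the article).

    Uses the thin-airfoil approximation Cl = 2 * pi * alpha (alpha in radians)
    multiplied by the dynamic pressure 0.5 * rho * v^2.
    """
    lift_coefficient = 2 * math.pi * math.radians(angle_deg)
    dynamic_pressure = 0.5 * air_density * speed_ms ** 2
    return lift_coefficient * dynamic_pressure


def run_sweep(angles_deg, speeds_ms):
    """Run the model once for every combination of the two parameters."""
    results = {}
    for angle, speed in itertools.product(angles_deg, speeds_ms):
        results[(angle, speed)] = toy_lift_per_area(angle, speed)
    return results


# Illustrative values: 4 angles of attack x 3 air speeds = 12 model runs.
results = run_sweep([0, 2, 4, 6], [20, 40, 60])
```

Each entry in `results` is one point in the parametric space. In a real experiment every combination would be a separate job, which is why the number of jobs grows so quickly as parameters are added.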

Nimrod provides the machinery to automate the task of formulating, running, monitoring, collating, presenting and visualising the results from multiple individual experiments. 

The Nimrod Portal will help researchers run computations remotely. It can turn your laptop into a supercomputer. With Nimrod you can run many jobs — millions if need be. 

Contact the RCC Support Desk if you experience any issues with the Nimrod Portal: rcc-support@uq.edu.au.

https://nimrod.rcc.uq.edu.au/
 

** A job array is a collection of similar independent jobs which are submitted together. The advantage of using a job array is that many similar jobs can be submitted using a single job template script, and the jobs will run independently as they are able to obtain resources on the compute cluster.

Using a job array can be advantageous for calculation throughput, especially for small independent calculations that may be able to run "in-between" larger calculations. Job arrays are most easily used if input and output files for independent calculations can be numbered in a sequential fashion. (Information from the Minnesota Supercomputing Institute.)
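
As a rough sketch of the sequential numbering described above, the Python below shows how one task in a job array might work out which files belong to it. The scheduler sets an index for each task. The variable names `PBS_ARRAY_INDEX` (PBS Professional) and `SLURM_ARRAY_TASK_ID` (Slurm) are given as examples, and the file naming pattern is an assumption.

```python
import os


def array_index(environ=os.environ):
    """Return this task's index within the job array.

    Checks the variables set by PBS Professional and Slurm. Other
    schedulers use different names.
    """
    for name in ("PBS_ARRAY_INDEX", "SLURM_ARRAY_TASK_ID"):
        if name in environ:
            return int(environ[name])
    raise RuntimeError("not running inside a job array task")


def files_for_task(index, in_prefix="input", out_prefix="output"):
    """Map a task index to sequentially numbered input and output files.

    The naming pattern (for example input_0007.dat) is hypothetical.
    """
    return f"{in_prefix}_{index:04d}.dat", f"{out_prefix}_{index:04d}.dat"
```

Every task runs the same script, calls `files_for_task(array_index())` and then processes only its own pair of files.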
