Education, HPC, and the Cloud

The final part of our series explores the lessons learned from building the nUCLeus cloud cluster environment for HPC education, and what that cluster is up to today.

Alces Flight
Mar 3, 2021

Reflections on an education cluster environment built on public cloud.

The CompBioMed Center of Excellence focuses on turning the science fiction of personalised medicine into the science fact of everyday clinical use. With partnerships ranging across a host of institutions, access to some of the most powerful supercomputers in the UK/EMEA, and a field growing rapidly into High Performance Computing, the team turned to public cloud to consolidate and distribute HPC knowledge to their user base.

Starting with establishing broad education goals and then using those goals to influence their cluster build, the CompBioMed team spent eight weeks training incoming medical researchers on the foundations of supercomputing. Spread across two institutions (UCL and the University of Sheffield) as well as the offices and homes of supporting faculty and students, the team diligently worked to ensure students were engaged and had the support they needed to succeed.

The first cohort of nUCLeus was a great success thanks to the people who founded and ran the project as well as the technologies that underpin it. Here’s everything we learned, including what you can read and watch now, on how nUCLeus came together for CompBioMed.

Three lessons and the outcomes.

Even after five years of experience working with public cloud, the team at Alces Flight always finds new lessons and revises what we’ve learned before. nUCLeus was no different. Our top three takeaways from our time working on the cluster environment launch centred on data management, service management, and how focused work can pay off in the long run.

  1. Prioritise the data. Before commencing any public cloud project there are three things you need to focus on with data: know where it is, know how clean it is, and establish the lifecycle it (and its output) should follow. Creating a system around your data will not only make it easier and more cost-effective for you to run your project, it will also help you plan what will happen with the data once the project comes to an end.
  2. Manage the project. The CompBioMed team chose to work with Alces Flight in order to overlay services and tools that would cut down the time and cost of building and managing a cloud cluster environment for HPC education. From centralising the team's knowledge and strengths in Alces Flight Center to utilising our toolsets for testing and production, the CompBioMed team now have a system for adding coursework and can focus on their goal of advancing research into personalised medicine.
  3. Set the project lifecycle in motion. The initial project effort in establishing the rules around how education should work on the cloud, and the technology and processes that surround it, was no minor undertaking. However, all this work was done with the goal of being able to repeat and scale education initiatives across a large and continuously growing consortium. By laying a solid foundation, the CompBioMed team are now on their way to building up a knowledge base that can benefit all their members and take their work into the exascale age.

What is nUCLeus doing right now?

Currently, nUCLeus is in its second cohort, training new users in HPC and refining models around compute usage and storage. Thanks to the groundwork already done, launching and running this course took half the time of the initial effort (2 months instead of 4), with benefits ranging from reduced cost and time to more room to plan for the future. With exascale on the horizon, the education and training team at CompBioMed can now focus on how their cloud cluster environment can help onboard new users and research aimed at the changes in HPC to come.

Can I get more details on how nUCLeus works?

We are extremely proud of the hard work the CompBioMed team, as well as supporting faculty and staff from UCL and the University of Sheffield, put into making this cluster environment a success. Their work not only resulted in the publications and videos below, it has also allowed seven first-time student authors to be elevated in the field.

If you want to talk to the Alces Flight team about how you can utilise public cloud in HPC, get in touch. We appreciate you taking the time to read our series, and if you need to look back at the previous work, here are the links to Part 1 and Part 2. We look forward to bringing you more in 2021!

nUCLeus Publication links:
