BioTeam has been using and deploying ParallelCluster since the early “CfnCluster” days. In this webinar, Chris Dagdigian, BioTeam’s Co-Founder and Senior Technical Director of Infrastructure, candidly discussed how BioTeam uses AWS ParallelCluster to help our clients bring flexible, auto-scaling traditional HPC capabilities into the cloud. Particular attention was given to actionable tips and advice, including our most common post-install ParallelCluster customizations.
Cloud evangelists tend to pretend that legacy use cases do not exist and that all cloud usage involves shiny new cloud-native design patterns built from scratch. This misconception is particularly damaging in the life sciences, where we have hundreds if not thousands of scripts, tools, applications, and workflows that will never be rewritten or redesigned for object-native data stores or serverless design patterns.
The data-intensive life sciences informatics community has grown up around large-scale HPC environments running batch schedulers like Slurm and POSIX-based “files and folders” shared storage. Our user base (senior researchers and early-career recruits alike) has expertise in, and often significant investment in, pipeline and workflow automation developed for HPC platforms and batch schedulers.
Organizations extending, migrating, or expanding into the cloud need to bring “traditional HPC” capabilities with them, if only to sustain a long transition period to “what comes next”. Fortunately, AWS has a fantastic solution for this: AWS ParallelCluster, a free, open-source framework that combines “traditional HPC” capabilities with cloud features such as rapid hardware reprovisioning and cluster auto-scaling.
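To make this concrete, the sketch below shows the general shape of a ParallelCluster v3 cluster configuration: a Slurm-scheduled cluster with a head node and a compute queue that auto-scales between zero and a fixed maximum number of instances. The instance types, subnet IDs, and key name are illustrative placeholders, not recommendations, and any real deployment would layer site-specific customizations on top.

```yaml
# Illustrative ParallelCluster v3 config sketch (values are placeholders).
# A Slurm scheduler queue with MinCount: 0 scales compute nodes down to
# zero when idle and launches instances on demand as jobs are submitted.
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: t3.medium        # placeholder head-node instance type
  Networking:
    SubnetId: subnet-aaaa1111    # placeholder subnet ID
  Ssh:
    KeyName: my-keypair          # placeholder EC2 key pair name
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      ComputeResources:
        - Name: general
          InstanceType: c5.xlarge  # placeholder compute instance type
          MinCount: 0              # scale to zero when no jobs are queued
          MaxCount: 10             # cap on auto-scaled instances
      Networking:
        SubnetIds:
          - subnet-aaaa1111        # placeholder subnet ID
```

A cluster built from a file like this is created with `pcluster create-cluster`, after which users interact with it exactly as they would an on-premises Slurm cluster (`sbatch`, `squeue`, and so on), which is what makes it a practical landing zone for existing HPC workflows.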
Learn more about Dagdigian’s next webinar here.