What do AI research projects involving tailoring treatment for Alzheimer’s patients, studying birdsong to understand how brain functions relate to motor activity and detecting emotions from social media feeds have in common? They all rely on vast amounts of data being efficiently transferred and processed with powerful resources.
Anticipating a need for advanced computing infrastructure to support a growing research community, the AI.Humanity Infrastructure Advisory Committee was convened in spring 2023 to guide the Office of the Provost and the Office of Information Technology on appropriate IT infrastructure and other types of useful resources.
The interdisciplinary committee, led by co-chairs Lanny Liebeskind, senior vice provost for academic affairs, and John Ellis, interim enterprise chief information officer and senior vice provost for information technology, has endorsed a comprehensive plan to increase scientific computing capabilities to accommodate the influx of data-intensive projects related to AI research. This includes implementing a novel high-performance cloud computing cluster, developing a high-speed data transfer and campus research network and exploring on-premises high-performance computing at the Emory Data Center.
“Our immediate focus was to support the scholarship of those who work with large amounts of data and require high-performance computation power. We needed a solution that offered researchers, educators and students a user-friendly, no-cost access point to shared high-performance computing resources that was easily scalable and built on Emory’s existing computing structure and partnerships,” said Ellis.
New high-performance computing options in the cloud
Marking a shift from a history of disparate on-premises infrastructure developments toward an expansion of Emory’s cloud strategy and partnerships, the Hybrid High-Performance Computing Platform for Education and Research (HyPER) Community Cloud Cluster (C3) was introduced in fall 2023.
This cloud-based cluster provides shared, centrally managed access to high-performance computing resources housed in AWS at Emory, a proven, customized cloud environment that has served roughly 1,000 users since 2019 and integrates with Emory’s security, technology and financial infrastructure.
HyPER C3 provides an expandable selection of general-purpose central processing units (CPUs), high-memory nodes and graphics processing units (GPUs), including advanced NVIDIA A100 and V100 models. Notably, HyPER C3 replicates the experience computational scientists already know from other compute environments: users work through the popular SLURM cluster management interface, which comes with built-in fair-share and cost-control guardrails. Users do not need to administer the cluster equipment or learn cloud engineering skills to run their data processing on HyPER C3.
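For readers who have not used SLURM, the short Python sketch below illustrates what requesting resources through such an interface typically looks like: it assembles a batch script from standard `sbatch` directives. The function name, the job name, the `train.py` command and all default values are hypothetical, and nothing here reflects HyPER C3's actual partitions, limits or configuration.

```python
from typing import Optional


def build_sbatch_script(
    job_name: str,
    command: str,
    partition: Optional[str] = None,  # hypothetical: partition names are site-specific
    gpus: int = 0,
    cpus_per_task: int = 1,
    mem_gb: int = 8,
    time_limit: str = "01:00:00",
) -> str:
    """Return the text of a SLURM batch script for a single-node job."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    # Only name a partition when the caller supplies one.
    if partition:
        lines.append(f"#SBATCH --partition={partition}")
    # Request GPUs through the generic-resource directive.
    if gpus > 0:
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    # Blank line, then the command to run, then a trailing newline.
    lines += ["", command, ""]
    return "\n".join(lines)


# Illustrative request: one GPU, four CPU cores, 32 GB of memory, four hours.
script = build_sbatch_script(
    "train-model",
    "python train.py",
    gpus=1,
    cpus_per_task=4,
    mem_gb=32,
    time_limit="04:00:00",
)
print(script)
```

Saved to a file such as `job.sh`, a script like this would normally be submitted with `sbatch job.sh`, after which the scheduler queues the job and applies whatever fair-share policy the cluster defines.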
“HyPER C3 offers a HIPAA-compliant, scalable, shared infrastructure for research computing to boost Emory as a research-intensive institution. It’s already a familiar interface to researchers and the cloud provides the flexibility of accessing the latest GPUs and right-sizing the cluster based on true utilization needs,” said Ellis.
Ahead of its launch, several research groups from across Emory participated in a test phase for HyPER C3. Overall, the early adopters gave positive feedback about the new cloud cluster and were able to advance impactful research that had stalled due to the limitations of on-premises infrastructure.
Judy Gichoya, associate professor in the Department of Radiology and Imaging Sciences at Emory School of Medicine, was one of the researchers who tested HyPER C3. Her work involved enhancing laryngoscopy images produced by low-quality medical equipment. The AI model was trained with 21,000 individual images. Before the cloud solution, this model would have required several on-premises GPUs and still been extremely slow.
“Before the pilot, we attempted to scale the super resolution training on our on-premises servers with 16 GPUs and 48 GB of RAM; however, the training continued to be slow. With HyPER C3, we were able to train the models successfully across multiple architecture types. This project is serving as a foundation for edge deployment of mobile laryngoscopy in limited resource settings and has informed us on how we can scale our work in radiology AI,” said Gichoya.
Groups across Emory have already started onboarding onto HyPER C3 for computational AI projects spanning a wide range of scientific inquiries, from developmental disorders, cardiovascular diseases, cancer and metagenomics to digital history, social media economics, algorithmic bias and fairness, conversational AI and more. The roadmap includes faster data transfer connections, special-purpose computing cluster extensions, ongoing user training and enhanced user interfaces to benefit students and faculty who are less familiar with scientific computing environments.
Coming soon: high-speed data highways and other on-premises improvements
In addition to HyPER C3, the Infrastructure Advisory Committee has consulted on an extension of Emory’s secure academic network that will support data-intensive research through very fast intra-campus communications and fast data transfers among Emory’s campus, private cloud and external collaborating institutions. For example, the transfer of voluminous laboratory instrumentation data to a server in Emory’s Data Center or to a cloud storage area will no longer compete with Emory’s enterprise network traffic. Following a pilot phase, this infrastructure, termed the Hybrid Scientific Cyberinfrastructure (HySCi), will be available in 2024 to support AI research.
For on-site computing resources, consultants recently completed an assessment of the university’s Data Center to determine options for expanding on-premises high-performance computing (HPC) capacity. These on-site resources, in conjunction with cloud-based HPC clusters, would be shared by researchers across campus. Emory is also deploying an HPC service team to support faculty and students with these new resources.
“The investment Emory is making toward high-performance computing through cloud-based resources, high-speed data transfer networks and on-premises computing capacity underscores the overall importance of the AI.Humanity initiative to Emory’s mission. Lowering computing infrastructure barriers and supporting researchers with the tools they need will result in even more impactful work in the service to humanity,” said Liebeskind.
To learn more about computing infrastructure at Emory, visit the AI.Humanity website.
HyPER C3 is now available to faculty working with AI. To request access, visit the HyPER C3 website.
AI.Humanity Infrastructure Advisory Committee
AI infrastructure planning is led by a committee with multi-disciplinary faculty and IT staff representation.
Committee Co-Chairs
Lanny Liebeskind
John Ellis
Committee Members
David Cutler
Tony Pan
Xiao Hu
Yana Bromberg
Zhaohui Qin
Dieter Jaeger
Monica Crubezy
Jo Guldi
Vaidy Sunderam
Judy Gichoya
Ramnath Chellappa
Rakesh Shiradkar
Yan Sun
Joe Sutherland
Tara Bartelt (Project Manager)
Kaye-Ann Sadler (Program Director)