Research Computing Infrastructure

Over the past twenty-five years, the Department of Biostatistics and Medical Informatics has developed a centralized, state-of-the-art computing facility that supports statistical and medical informatics research and the management and analysis of clinical, genomic, and other biological data.

The computational resources in the Department of Biostatistics and Medical Informatics include a network of more than 200 multi-core, 64-bit Linux servers totaling more than 1,750 cores. These nodes are optimized for large-scale jobs with a high memory (RAM) footprint, and total cluster RAM exceeds 10.8 TB. More than 10% of the cluster consists of high-memory machines (more than 128 GB of RAM each), including 4 servers with 512 GB of RAM. The facility currently houses more than 500 terabytes of enterprise-grade networked storage configured in a redundant, continuously backed-up, and remotely replicated setup.

Most of these machines are made available for compute-bound tasks through HTCondor, a locally developed high-throughput computing system that automatically locates idle cores and transfers jobs to them. Jobs are periodically checkpointed and migrate from machine to machine as needed until they complete. Department members also have access to the campus Center for High Throughput Computing (CHTC), which provides about 1,900 64-bit Linux computational cores with 1.5 to 64 gigabytes of RAM per core. HTCondor provides excellent support for the extensive experimentation that is typical of machine-learning and computationally intensive statistical research.
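
As an illustration of the kind of experimentation HTCondor supports, the sketch below generates a submit description that queues one job per hyperparameter value. It is a minimal example under stated assumptions, not the department's actual configuration: the script name `train.sh`, the `--learning-rate` flag, the `out/` directory, and the resource requests are all hypothetical.

```python
def build_submit_description(executable, learning_rates,
                             request_memory="4GB", request_cpus=1):
    """Return HTCondor submit-description text with one job per learning rate.

    All names and values here are illustrative assumptions; adjust them
    to the actual program and the resources it needs.
    """
    lines = [
        f"executable = {executable}",
        # $(lr) is replaced by each item listed in the queue statement.
        "arguments = --learning-rate $(lr)",
        f"request_cpus = {request_cpus}",
        f"request_memory = {request_memory}",
        # The out/ directory is assumed to exist before submission.
        "output = out/run_$(lr).out",
        "error = out/run_$(lr).err",
        "log = sweep.log",
        "queue lr from (",
    ]
    lines += [f"    {lr}" for lr in learning_rates]
    lines.append(")")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit_description("train.sh", [0.1, 0.01, 0.001]))
```

Saving the generated text to a file (for example `sweep.sub`) and passing it to `condor_submit` would place the jobs in the queue, after which the scheduler matches each job to an idle core.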

A full complement of up-to-date software packages and tools is available and well supported.

The data centers are fully temperature-controlled and power-conditioned and are connected to the Hospital’s emergency power. Temperature, server functionality, and security are monitored around the clock by automated systems that notify an on-call staff member if problems arise. Secure remote access via a dedicated out-of-band network generally makes remote repairs possible during off-hours. All supported computers are connected to the Medical School’s network and are behind its firewall.

Highly experienced, full-time IT staff in the Department of Biostatistics and Medical Informatics provide and support services such as network access, file backup and recovery, software installation and support, and maintenance of shared printers.

The research computing infrastructure of the Department of Biostatistics and Medical Informatics will be the primary, but by no means the only, computing resource dedicated to the needs of the department and supported groups.