We are a computational biology lab that develops novel methods for analyzing DNA and RNA sequences, and that analyzes genomes to make biomedical discoveries. Our research includes software for aligning and assembling genomes, gene and transcriptome analysis, and microbiome analysis. We work closely with biomedical scientists to apply our methods to current problems arising in a broad spectrum of biological and medical research areas. We’re part of the Center for Computational Biology, a group of 20+ faculty members and their labs at Hopkins working on computational, statistical, and mathematical methods that can turn massive genomic data sets into biologically and clinically useful information.
Research Project Areas. Our lab currently works in four related but distinct areas:
- Computational gene finding and genome annotation. We have been working for many years on methods to identify genes, ranging from methods for bacterial and human gene finding to the development of a new human gene database, called CHESS. You can read more about the CHESS project here. We also use AlphaFold2 and ColabFold to identify functional gene variants. We are also part of the T2T project that sequenced and published the first complete human genome, and we are currently working with the consortium to create comprehensive annotations of multiple human genomes.
- Transcriptome (RNA sequencing) analysis. Over the past 15 years, members of the lab along with our collaborators have developed multiple programs for RNA-seq analysis that have been adopted around the world. These include the Bowtie, TopHat, and Cufflinks programs, and more recently the HISAT and StringTie programs, with over 100,000 citations collectively. Together these programs align and assemble RNA sequencing data to reconstruct a detailed picture of all the genes and gene variants that are expressed in a tissue sample. The Bowtie project is led by CS Prof. Ben Langmead and the StringTie project is led by BME Prof. Mihaela Pertea.
- Metagenomics and microbiome analysis. We have developed a variety of tools to analyze metagenomics data sets, including the widely used Kraken, KrakenUniq, and Centrifuge systems. We have a special focus on using metagenomic sequencing to diagnose infections in human patients. Here’s a paper that describes one of our early efforts to use direct DNA sequencing to diagnose brain infections. We have also focused our efforts on re-evaluating recent claims of a cancer microbiome, some of which have turned out to be incorrect.
- Genome assembly. We have developed genome assembly tools that use the latest generation of sequencing technologies, pushing the technology to take on ever-larger and more complex genomes, such as our recent projects assembling the genomes of the redwood and sequoia trees. We apply these methods in collaborations with biologists to sequence the genomes of species ranging from bacteria to plants and animals. See our Genome Projects page for a partial list of the many genomes we have assembled and published over the years.
For a broader look at our software, see our software page or the CCB website.
Interested in joining the lab? The lab will be admitting students for the fall semester in 2026. For information about applying as a graduate student, please see this page. We do not admit students directly; we admit them through the Biomedical Engineering, Computer Science, and Biology Ph.D. programs, and you can apply to any of those. Be sure to mention your interest in this lab on your application. If you’re interested in a postdoctoral position, please write to Prof. Salzberg directly.
Looking for a summer intern position? We take a small number of undergraduate summer interns each year through the BDP HOUR program. To learn more, visit the HOUR website. (Note: the HOUR website will start taking applications in early 2026.)
The Salzberg lab is supported in part by the NIH under grants R35-GM130151, R01-HG006677, and R01-MH123567, and in the past by multiple NSF grants.