My Software Research

Hello and welcome to my academic website. Here you can find information about my research, publications, software development, and other news. I (hope to) regularly write posts on topics related to ecology, evolution and bioinformatics with the goal of documenting fun stories and useful Python and R code. I’m currently a post-doc in the Donoghue Lab at Yale University. Before starting here I […]


pyRAD: Software to assemble de novo RADseq loci from restriction-site associated sequence data (RAD, ddRAD, GBS, PEddRAD, PEGBS). Alignment clustering allows for indel variation to better align highly divergent samples, merge and trim methods can be employed to rescue overlapping paired-end data, and reverse complement clustering is available to improve GBS assemblies of short overlapping contigs.

simrrls: Software to simulate fastQ […]

Simulating raw RADseq data

The program simrrls can be used to simulate raw (fastq format) sequence data on an input topology under the coalescent in a manner that emulates restriction-site associated DNA, with slight variations for different data types (e.g., RADseq, ddRAD, GBS, paired-end data). simrrls was originally developed to generate data for bug testing pyrad and to create example RADseq […]
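To give a rough sense of what "raw (fastq format)" RAD-like data looks like, here is a minimal Python sketch that writes a few reads for a single locus. This is not how simrrls works internally: it skips the coalescent simulation entirely, and the function name `simulate_reads`, the placeholder overhang `TGCAG`, the flat quality string, and the error rate are all assumptions made for illustration.

```python
import random


def simulate_reads(locus, cut_overhang="TGCAG", n_reads=3, error_rate=0.001, seed=0):
    """Return fastq-formatted records for one locus (illustrative only).

    Each read starts with a restriction-site overhang (a placeholder here),
    followed by the locus sequence with random substitution errors.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n_reads):
        bases = [
            # with probability error_rate, swap in a different base
            rng.choice([b for b in "ACGT" if b != base])
            if rng.random() < error_rate
            else base
            for base in locus
        ]
        seq = cut_overhang + "".join(bases)
        qual = "I" * len(seq)  # flat, high quality scores (an assumption)
        records.append(f"@read_{i}\n{seq}\n+\n{qual}\n")
    return records


if __name__ == "__main__":
    for record in simulate_reads("ACGTACGTAC", n_reads=2):
        print(record, end="")
```

Each record has the four fastq lines: a header, the sequence, a `+` separator, and a quality string of the same length as the sequence.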


Description

The benefit of pyRAD over most alternative methods for analyzing RADseq-like data comes in its use of an alignment-clustering method (vsearch) that allows for the inclusion of indel variation, which improves identification of homology across highly divergent samples. For this reason pyRAD is commonly employed for RADseq studies at deeper phylogenetic scales; however, it works equally well at […]
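A small Python sketch can show why allowing indels matters when judging whether two reads are homologous. It compares a position-by-position identity score against a gap-tolerant similarity from the standard library's `difflib.SequenceMatcher`. This is only a stand-in for illustration: it is not vsearch and not pyRAD's clustering step, and the function names and example sequences are made up.

```python
from difflib import SequenceMatcher


def positional_identity(a, b):
    """Fraction of matching bases when sequences are compared column by column,
    with no gaps allowed (length differences count as mismatches)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def gapped_similarity(a, b):
    """Gap-tolerant similarity based on matching blocks (a rough proxy for
    an alignment-based identity, not a true pairwise alignment score)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    seq1 = "ACGTTGCAATGCCGTA"
    seq2 = "ACTTGCAATGCCGTA"  # seq1 with a single base deleted
    print("positional:", round(positional_identity(seq1, seq2), 3))
    print("gapped:    ", round(gapped_similarity(seq1, seq2), 3))
```

A single-base deletion shifts every downstream position, so the ungapped score drops sharply even though the two sequences are nearly identical, while the gap-tolerant score stays high. The same effect is what makes indel-aware clustering useful for divergent samples.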