Genotype Phenotype Association Toolkit  

The application of population genomics to non-model organisms is greatly facilitated by the low cost of next-generation sequencing (NGS). Barriers, however, exist for using NGS data for population-level analyses. Traditional population genetic metrics, such as Fst, are not robust to the genotyping errors inherent in noisy NGS data. Additionally, many older software tools were never designed to handle the volume of data produced by NGS pipelines. To overcome these limitations, we have developed a flexible software library designed specifically for large and noisy NGS datasets. The Genotype Phenotype Association Toolkit (GPAT) implements both traditional and novel population genetic methods in a single user-friendly framework. GPAT consists of a suite of compiled tools and a Perl API that programmers can use to develop new applications. To date, GPAT has been used to identify genotype-phenotype associations in several real-world datasets, including domestic pigeons, pox virus, and pine rust fungus. GPAT is open source and freely available for academic use.
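To make the point about traditional metrics concrete, here is a minimal sketch of Wright's Fst at a single biallelic site, computed from point estimates of allele frequency in two equally weighted populations. It is a textbook illustration only, not GPAT's or GPA++'s implementation. The function names `expected_het` and `wright_fst` are hypothetical and do not come from the toolkit.

```cpp
#include <cassert>
#include <cmath>

// Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p.
double expected_het(double p) {
    assert(p >= 0.0 && p <= 1.0);
    return 2.0 * p * (1.0 - p);
}

// Wright's Fst for two equally weighted populations: (Ht - Hs) / Ht.
// Hypothetical helper for illustration; not part of the GPAT or GPA++ API.
double wright_fst(double p1, double p2) {
    const double p_bar = 0.5 * (p1 + p2);                           // pooled frequency
    const double h_t = expected_het(p_bar);                         // total heterozygosity
    const double h_s = 0.5 * (expected_het(p1) + expected_het(p2)); // mean within-population
    if (h_t == 0.0) return 0.0; // monomorphic site: Fst is undefined, report 0 here
    return (h_t - h_s) / h_t;
}
```

Because the estimate depends only on the called frequencies, genotyping errors pass straight through to the result. If both populations truly carry the allele at frequency 0.10, `wright_fst(0.10, 0.10)` is 0. If miscalls inflate one estimate to 0.20, `wright_fst(0.10, 0.20)` is about 0.02, even though no real differentiation exists.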

GPA++ is a C++ extension of the Genotype Phenotype Association Toolkit. The Perl implementation of GPAT has more features than GPA++ but is slower.


Shapiro MD, Kronenberg Z, Li C, Domyan ET, Pan H, Campbell M, Tan H, Huff CD, Hu H, Vickrey AI, Nielsen SCA, Stringham SA, Hu H, Willerslev E, Gilbert MTP, Yandell M, Zhang G, Wang J.
Science. 2013 Mar 1;339(6123):1063-7


The University of Utah freely licenses GPAT for academic research use. For commercial use please contact Mark Yandell.

Community Resources