Methodological Research

Methods for genetic association testing (Project leader: CY Chiu, S Sen; contributors: H Kim; funding: NIGMS)
Testing for genotype-phenotype associations in large-scale genetic studies is a key step toward discovering the biological mechanisms underlying disease. Drs. Chiu and Sen are taking complementary statistical approaches in human and non-human populations, respectively. Dr. Chiu is investigating the use of functional data analysis (FDA) techniques to model the effects of multiple genetic variants in gene-level association testing. This compresses the information from a large number of genetic variants while making full use of linkage disequilibrium (LD) and the physical positions of the variants. Dr. Sen and his group are developing algorithms to extend linear mixed models (LMMs) to multivariate phenotype-genotype association testing. Their approach enables genetic researchers to consider both phenotypic and genotypic correlations when testing for genetic association. Linear mixed model code: https://bitbucket.org/linen/FaSTLMM
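For readers unfamiliar with the setup, a multivariate LMM is commonly written as below. This is a generic textbook formulation, not necessarily the exact model used in this project, and every symbol is introduced here for illustration only: Y is an n x p matrix of p phenotypes on n individuals, X is an n x q matrix of covariates (including the genotype being tested), B is a q x p matrix of fixed effects, K is an n x n kinship (genetic relatedness) matrix, and V_g and V_e are p x p genetic and residual covariance matrices.

```latex
\[
  \operatorname{vec}(Y) \sim
  \mathcal{N}\!\bigl( (I_p \otimes X)\,\operatorname{vec}(B),\;
                      V_g \otimes K + V_e \otimes I_n \bigr)
\]
```

In this notation, K carries the genotypic correlation among individuals, V_g and V_e carry the correlation among phenotypes, and a test of association asks whether the row of B corresponding to the genetic variant is zero.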

Statistical computing for large omic datasets (Project leader: S Sen; contributors: X Hu, H Kim, G Farage; funding: NIGMS, NIDA)
Statistical analysis of large omic datasets (such as those obtained from microarrays, mRNA sequencing, and mass spectrometry to study the transcriptome, proteome, microbiome, metabolome, and other "omes") presents computational challenges. A major issue is that algorithms are typically prototyped in a high-level language such as R or Python, but some elements then have to be recoded in a low-level language such as C/C++ for speed. Our approach is to use the Julia programming language for prototype development, focusing on estimation of multivariate linear mixed models for large-scale and high-dimensional data. In our initial work, Julia has shown speed comparable to C++. We are also using Julia's interface to GPU (graphics processing unit) computing, because many computations on high-dimensional datasets can be sped up by exploiting parallelism well suited to GPUs.
The web interface to these computational modules will be put on Gene Network [https://gn2.genenetwork.org].

Tools for enhancing statistical collaborations (Project leader: F Thomas; contributor: T Hayes)
This project develops tools that allow for better statistical analyses by shifting labor (time) from the data reading and processing phase to the phase of statistical modeling and conclusions.  The developed tools also lead to better science by facilitating reproducibility of the statistical results, because the source code for the documents contains all computational steps in the same sequence as actually executed in the analysis.  The project harnesses recent developments that allow creation of dynamic documents that weave text and statistical computations.  Project software is available at https://github.com/FrThomas/risyphus.

Statistical analysis of activity data (Faculty: Z Bursac, M Kocak, R Krukowski, S Sen, F Thomas; contributor: G Farage; funding: NIDDK)
Accelerometer-based wearable devices are widely used for assessing physical activity. Recorded accelerometer data provide information about the intensity, frequency, and duration of physical activity. We are using physical activity data collected from wearable trackers worn by women in the military during pregnancy and the postpartum period. Our research is aimed at developing methodology to characterize activity patterns and to determine correlations between physical activity and weight gain or weight loss during and after pregnancy. More generally, we will develop a framework for associating activity patterns with health outcomes, accompanied by a software implementation.
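To make "intensity, frequency and duration" concrete, the following minimal Python sketch summarizes a series of per-minute activity counts into bouts of activity. It is an illustration only and not the project's methodology: the function names (`find_bouts`, `summarize`), the per-minute count format, and the threshold of 2000 counts are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Bout:
    start: int         # index of the first minute in the bout
    duration: int      # bout length in minutes
    mean_count: float  # mean activity count over the bout (intensity)


def find_bouts(counts: Sequence[float], threshold: float,
               min_minutes: int = 1) -> List[Bout]:
    """Find runs of consecutive minutes with counts >= threshold.

    `threshold` and `min_minutes` are hypothetical tuning parameters.
    """
    counts = list(counts)
    bouts: List[Bout] = []
    start = None
    # A sentinel below any threshold closes a bout that runs to the end.
    for i, c in enumerate(counts + [float("-inf")]):
        if c >= threshold:
            if start is None:
                start = i
        elif start is not None:
            length = i - start
            if length >= min_minutes:
                segment = counts[start:i]
                bouts.append(Bout(start, length, sum(segment) / length))
            start = None
    return bouts


def summarize(counts: Sequence[float], threshold: float,
              min_minutes: int = 1) -> Dict[str, float]:
    """Summarize frequency, total duration, and mean intensity of bouts."""
    bouts = find_bouts(counts, threshold, min_minutes)
    total = sum(b.duration for b in bouts)
    # Duration-weighted mean intensity across all bouts.
    mean_intensity = (
        sum(b.mean_count * b.duration for b in bouts) / total if total else 0.0
    )
    return {
        "frequency": len(bouts),
        "total_minutes": total,
        "mean_intensity": mean_intensity,
    }


if __name__ == "__main__":
    # Made-up per-minute counts, for illustration only.
    example = [100, 2500, 2600, 300, 2100, 2200, 2300, 50]
    print(summarize(example, threshold=2000, min_minutes=2))
```

Summaries of this kind, computed per person and per day, are one simple way activity patterns could be turned into variables for association with outcomes such as weight change; richer characterizations would keep more of the time-series structure.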

Last Published: Jan 24, 2019