Spring 2017 Louisiana ASA Meeting - Print Version

Schedule

Time Title and Speaker
9:30-10:00 Reception
10:00-11:00 Keynote Address
Distributions Associated with Simultaneous Multiple Hypothesis Testing
Daniel Zelterman
Division of Biostatistics
Yale University
New Haven, Connecticut
11:00-11:45 Multiple Comparison Issues in Analysis of Glyphosate Bioassay Data
Kenny Crump
Ruston, Louisiana
11:45-1:15 Lunch
1:15-2:00 An Improved Meta-analysis for Analyzing Cylindrical-type Time Series Data with Applications to Forecasting Problem in Environmental Study
Sungsu Kim
University of Louisiana at Lafayette
Lafayette, Louisiana
2:00-2:20 Estimation of Crowd Size in Lafayette, Louisiana at King's Parade during Mardi Gras 2017
Marina Ledet, Ngan Hoang Nguyen Thuy, Phontita Thiuthad, and Suntaree Unhapipat (advisor Nabendu Pal)
University of Louisiana at Lafayette
Lafayette, Louisiana
2:20-2:40 A Bayesian Sequential Design with Adaptive Randomization for Two-sided Hypothesis Tests
Lin Zhu, Han Zhu (advisor Qingzhao Yu)
School of Public Health
Louisiana State University Health Science Center
New Orleans, Louisiana
2:40-3:00 Break
3:00-3:20 Use of a novel statistical method for identifying interactions of SNP pairs associated with prostate cancer aggressiveness in African Americans
Heng-Yuan Tung (advisor Hui-Yi Lin)
School of Public Health
Louisiana State University Health Science Center
New Orleans, Louisiana
3:20-3:40 Online News Popularity: Trend Analysis
Krunal Khatri
Louisiana State University
Baton Rouge, Louisiana
3:40-4:00 Using Inverse Prediction to Determine if Lab Grown Maggots Reflect Wild Maggots' Growth Curves
Christie Watters (advisor Lynn LaMotte)
School of Public Health
Louisiana State University Health Science Center
New Orleans, Louisiana
4:00-4:20 The monthly conservice bill at Campus Crossings Apartment, Lafayette, Louisiana, USA
Daniella Tran, Suntaree Unhapipat (advisor Sungsu Kim)
University of Louisiana at Lafayette
Lafayette, Louisiana

Abstracts

Keynote Address

Distributions Associated with Simultaneous Multiple Hypothesis Testing
Daniel Zelterman
Division of Biostatistics
Yale University
New Haven, Connecticut

We develop a distribution to describe the number of hypotheses found to be statistically significant using the rule from Benjamini and Hochberg (1995) for controlling the false discovery rate (FDR). This distribution has both a small sample form and an asymptotic expression for testing many independent hypotheses simultaneously. We describe a distribution to approximate the marginal distribution of p-values under the alternative hypothesis.  This distribution is useful when there are many different alternative hypotheses and these are not individually well understood. We use the approximating distribution to estimate the fraction of p-values sampled under the null and multiple alternative hypotheses in two numerical examples.
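
For readers unfamiliar with the Benjamini and Hochberg (1995) rule mentioned in the abstract, the following is a minimal sketch of the step-up procedure and of the count of significant hypotheses it yields. It is not the speaker's code; the function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Count the hypotheses rejected by the BH step-up rule at FDR level q.

    The rule sorts the m p-values, finds the largest rank k such that
    p_(k) <= k * q / m, and rejects the k smallest p-values.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    rejected = 0
    for k, p in enumerate(ordered, start=1):
        if p <= k * q / m:
            rejected = k  # keep the largest rank that satisfies the bound
    return rejected


# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
num_rejected = benjamini_hochberg(example_p, q=0.05)
print(num_rejected)
```

The count returned here is the random quantity whose distribution the talk describes.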

Multiple Comparison Issues in Analysis of Glyphosate Bioassay Data
Kenny Crump
Ruston, Louisiana

Glyphosate (e.g., Roundup) is the most widely-used herbicide worldwide. In 2015 the International Agency for Research on Cancer (IARC) concluded that "There is limited evidence in humans for the carcinogenicity of glyphosate. … There is sufficient evidence in experimental animals for the carcinogenicity of glyphosate. … Overall, Glyphosate is probably carcinogenic to humans." Later both the European Food Safety Authority (EFSA) and the Joint Food and Agriculture Organization (FAO)/WHO concluded that glyphosate was unlikely to pose a carcinogenic hazard to humans. EPA released a draft Glyphosate Issue Paper in September 2016 which concluded that glyphosate is "not likely to be carcinogenic to humans" at human doses.
The glyphosate bioassay data are very extensive; they encompass 15 bioassays that each employed a control group and three or four treated groups of males and females of rats or mice. In each sex-specific group statistical tests can be performed on numerous tumor types. With so many statistical tests, some are expected to be significant (e.g., p less than 0.05) even if treatment has no effect on tumors. This talk will discuss this multiple comparison problem. An exact statistical test for a carcinogenic effect at any tumor site that can be used to address this problem will be illustrated by applying it to (non-glyphosate) bioassay data from the National Toxicology Program.
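
A rough sketch of why some significant results are expected by chance: if the tests were independent and every null hypothesis were true, the expected number of false positives and the chance of at least one follow directly from the significance level. The independence assumption and the figure of 100 tests are illustrative assumptions, not numbers from the talk.

```python
def false_positive_summary(num_tests, alpha=0.05):
    """Expected false positives and P(at least one) under independent true nulls."""
    expected = num_tests * alpha
    prob_any = 1.0 - (1.0 - alpha) ** num_tests
    return expected, prob_any


# Hypothetical number of tests, for illustration only.
expected, prob_any = false_positive_summary(num_tests=100, alpha=0.05)
print(f"expected false positives: {expected:.1f}")
print(f"P(at least one false positive): {prob_any:.3f}")
```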

An Improved Meta-analysis for Analyzing Cylindrical-type Time Series Data with Applications to Forecasting Problem in Environmental Study
Sungsu Kim
University of Louisiana at Lafayette
Lafayette, Louisiana

We propose an improved GLS meta-analysis in a linear-circular regression and show its utility in the analysis of an environmental problem. The existing GLS meta-analysis proposed in Becky and Wu (2008) has a serious flaw: it does not use information about the covariance among coefficients across studies. Our proposed meta-analysis improves on it by taking the correlations between adjacent studies into account. We provide numerical examples that compare the proposed method with several existing methods using AIC, BIC, and mean square prediction error, with applications to a forecasting problem in an environmental study.

Student Project Competition Presentations

Estimation of Crowd Size in Lafayette, Louisiana at King's Parade during Mardi Gras 2017
Marina Ledet, Ngan Hoang Nguyen Thuy, Phontita Thiuthad, and Suntaree Unhapipat (advisor Nabendu Pal)
University of Louisiana at Lafayette
Lafayette, Louisiana

Mardi Gras is a major cultural event, not only in South Louisiana but also in many parts of the US Gulf Coast region. The event is characterized by colorful parades witnessed by thousands of spectators, especially children accompanied by adults. In Lafayette, Louisiana, where Mardi Gras is celebrated over several days, one of the biggest attractions is the King's Parade, which attracts a large number of spectators. With public safety in mind for such a large gathering, it is important for local administrators to estimate the crowd size, which may help not only in managing the crowd with proper use of resources but also in understanding the challenges in case of an emergency. Our project provides a comprehensive, step-by-step statistical methodology, from data collection to crowd size estimation. Observations were collected at locations chosen through systematic sampling along the entire route of the parade. Point estimates as well as interval estimates of the crowd size on each side of the route were obtained using the Central Limit Theorem (CLT) as well as the non-parametric bootstrap method (NBM). Interestingly, the CLT and NBM interval estimates came out strikingly close, which perhaps supports the reliability of our results.
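
The two interval estimates named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the students' analysis: the route is assumed to be split into `num_segments` equal segments, `counts` holds made-up spectator counts at the sampled segments, the bootstrap uses a simple percentile interval, and no finite population correction is applied.

```python
import random
import statistics


def clt_interval(counts, num_segments, z=1.96):
    """Normal-approximation interval for the total crowd size."""
    n = len(counts)
    mean = statistics.mean(counts)
    se = statistics.stdev(counts) / n ** 0.5
    return num_segments * (mean - z * se), num_segments * (mean + z * se)


def bootstrap_interval(counts, num_segments, num_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for the total crowd size."""
    rng = random.Random(seed)
    n = len(counts)
    totals = sorted(
        num_segments * statistics.mean(rng.choices(counts, k=n))
        for _ in range(num_resamples)
    )
    lower_index = int((1 - level) / 2 * num_resamples)
    upper_index = int((1 + level) / 2 * num_resamples) - 1
    return totals[lower_index], totals[upper_index]


# Hypothetical counts at 12 sampled segments out of 120, for illustration only.
counts = [42, 55, 38, 61, 47, 50, 44, 58, 39, 53, 49, 46]
num_segments = 120
point_estimate = num_segments * statistics.mean(counts)
clt_low, clt_high = clt_interval(counts, num_segments)
boot_low, boot_high = bootstrap_interval(counts, num_segments)
print(point_estimate, (clt_low, clt_high), (boot_low, boot_high))
```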

A Bayesian Sequential Design with Adaptive Randomization for Two-sided Hypothesis Tests
Lin Zhu, Han Zhu (advisor Qingzhao Yu)
School of Public Health
Louisiana State University Health Science Center
New Orleans, Louisiana

Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to allocate newly recruited patients to different treatment arms more efficiently. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. A sensitivity analysis checks how changing the prior distributions influences the design. Simulation studies compare the proposed method with traditional methods in terms of power and actual sample sizes. The simulations show that, when the total sample size is fixed, the proposed design can achieve greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce the required sample size.

Use of a novel statistical method for identifying interactions of SNP pairs associated with prostate cancer aggressiveness in African Americans
Heng-Yuan Tung (advisor Hui-Yi Lin)
Louisiana State University Health Science Center
New Orleans, Louisiana

Genetic variants identified in genome-wide association studies (GWAS) account for only a small proportion of estimated heritability. SNP/SNP interaction is one potential explanation for the missing heritability. Survival rates and prognosis profiles differ markedly among prostate cancer patients. Compared with prostate cancer patients of European ancestry, those of African ancestry tend to have a higher prostate cancer risk and a worse prognosis.
We applied a novel tool, "SIPI", to evaluate SNP/SNP interactions associated with prostate cancer aggressiveness in African Americans. Data were from the MEC study. A total of 1,933 prostate cancer patients and 5,064 SNPs in the four pathways of interest were analyzed. SIPI tests 45 biologically meaningful interaction models for each SNP pair.
Among more than 12 million SNP pairs, we found two promising pairs associated with prostate cancer aggressiveness: "rs3789889/rs7358800" with p-value = 9.32 × 10^-9 and "rs3789889/rs11009251" with p-value = 1.03 × 10^-8. Prevalence and odds ratios are also consistent with the SIPI results. The odds ratios (95% CI) are 2.08 (1.61, 2.70) and 1.84 (1.49, 2.28), respectively. One SNP, rs3789889, caught our attention because it appears 5 times in the top 10 pairs. Its gene, SYK, has also been reported recently to be possibly involved in the gene-regulation network. Our findings may be beneficial for identifying high-risk groups, which can provide valuable information for precision medicine.
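
As a quick check on the scale of the search, the sketch below counts the unordered pairs that can be formed from 5,064 SNPs and the number of pair-by-model tests at 45 models per pair. It assumes every unordered pair was evaluated, which the abstract does not state explicitly.

```python
import math

num_snps = 5064        # SNPs in the four pathways, from the abstract
models_per_pair = 45   # interaction models tested per pair, from the abstract

# Assumption: all unordered SNP pairs are evaluated.
num_pairs = math.comb(num_snps, 2)
num_model_tests = num_pairs * models_per_pair

print(f"{num_pairs:,} SNP pairs")
print(f"{num_model_tests:,} pair-by-model tests")
```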

Online News Popularity: Trend Analysis
Krunal Khatri
Louisiana State University
Baton Rouge, Louisiana

Thanks to cheap computing, supervised learning methods are widely available. These methods are like a black box, and it is hard to interpret the results one gets from them. Classic methods such as linear regression in all its forms are still the best for interpretation purposes. The science and art of studying the available variables is also important to avoid overwhelming analysis results. Thus, I have selected variables intuitively, in addition to performing linear-correlation analysis, and have tried to get interpretable results.
Using the Online News Popularity dataset from the UCI repository, I fitted a linear regression model, applied variable selection via AIC, and applied repeated 10-fold cross-validation and hyper-parameter search for elastic-net regression to obtain a more interpretable model that best fits the dataset.

Using Inverse Prediction to Determine if Lab Grown Maggots Reflect Wild Maggots' Growth Curves
Christie Watters (advisor Lynn LaMotte)
LSUHSC School of Public Health
New Orleans, Louisiana

Labs grow multiple generations of maggots to use for studies whose conclusions are applied to maggots in the wild. One observation suggests that the controlled and constant conditions that lab-bred maggots live in might affect their overall growth rate, especially after multiple generations of maggots have been kept in this environment. Therefore, an inverse prediction method was used to test the validity of using lab-grown maggots to model the growth of wild maggots. Using linear interpolation, models estimating length as a function of age were built for multiple generations of maggots. Lengths of maggots of known age were then tested to see at which ages each length would be an outlier. This analysis was performed for many maggots of known age and length that represented wild maggots, to show how often the models captured the true age of the maggots and how often maggots were assigned to incorrect ages. This information was used to conclude that the growth curves of later generations of maggots do not mimic those of wild maggots.
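
The general idea of inverse prediction with an interpolated growth curve can be sketched as below. This is a simplified illustration, not the method used in the study: it assumes a single common standard deviation `sd` at every age, a normal-theory cutoff `z`, and made-up growth data. An age is kept as plausible when the observed length is not an outlier at that age.

```python
def interpolate(age, ages, values):
    """Piecewise-linear interpolation of values at age (ages sorted ascending)."""
    if not ages[0] <= age <= ages[-1]:
        raise ValueError("age outside the range of the training data")
    knots = list(zip(ages, values))
    for (a0, v0), (a1, v1) in zip(knots, knots[1:]):
        if a0 <= age <= a1:
            return v0 + (v1 - v0) * (age - a0) / (a1 - a0)


def plausible_ages(length, candidate_ages, ages, mean_lengths, sd, z=1.96):
    """Ages at which the observed length is NOT flagged as an outlier."""
    return [
        a for a in candidate_ages
        if abs(length - interpolate(a, ages, mean_lengths)) <= z * sd
    ]


# Hypothetical lab growth curve: age in hours, mean length in mm.
ages = [0, 24, 48, 72, 96]
mean_lengths = [2.0, 5.0, 9.0, 14.0, 17.0]
candidates = list(range(0, 97, 12))

# A maggot of true age 48 h measuring 9.5 mm (made-up values).
plausible = plausible_ages(9.5, candidates, ages, mean_lengths, sd=1.0)
print(plausible)
```

Repeating this for many maggots of known age shows how often the true age falls in the plausible set and how often incorrect ages do.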

The monthly conservice bill at Campus Crossings Apartment, Lafayette, Louisiana, USA
Daniella Tran, Suntaree Unhapipat (advisor Sungsu Kim)
University of Louisiana at Lafayette
Lafayette, Louisiana

The investigators are residents of Campus Crossings Apartments, Lafayette, Louisiana. Sometimes, at the end of the month, we receive conservice bills from the leasing center for the extra money we are asked to pay for excess usage of water and electricity. However, the amount shown on the conservice bill varies across months and apartment units, even though some residents claim that they have been saving water and electricity. Our question is: "Why do the conservice bills vary?" In this project, we try to answer this question by means of an experimental analysis.
We conducted a designed experiment on the apartment units at Campus Crossings, considering the different factors that affect the amount shown on the conservice bills. The details of the data and data collection, choice of factors, response variable, experimental design, conduct of the experiment, and the analysis are included in the report.