Statistics Seminar
The Statistics Seminar has talks on a variety of topics. For more information contact Yongli Sang.
Fall 2021
During the Fall 2021 semester we will meet on Fridays from 1:00-2:00. Our plan is to meet in person in Maxim Doucet Hall room 201 while allowing those who cannot join us physically to join us via Zoom.
For more information or connection details contact Yongli Sang.

10 September 2021 (on zoom only)
Two Symmetric and Computationally Efficient Gini Correlations
Courtney Vanderford Michael
University of Mississippi
Abstract: The standard Gini correlation plays an important role in measuring the dependence between random variables with heavy-tailed distributions. It is based on the covariance between one variable and the rank of the other. Hence, for each pair of random variables, there are two Gini correlations, and they are not equal in general, which brings a substantial difficulty in interpretation. Recently, Sang et al. (2016) proposed a symmetric Gini correlation based on the joint spatial rank function with a computation cost of O(n^2), where n is the sample size. We study two symmetric and computationally efficient Gini correlations with the computational complexity of O(n log n). The properties of the new symmetric Gini correlations are explored. The influence function approach is utilized to study the robustness and the asymptotic behavior of these correlations. The asymptotic relative efficiencies are considered to compare several popular correlations under symmetric distributions with different tail-heaviness as well as an asymmetric log-normal distribution, and bivariate Pareto distributions due to their usefulness in modelling lifetime data, hydrology, competing risk data, and many other nonnegative socioeconomic issues. Simulation and real data application are conducted to demonstrate the desirable performance of the two new symmetric Gini correlations.
About the speaker: Courtney Vanderford Michael was born in Tupelo, MS. She received her B.S. in Mathematics from Blue Mountain College in 2017 and her M.S. in Mathematics from the University of Mississippi in 2019. She is currently a Ph.D. student studying Statistics at the University of Mississippi under the guidance of Dr. Xin Dang.
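As an illustration of the abstract's remark that the standard Gini correlation is "based on the covariance between one variable and the rank of the other", the following is a minimal sketch of the usual sample version, not the speaker's new symmetric estimators. The function names `ranks` and `gini_correlation` are hypothetical, and the ranking helper assumes there are no ties in the data.

```python
import numpy as np

def ranks(v):
    """Ordinal ranks 1..n of the entries of v (assumes no ties)."""
    order = np.argsort(v, kind="stable")
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def gini_correlation(x, y):
    """Sample Gini correlation of x with respect to y:
    cov(x, rank(y)) / cov(x, rank(x))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, ranks(y))[0, 1] / np.cov(x, ranks(x))[0, 1]

x = [1, 2, 3, 4, 10]
y = [2, 1, 4, 3, 5]
g_xy = gini_correlation(x, y)  # uses the values of x and the ranks of y
g_yx = gini_correlation(y, x)  # uses the values of y and the ranks of x
print(g_xy, g_yx)
```

On this small data set the two values differ, which is the asymmetry that motivates the symmetric versions discussed in the talk.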
24 September 2021 (on zoom only)
Negative Binomial Distributions: Confidence Intervals, Prediction Intervals and Tolerance Intervals
BaoAnh Dang
UL Lafayette
Abstract: In this talk, we describe the construction of confidence intervals (CIs) for a proportion, prediction intervals (PIs) for a future sample size in a negative binomial sampling to observe a specified number of successes and tolerance intervals (TIs) for negative binomial distributions. For intervals estimating the success probability, we propose CIs based on the fiducial approach and the score method, evaluate them and compare them with available CIs with respect to coverage probability and precision. We propose PIs based on the fiducial approach and joint sampling approach, and compare them with the exact and other approximate PIs. We also propose TIs on the basis of our new CIs and evaluate them with respect to coverage probability and expected width. We illustrate all three statistical intervals using two examples with real data. 
15 October 2021 (on zoom only)
Bayesian jackknife empirical likelihood
Yichuan Zhao
Georgia State University
Abstract: Empirical likelihood is a very powerful nonparametric tool that does not require any distributional assumptions. Lazar (2003) showed that in Bayesian inference, if one replaces the usual likelihood with the empirical likelihood, then posterior inference is still valid when the functional of interest is a smooth function of the posterior mean. However, it is not clear whether similar conclusions can be obtained for parameters defined in terms of $U$-statistics. We propose the so-called Bayesian jackknife empirical likelihood, which replaces the likelihood component with the jackknife empirical likelihood. We show, both theoretically and empirically, the validity of the proposed method as a general tool for Bayesian inference. Empirical analysis shows that the small-sample performance of the proposed method is better than its frequentist counterpart. Analysis of a case-control study for pancreatic cancer is used to illustrate the new approach.
About the speaker: Professor Yichuan Zhao is a full professor of statistics at Georgia State University in Atlanta. He has a joint appointment as associate member of the Neuroscience Institute, and he is also an affiliated faculty member of the School of Public Health at Georgia State University. His current research interests focus on survival analysis, empirical likelihood methods, nonparametric statistics, analysis of ROC curves, bioinformatics, Monte Carlo methods, and statistical modelling of fuzzy systems. He has published 100 research articles in statistics and biostatistics, has co-edited four books on statistics, biostatistics and data science, and has been invited to deliver more than 200 research talks nationally and internationally. Dr. Zhao has organized the Workshop Series on Biostatistics and Bioinformatics since its initiation in 2012. He also organized the 25th ICSA Applied Statistics Symposium in Atlanta as a chair of the organizing committee to great success. He is currently serving as associate editor, or on the editorial board, for several statistical journals. Dr. Zhao is a Fellow of the American Statistical Association, an elected member of the International Statistical Institute, and serves on the Board of Directors, ICSA.
date to be determined
A mixture binary randomized response technique model with a unified measure of privacy and efficiency
Maxwell Lovig
UL Lafayette
Abstract: In this talk, I will introduce a mixture binary Randomized Response Technique (RRT) model by combining the elements of the Greenberg Unrelated Question model and the Warner Indirect Question model. This model will also account for untruthful responses. A unified measure of model efficiency and respondent privacy will be discussed. I will also provide the results of a simulation study to validate the theoretical findings.
This talk is based on joint work with Sadia Khalil, Sumaita Rahman, Pujita Sapra, and Sat Gupta.
Fall 2019
During the Fall 2019 semester we will meet on Fridays from 11:00-12:00 in Maxim Doucet Hall room 212.

27 September 2019
Single Linkage Clustering of Univariate Distributions
Calvin Berry
UL Lafayette
Abstract: There are several methods in use for numerically evaluating the dominance D(G,F) of one univariate distribution over another. We consider the suitability of several such measures for ordering and grouping of the elements of a finite collection of univariate distributions. One of these measures is shown to have the desirable property that any finite collection of distributions can be arranged in a sequence for which (a) F preceding G implies D(G,F) \geq D(F,G) and (b) each single linkage cluster formed using D(G,F)-D(F,G) as the dissimilarity between F and G consists of distributions contiguous in the sequence.
18 October 2019
Bayesian variable selection in semi-competing risks models
Andrew Chapple
LSU Health, New Orleans
Abstract: Conventionally, evaluation of a new drug, A, is done in three phases. Phase I is based on toxicity to determine a “maximum tolerable dose” (MTD) of A, phase II is conducted to decide whether A at the MTD is promising in terms of response probability, and if so a large randomized phase III trial is conducted to compare A to a control treatment, usually based on survival time or progression-free survival time. It is widely recognized that this paradigm has many flaws. A recent approach combines the first two phases by conducting a phase I‐II trial, which chooses an optimal dose based on both efficacy and toxicity, and evaluation of A at the selected optimal phase I‐II dose then is done in a phase III trial. This paper proposes a new design paradigm, motivated by the possibility that the optimal phase I‐II dose may not maximize mean survival time with A. We propose a hybridized design, which we call phase I‐II/III, that combines phase I‐II and phase III by allowing the chosen optimal phase I‐II dose of A to be re‐optimized based on survival time data from phase I‐II patients and the first portion of phase III. The phase I‐II/III design uses adaptive randomization in phase I‐II, and relies on a mixture model for the survival time distribution as a function of efficacy, toxicity, and dose. A simulation study is presented to evaluate the phase I‐II/III design and compare it to the usual approach that does not re‐optimize the dose of A in phase III.
15 November 2019
Applications of Jackknife empirical likelihood via energy distance, part 1
Yongli Sang
UL Lafayette
Abstract: Energy distance is a statistical distance between the distributions of random vectors, which characterizes equality of distributions. The empirical likelihood (EL) method is a classical nonparametric method, and it combines the reliability of the nonparametric methods with the flexibility and effectiveness of the likelihood approach. However, EL loses this efficiency when some nonlinear constraints are involved. The jackknife empirical likelihood (JEL) method can overcome this computational difficulty. The JEL approach is extremely simple to use in practice and is very effective in handling U-statistics.
The energy distance can be estimated by functions of U-statistics, which motivated us to apply JEL to energy distance to develop new efficient and powerful tests:
(1) Goodness-of-fit
(2) Central symmetry 
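To make concrete the statement that the energy distance "can be estimated by functions of U-statistics", here is a minimal sketch of one common estimator of E(X,Y) = 2E|X-Y| - E|X-X'| - E|Y-Y'|, where X' and Y' are independent copies. It is an illustration under that standard definition, not the JEL procedure from the talk; the function names are hypothetical.

```python
import numpy as np

def _as_2d(a):
    """Turn a sample into an (n, d) array; 1-D input becomes (n, 1)."""
    a = np.asarray(a, dtype=float)
    return a.reshape(len(a), -1)

def _cross_mean(a, b):
    """Average Euclidean distance over all pairs (a_i, b_j)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.mean()

def _within_mean(a):
    """U-statistic for E|A - A'|: average distance over pairs i != j."""
    n = len(a)
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=2)
    return d.sum() / (n * (n - 1))  # diagonal terms are zero

def energy_distance_u(x, y):
    """Estimate 2E|X-Y| - E|X-X'| - E|Y-Y'| from two samples."""
    x, y = _as_2d(x), _as_2d(y)
    return 2.0 * _cross_mean(x, y) - _within_mean(x) - _within_mean(y)

ed = energy_distance_u([0.0, 1.0], [2.0, 3.0])
print(ed)
```

Because the within-sample terms exclude the zero diagonal, this estimator can take small negative values in small samples even though the population energy distance is nonnegative.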
22 November 2019
Applications of Jackknife empirical likelihood via energy distance, part 2
Yongli Sang
UL Lafayette
Abstract: Energy distance is a statistical distance between the distributions of random vectors, which characterizes equality of distributions. The empirical likelihood (EL) method is a classical nonparametric method, and it combines the reliability of the nonparametric methods with the flexibility and effectiveness of the likelihood approach. However, EL loses this efficiency when some nonlinear constraints are involved. The jackknife empirical likelihood (JEL) method can overcome this computational difficulty. The JEL approach is extremely simple to use in practice and is very effective in handling U-statistics.
The energy distance can be estimated by functions of U-statistics, which motivated us to apply JEL to energy distance to develop new efficient and powerful tests:
(1) Goodness-of-fit
(2) Central symmetry
Spring 2019
During the Spring 2019 semester we will meet on Fridays from 10:00-11:00 in Maxim Doucet Hall room 209.

22 February 2019
Jackknife Empirical Likelihood Approach for K-sample Tests via Energy Distance
Yongli Sang
Abstract: Energy distance is a statistical distance between the distributions of random variables, which characterizes the equality of the distributions. Utilizing the energy distance, we develop a nonparametric test for the equality of K (K at least 2) distributions in this talk. By applying the jackknife empirical likelihood approach, the standard limiting chi-square distribution with K-1 degrees of freedom is established and is used to determine the critical value and p-value of the test. Simulation studies show that our method is competitive with existing methods in terms of power of the tests in most cases. The proposed method is illustrated in an application on a real data set.
14 March 2019 (THURSDAY in room 208)
Fourier transform and projection methods in kernel entropy estimation for linear processes
Hailin Sang
University of Mississippi
Abstract: Entropy is widely applied in the fields of information theory, statistical classification, pattern recognition and so on since it is a measure of uncertainty in a probability distribution. The quadratic functional plays an important role in the study of quadratic Renyi entropy and the Shannon entropy. It is a challenging problem to study the estimation of the quadratic functional and the corresponding entropies for the dependent case. In this talk, we consider the estimation of the quadratic functional for linear processes. With a Fourier transform on the kernel function and the projection method, it is shown that the kernel estimator has asymptotic properties similar to those in the i.i.d. case studied in Gine and Nickl (2008) if the linear process (X_n: n \in N) has the defined short range dependence. We also provide an application to L_2^2 divergence and the extension to multivariate linear processes. The simulation study for linear processes with Gaussian and \alpha-stable innovations confirms the theoretical results. As an illustration, we estimate the L_2^2 divergences among the density functions of average annual river flows for four rivers and obtain promising results. This is joint work with Yongli Sang and Fangjun Xu.
Fall 2018
During the Fall 2018 semester we will meet on Fridays from 11:00-12:00 in Maxim Doucet Hall room 212.

31 August 2018
Fiducial Inference with Applications
Kalimuthu Krishnamoorthy
Abstract: A fiducial distribution for a parameter is essentially the posterior distribution with no prior distribution on the parameters. In this talk, we shall describe Fisher's method of finding a fiducial distribution for a parameter and fiducial inference through examples involving well-known distributions such as the normal and related distributions. We then describe the approach for finding fiducial distributions for the parameters of a location-scale family and illustrate the approach for the Weibull distribution. In particular, we shall see fiducial methods for finding confidence intervals, prediction intervals, and prediction limits for the mean of a future sample. All the methods will be illustrated using some practical examples.
7 September 2018
Fiducial Inference with Applications, part 2
Kalimuthu Krishnamoorthy
Abstract: In the second part of this seminar series, we shall develop fiducial distributions for gamma parameters and show some applications. We then provide fiducial solutions for correlation analysis in a multivariate normal setup. For discrete distributions, we outline two different approaches of finding fiducial distributions, and illustrate the methods for the binomial, Poisson and hypergeometric distributions. Advantages of our fiducial approach over other large sample approaches will be illustrated through some applications. Finally, fiducial inference for a mixture distribution will be described.
14 September 2018
Jackknife Empirical Likelihood for Gini Correlations
Yongli Sang
Abstract: The Gini correlation, whose properties are a mixture of those of Pearson's and Spearman's correlations, plays an important role in measuring dependence of random variables with heavy-tailed distributions. Due to the structure of this dependence measure, there are two Gini correlations between each pair of random variables, which are not equal in general. Both the Gini correlation and the equality of the two Gini correlations play important roles in Economics. In the literature, there are limited papers focusing on the inference of the Gini correlations and their equality testing. We have developed the jackknife empirical likelihood (JEL) approach for the single Gini correlation, for testing the equality of the two Gini correlations, and for the Gini correlations' differences of two independent samples. The standard limiting chi-square distributions of those jackknife empirical likelihood ratio statistics are established and used to construct confidence intervals and rejection regions, and to calculate $p$-values of the tests.
21 September 2018
The Ultimate Antithesis of Asymptotic Theory: Estimation with a Sample of Size 1
Nabendu Pal
Abstract: Many statistical results depend heavily on asymptotic theory, which provides guidance through a 'large sample' approach. This is also the foundation of several widely used tools like the Central Limit Theorem or the Laws of Large Numbers. In this seminar talk we will explore the total opposite of asymptotic theory: statistical inference with a single observation. We are going to review some interesting existing results and discuss potential open problems.
28 September 2018
Correlation and regression analyses involving circular variables
Sungsu Kim
Abstract: Bivariate data involving circular variables arise in many areas of research. Some examples are: wind directions at 6 am and at noon at an observatory station, dihedral angles in the protein folding problem, positions of homologous genes in two circular RNAs, phase angles between two living tissues, amount of rainfall and wind direction, and orientation of a bird’s nest and direction of creek flow. In this talk, I will present correlation and regression analyses of bivariate data in which one or both variables are circular.
19 October 2018
Highest posterior mass prediction intervals for binomial and Poisson distributions
Shanshan Lv
Abstract: The problems of constructing prediction intervals (PIs) for the binomial and Poisson distributions are considered. New highest posterior mass (HPM) PIs based on the fiducial approach are proposed. Other fiducial PIs, an exact PI and approximate PIs are reviewed and compared with the HPM-PIs. Exact coverage studies and expected widths of prediction intervals show that the new prediction intervals are less conservative than other fiducial PIs and comparable with the approximate one based on the joint sampling approach for the binomial case. For the Poisson case, the HPM-PIs are better than the other PIs in terms of coverage probabilities and precision. The methods are illustrated using some practical examples.
26 October 2018
Confidence intervals for the mean and a percentile based on zero-inflated lognormal data
Md Sazib Hasan
Abstract: The problems of estimating the mean and an upper percentile of a lognormal population with nonnegative values are considered. For estimating the mean of such a population based on data that include zeros, a simple confidence interval (CI) that is obtained by modifying Tian’s [Inferences on the mean of zero-inflated lognormal data: the generalized variable approach. Stat Med. 2005;24:3223-3232] generalized CI is proposed. A fiducial upper confidence limit (UCL) and a closed-form approximate UCL for an upper percentile are developed. Our simulation studies indicate that the proposed methods are very satisfactory in terms of coverage probability and precision, and better than existing methods for maintaining balanced tail error rates. The proposed CI and the UCL are simple and easy to calculate. All the methods considered are illustrated using samples of data involving airborne chlorine concentrations and data on diagnostic test costs.
Spring 2018
During the Spring 2018 semester we will meet on Fridays from 2:00-3:50 in Maxim Doucet Hall room 211.

23 February 2018
An Introduction to Circular Statistics
Sungsu Kim
Abstract: In many diverse scientific fields, the measurements are directions. Examples are directions of flight of a bird in Biology, of wind in Meteorology, of protein folding in Bioinformatics, of knee flexion in Medicine, etc. In this first talk of a series on Circular Statistics, I will start with the unique features and challenges of dealing with circular data, then discuss some summary measures in Circular Statistics.
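A short sketch of why circular data need their own summary measures: two directions of 350° and 10° both point nearly north, yet their arithmetic mean is 180°. The standard mean direction instead averages the unit vectors. The function names below are hypothetical and are not taken from the talk.

```python
import math

def circular_mean(angles_deg):
    """Mean direction in degrees, in [0, 360), from summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def mean_resultant_length(angles_deg):
    """Length of the average unit vector: near 1 when the directions are
    tightly concentrated, near 0 when they are widely dispersed."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.hypot(s, c) / n

angles = [350.0, 10.0]
arithmetic = sum(angles) / len(angles)  # 180.0, pointing the wrong way
direction = circular_mean(angles)       # numerically at 0 (equivalently 360)
length = mean_resultant_length(angles)
print(arithmetic, direction, length)
```

Because of floating-point rounding, the mean direction here may print as a value just below 360 rather than exactly 0; both describe the same direction.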
9 March 2018
Probability Models for Circular Data
Sungsu Kim
Abstract: In this talk, I will first go over some of the methods by which one can construct a circular probability distribution and summarize some common distributions used in Circular Statistics. Then, I will discuss some properties of the asymmetric generalized von Mises (AGvM) distribution proposed in Kim and SenGupta (2013). A real data example will be provided to illustrate the practical utility of the AGvM distribution.
16 March 2018
Inferences for a Skew-Normal Distribution
Phontita Thiuthad
Abstract: A three-parameter Skew-Normal distribution (SND), which is an interesting generalization of the usual two-parameter normal distribution, is getting a lot of attention lately due to its flexibility to accommodate both positively skewed and negatively skewed shapes, apart from the symmetric shape, to model various types of datasets. Though a lot of work has been done to characterize the SND and derive many of its distributional properties, relatively little effort has been devoted to inferences on the model parameters due to some intrinsic complexities. In this talk we will focus on estimation of the location parameter along with some other interesting results.
23 March 2018
Hierarchical Bayesian Models for Continuous and Positively Skewed Data From Small Areas
Binod Manandhar
University of Houston
Abstract: The log-transformation is widely used to deal with skewed data; however, it could be problematic due to the back transformation. In this talk, I will present hierarchical Bayesian models for continuous and positively skewed random variables without logarithmic transformation using three distributions: exponential, gamma and generalized gamma. In these models, a second order Taylor series Laplace approximation is used to ease computational difficulties due to complex forms of the posterior and conditional posterior distributions. The utility of the proposed models will be illustrated using the generalized gamma model applied to small area estimations in the Nepal census data.
13 April 2018
Memory properties of transformations of linear processes
Yongli Sang
Abstract: We study the memory properties of transformations of linear processes. Dittmann and Granger (2002) studied the polynomial transformations of Gaussian FARIMA(0,d,0) processes by applying the orthonormality of the Hermite polynomials under the measure for the standard normal distribution. Nevertheless, the orthogonality does not hold for transformations of non-Gaussian linear processes. Instead, we use the decomposition developed by Ho and Hsing (1996, 1997) to study the memory properties of nonlinear transformations of linear processes, which include the FARIMA(p,d,q) processes, and obtain consistent results as in the Gaussian case. In particular, for stationary processes, the transformations of short-memory time series still have short memory and the transformation of long-memory time series may have different weaker memory parameters which depend on the power rank of the transformation. On the other hand, the memory properties of transformations of non-stationary time series may not depend on the power ranks of the transformations. This study has application in econometrics and financial data analysis when the time series observations have non-Gaussian heavy tails.
20 April 2018
R Markdown
Thu Nguyen
Abstract: R Markdown provides an authoring framework for data science and statistics. You can use a single R Markdown file to save and execute code and to generate high quality reports that can be shared with an audience. R Markdown supports multiple languages including R, Python, and SQL, as well as dozens of static and dynamic output formats including HTML, PDF, MS Word, Beamer, HTML5 slides, Tufte-style handouts, books, dashboards, shiny applications, scientific articles, websites, and more.