Statistics Seminar

The Statistics Seminar has talks on a variety of topics. For more information contact Yongli Sang.

Fall 2023

During the Fall 2023 semester we will meet on Friday from 1:00-2:00 in Maxim Doucet Hall room 212.

For more information contact Yongli Sang.

  • 22 September 2023
    Jackknife empirical likelihood methods for the Gini mean difference
    Sameera Hewage
    UL Lafayette
    Abstract: The Gini mean difference (GMD) is often utilized as an alternative measure of variability. The variance is convenient and superior to the GMD for distributions that are nearly normal, and the GMD reveals more information about underlying distributions that are far from normal. However, in comparison to the variance, the GMD has not been widely used as an index of variability because of the difficulties in computing and estimating the variance of its estimator. In this paper, we study the inference for the GMD using a nonparametric method, the jackknife empirical likelihood (JEL). We then develop a JEL procedure to test the equality of two GMDs. The standard limiting chi-squared distributions are established for both one-sample and two-sample problems. Confidence intervals for the one-sample GMD and p-values for the two-sample GMDs are computed without estimating the asymptotic variance. Adjusted and weighted JEL methods are explored to improve the performance of the standard JEL methods. The proposed methods work well in a variety of realistic settings, according to simulation studies and real data analysis.
  • 17 November 2023
    Changepoint Analysis of Hourly Sky-cloudiness Conditions in Canada
    Mo Li
    UL Lafayette
    Abstract: Changepoint analysis of non-stationary ordinal categorical time series is always of interest in climate studies. For instance, the sky-cloudiness condition in Canada is reported hourly in terms of tenths of the sky dome covered by clouds, hence having 11 ordinal categories. The cloud cover data often contain changepoints and exhibit temporal trends, seasonality, and serial correlation in nature. To properly take into account these features, a likelihood ratio-type statistic is proposed in this talk to test for a single changepoint in hourly categorical time series using a marginalized transition model. This model allows for likelihood-based inference, and the series dependence is specified via a first-order Markov chain. An application of our method is illustrated using the hourly sky-cloudiness conditions at Fort St. John Airport in Canada, and a stochastic optimization algorithm is adapted to reduce the computation time.

Spring 2023

During the Spring 2023 semester we will meet on Friday from 1:00-2:00 in Maxim Doucet Hall room 212.

For more information contact Yongli Sang.

  • 10 March 2023
    Confidence Intervals and Prediction Intervals for Two-Parameter Negative Binomial Distributions
    Md Mahadi Hasan
    UL Lafayette
    Abstract: Problems of finding confidence intervals (CIs) and prediction intervals (PIs) for two-parameter negative binomial distributions are considered. Simple CIs for the mean of a two-parameter negative binomial distribution based on some large sample methods are proposed and compared with the likelihood CIs. Proposed CIs are not only simple to compute, but also better than the likelihood CIs for moderate sample sizes. Approximate PIs for the mean of a future sample from a negative binomial distribution are also proposed and evaluated for their accuracy. The methods are illustrated using a few examples with real life data sets.

Fall 2022

During the Fall 2022 semester we will meet on Friday from 1:30-2:30 in Maxim Doucet Hall room 212.

For more information contact Yongli Sang.

  • 30 September 2022
    Statistical Intervals for Maxwell Distributions
    Faysal A. Chowdhury
    UL Lafayette
    Abstract: The problem of constructing statistical intervals for the two-parameter Maxwell distribution is considered. An appropriate method of finding the maximum likelihood estimators (MLEs) is proposed. Constructions of confidence intervals, prediction intervals (PIs) and one-sided tolerance limits based on suitable pivotal quantities are described. Pivotal quantities based on the MLEs and moment estimators are proposed, and the statistical intervals based on them are compared in terms of expected widths. Comparison studies indicate that the statistical intervals based on the MLEs offer little improvement over interval estimates based on moment estimates when sample sizes are small, and all intervals are practically the same even for moderate sample sizes. The methods are illustrated using two examples involving real data sets.

Spring 2022

During the Spring 2022 semester we will meet on Friday from 1:00-2:00. Our plan is to meet in person in Maxim Doucet Hall room 212 while allowing those who cannot join us physically to join us via Zoom.

For more information or connection details contact Yongli Sang.

  • 11 March 2022
    Combining Independent Tests for a Common Parameter of Several Continuous Distributions
    Shanshan Lv
    Truman State University
    Abstract: The problem of testing a common parameter of several independent continuous populations is considered. Among all tests, Fisher's combined test is the most popular one and is routinely used in applications. In this article, we propose an alternative method of combining the p-values of independent tests using chi-square scores, referred to as the inverse chi-square test. The proposed test is as simple as other existing tests. We compare the powers of the combined tests for (i) testing a common mean of several normal populations, (ii) testing the common coefficient of variation of several normal populations, (iii) testing the common correlation coefficient of several bivariate normal populations, (iv) testing the common mean of several lognormal populations and (v) testing the common mean of several gamma distributions. Our comparison studies indicate that the inverse chi-square test is a better alternative combined test with good power properties. An illustrative example with real-world data is given for each problem.
  • 8 April 2022
    Asymptotic Normality of Gini Correlation in High Dimension with Applications to the K-sample Problem
    Yongli Sang
    UL Lafayette
    Abstract: The categorical Gini correlation proposed by Dang et al. is a dependence measure between a categorical and a numerical variable, which can characterize independence of the two variables. The asymptotic distributions of the sample correlation under dependence and independence have been established when the dimension of the numerical variable is fixed. However, its asymptotic distribution for high dimensional data has not been explored. In this paper, we develop the central limit theorem for the Gini correlation for the more realistic setting where the dimensionality of the numerical variable is diverging. We then construct a powerful and consistent test for the K-sample problem based on the asymptotic normality. The proposed test not only avoids the computational burden but also gains power over the permutation procedure. Simulation studies and real data illustrations show that the proposed test is more competitive than existing methods across a broad range of realistic situations, especially in unbalanced cases.

Fall 2021

During the Fall 2021 semester we will meet on Friday from 1:00-2:00. Our plan is to meet in person in Maxim Doucet Hall room 201 while allowing those who cannot join us physically to join us via Zoom.

For more information or connection details contact Yongli Sang.

  • 10 September 2021 (on zoom only)
    Two Symmetric and Computationally Efficient Gini Correlations
    Courtney Vanderford Michael
    University of Mississippi
    Abstract: The standard Gini correlation plays an important role in measuring the dependence between random variables with heavy-tailed distributions. It is based on the covariance between one variable and the rank of the other. Hence, for each pair of random variables, there are two Gini correlations, and they are not equal in general, which brings a substantial difficulty in interpretation. Recently, Sang et al. (2016) proposed a symmetric Gini correlation based on the joint spatial rank function with a computation cost of O(n^2) where n is the sample size. We study two symmetric and computationally efficient Gini correlations with the computational complexity of O(n log n). The properties of the new symmetric Gini correlations are explored. The influence function approach is utilized to study the robustness and the asymptotic behavior of these correlations. The asymptotic relative efficiencies are considered to compare several popular correlations under symmetric distributions with different tail-heaviness as well as an asymmetric log-normal distribution, and bivariate Pareto distributions due to their usefulness in modelling lifetime data, hydrology, competing risk data, and many other non-negative socio-economic issues. Simulation and real data application are conducted to demonstrate the desirable performance of the two new symmetric Gini correlations.
    About the speaker: Courtney Vanderford Michael was born in Tupelo, MS. She received her B.S. in Mathematics from Blue Mountain College in 2017 and M.S. in Mathematics from the University of Mississippi in 2019. She is currently a Ph.D. student studying Statistics at the University of Mississippi under the guidance of Dr. Xin Dang.
  • 24 September 2021 (on zoom only)
    Negative Binomial Distributions: Confidence Intervals, Prediction Intervals and Tolerance Intervals
    Bao-Anh Dang
    UL Lafayette
    Abstract: In this talk, we describe the construction of confidence intervals (CIs) for a proportion, prediction intervals (PIs) for a future sample size in a negative binomial sampling to observe a specified number of successes and tolerance intervals (TIs) for negative binomial distributions. For intervals estimating the success probability, we propose CIs based on the fiducial approach and the score method, evaluate them and compare them with available CIs with respect to coverage probability and precision. We propose PIs based on the fiducial approach and joint sampling approach, and compare them with the exact and other approximate PIs. We also propose TIs on the basis of our new CIs and evaluate them with respect to coverage probability and expected width. We illustrate all three statistical intervals using two examples with real data.
  • 15 October 2021 (on zoom only)
    Bayesian jackknife empirical likelihood
    Yichuan Zhao
    Georgia State University
    Abstract: Empirical likelihood is a very powerful nonparametric tool that does not require any distributional assumptions. Lazar (2003) showed that in Bayesian inference, if one replaces the usual likelihood with the empirical likelihood, then posterior inference is still valid when the functional of interest is a smooth function of the posterior mean. However, it is not clear whether similar conclusions can be obtained for parameters defined in terms of U-statistics. We propose the so-called Bayesian jackknife empirical likelihood, which replaces the likelihood component with the jackknife empirical likelihood. We show, both theoretically and empirically, the validity of the proposed method as a general tool for Bayesian inference. Empirical analysis shows that the small-sample performance of the proposed method is better than its frequentist counterpart. Analysis of a case-control study for pancreatic cancer is used to illustrate the new approach.
    About the speaker: Professor Yichuan Zhao is a full professor of statistics at Georgia State University in Atlanta. He has a joint appointment as associate member of the Neuroscience Institute, and he is also an affiliated faculty member of the School of Public Health at Georgia State University. His current research interest focuses on survival analysis, empirical likelihood methods, nonparametric statistics, analysis of ROC curves, bioinformatics, Monte Carlo methods, and statistical modelling of fuzzy systems. He has published 100 research articles in statistics and biostatistics, has co-edited four books on statistics, biostatistics and data science, and has been invited to deliver more than 200 research talks nationally and internationally. Dr. Zhao has organized the Workshop Series on Biostatistics and Bioinformatics since its initiation in 2012. He also organized the 25th ICSA Applied Statistics Symposium in Atlanta as a chair of the organizing committee to great success. He is currently serving as associate editor, or on the editorial board, for several statistical journals. Dr. Zhao is a Fellow of the American Statistical Association, an elected member of the International Statistical Institute, and serves on the Board of Directors, ICSA.
  • date to be determined 
    A mixture binary randomized response technique model with a unified measure of privacy and efficiency
    Maxwell Lovig
    UL Lafayette
    Abstract: In this talk, I will introduce a mixture binary Randomized Response Technique (RRT) model by combining the elements of the Greenberg Unrelated Question model and the Warner Indirect Question model. This model will also account for untruthful responses. A unified measure of model efficiency and respondent privacy will be discussed. I will also provide the results of a simulation study to validate the theoretical findings.
    This talk is based on joint work with Sadia Khalil, Sumaita Rahman, Pujita Sapra, and Sat Gupta.

Fall 2019

During the Fall 2019 semester we will meet on Friday from 11:00-12:00 in Maxim Doucet Hall room 212.

  • 27 September 2019
    Single Linkage Clustering of Univariate Distributions
    Calvin Berry
    UL Lafayette
    Abstract: There are several methods in use for numerically evaluating the dominance D(G,F) of one univariate distribution over another. We consider the suitability of several such measures for ordering and grouping of the elements of a finite collection of univariate distributions. One of these measures is shown to have the desirable property that any finite collection of distributions can be arranged in a sequence for which (a) F preceding G implies D(G,F) \geq D(F,G) and (b) each single linkage cluster formed using |D(G,F)-D(F,G)| as the dissimilarity between F and G consists of distributions contiguous in the sequence.
  • 18 October 2019
    Bayesian variable selection in semi-competing risks models
    Andrew Chapple
    LSU Health, New Orleans
    Abstract: Conventionally, evaluation of a new drug, A, is done in three phases. Phase I is based on toxicity to determine a “maximum tolerable dose” (MTD) of A, phase II is conducted to decide whether A at the MTD is promising in terms of response probability, and if so a large randomized phase III trial is conducted to compare A to a control treatment, usually based on survival time or progression free survival time. It is widely recognized that this paradigm has many flaws. A recent approach combines the first two phases by conducting a phase I‐II trial, which chooses an optimal dose based on both efficacy and toxicity, and evaluation of A at the selected optimal phase I‐II dose then is done in a phase III trial. This paper proposes a new design paradigm, motivated by the possibility that the optimal phase I‐II dose may not maximize mean survival time with A. We propose a hybridized design, which we call phase I‐II/III, that combines phase I‐II and phase III by allowing the chosen optimal phase I‐II dose of A to be re‐optimized based on survival time data from phase I‐II patients and the first portion of phase III. The phase I‐II/III design uses adaptive randomization in phase I‐II, and relies on a mixture model for the survival time distribution as a function of efficacy, toxicity, and dose. A simulation study is presented to evaluate the phase I‐II/III design and compare it to the usual approach that does not re‐optimize the dose of A in phase III.
  • 15 November 2019
    Applications of Jackknife empirical likelihood via energy distance, part 1
    Yongli Sang
    UL Lafayette
    Abstract: Energy distance is a statistical distance between the distributions of random vectors, which characterizes equality of distributions. The empirical likelihood (EL) method is a classical nonparametric method, and it combines the reliability of the nonparametric methods with the flexibility and effectiveness of the likelihood approach. However, EL loses this efficiency when some nonlinear constraints are involved. The jackknife empirical likelihood (JEL) method can overcome this computational difficulty. The JEL approach is extremely simple to use in practice and is very effective in handling U-statistics.
    The energy distance can be estimated by functions of U-statistics, which motivated us to apply JEL to energy distance to develop new efficient and powerful tests:
    (1) Goodness-of-fit
    (2) Central symmetry
  • 22 November 2019
    Applications of Jackknife empirical likelihood via energy distance, part 2
    Yongli Sang
    UL Lafayette
    Abstract: Energy distance is a statistical distance between the distributions of random vectors, which characterizes equality of distributions. The empirical likelihood (EL) method is a classical nonparametric method, and it combines the reliability of the nonparametric methods with the flexibility and effectiveness of the likelihood approach. However, EL loses this efficiency when some nonlinear constraints are involved. The jackknife empirical likelihood (JEL) method can overcome this computational difficulty. The JEL approach is extremely simple to use in practice and is very effective in handling U-statistics.
    The energy distance can be estimated by functions of U-statistics, which motivated us to apply JEL to energy distance to develop new efficient and powerful tests:
    (1) Goodness-of-fit
    (2) Central symmetry
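
As a rough illustration of the quantity named in the two abstracts above, the sketch below computes a sample energy distance, 2 E|X-Y| - E|X-X'| - E|Y-Y'|, using U-statistic averages for the within-sample terms. It is not the speaker's JEL procedure; the function names and the toy samples are hypothetical and chosen only for this example.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def mean_cross_distance(xs, ys):
    """Average distance over all pairs (x, y) with x from xs and y from ys."""
    return sum(dist(x, y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def mean_within_distance(xs):
    """U-statistic: average distance over all unordered pairs of distinct points."""
    n = len(xs)
    return sum(dist(a, b) for a, b in combinations(xs, 2)) / (n * (n - 1) / 2)


def energy_distance(xs, ys):
    """Sample version of 2 E|X-Y| - E|X-X'| - E|Y-Y'| for two samples of points."""
    return (2 * mean_cross_distance(xs, ys)
            - mean_within_distance(xs)
            - mean_within_distance(ys))


# Toy one-dimensional samples, written as 1-tuples so math.dist applies.
sample_a = [(0.0,), (1.0,), (2.0,)]
sample_b = [(10.0,), (11.0,), (12.0,)]
print(energy_distance(sample_a, sample_b))
```

Because the within-sample terms omit the zero diagonal, this estimate can be slightly negative when the two samples come from the same distribution; a large positive value suggests the distributions differ.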

Spring 2019

During the Spring 2019 semester we will meet on Friday from 10:00-11:00 in Maxim Doucet Hall room 209.

  • 22 February 2019
    Jackknife Empirical Likelihood Approach for K-sample Tests via Energy Distance
    Yongli Sang
    Abstract: Energy distance is a statistical distance between the distributions of random variables, which characterizes the equality of the distributions. Utilizing the energy distance, we develop a nonparametric test for the equality of K (K at least 2) distributions in this talk. By applying the jackknife empirical likelihood approach, the standard limiting chi-square distribution with K-1 degrees of freedom is established and is used to determine the critical value and p-value of the test. Simulation studies show that our method is competitive with existing methods in terms of power of the tests in most cases. The proposed method is illustrated in an application on a real data set.
  • 14 March 2019 (THURSDAY in room 208)
    Fourier transform and projection methods in kernel entropy estimation for linear processes
    Hailin Sang
    University of Mississippi
    Abstract: Entropy is widely applied in the fields of information theory, statistical classification, pattern recognition and so on since it is a measure of uncertainty in a probability distribution. The quadratic functional plays an important role in the study of quadratic Renyi entropy and the Shannon entropy. It is a challenging problem to study the estimation of the quadratic functional and the corresponding entropies for the dependent case. In this talk, we consider the estimation of the quadratic functional for linear processes. With a Fourier transform on the kernel function and the projection method, it is shown that the kernel estimator has asymptotic properties similar to those in the i.i.d. case studied in Gine and Nickl (2008) if the linear process (X_n: n \in N) has the defined short range dependence. We also provide an application to L_2^2 divergence and the extension to multivariate linear processes. The simulation study for linear processes with Gaussian and \alpha-stable innovations confirms the theoretical results. As an illustration, we estimate the L_2^2 divergences among the density functions of average annual river flows for four rivers and obtain promising results. This is joint work with Yongli Sang and Fangjun Xu.

Fall 2018

During the Fall 2018 semester we will meet on Friday from 11:00-12:00 in Maxim Doucet Hall room 212.

  • 31 August 2018
    Fiducial Inference with Applications
    Kalimuthu Krishnamoorthy
    Abstract: A fiducial distribution for a parameter is essentially the posterior distribution with no prior distribution on the parameters. In this talk, we shall describe Fisher's method of finding a fiducial distribution for a parameter and fiducial inference through examples involving well-known distributions such as the normal and related distributions. We then describe the approach for finding fiducial distributions for the parameters of a location-scale family and illustrate the approach for the Weibull distribution. In particular, we shall see fiducial methods for finding confidence intervals, prediction intervals, and prediction limits for the mean of a future sample. All the methods will be illustrated using some practical examples.
  • 7 September 2018
    Fiducial Inference with Applications, part 2
    Kalimuthu Krishnamoorthy
    Abstract: In the second part of this seminar series, we shall develop fiducial distributions for gamma parameters and show some applications. We then provide fiducial solutions for correlation analysis in a multivariate normal setup. For discrete distributions, we outline two different approaches of finding fiducial distributions, and illustrate the methods for the binomial, Poisson and hypergeometric distributions. Advantages of our fiducial approach over other large sample approaches will be illustrated through some applications. Finally, fiducial inference for a mixture distribution will be described.
  • 14 September 2018
    Jackknife Empirical Likelihood for Gini Correlations
    Yongli Sang
    Abstract: The Gini correlation, whose properties are a mixture of those of Pearson's and Spearman's correlations, plays an important role in measuring dependence of random variables with heavy-tailed distributions. Due to the structure of this dependence measure, there are two Gini correlations between each pair of random variables, which are not equal in general. Both the Gini correlation and the equality of the two Gini correlations play important roles in Economics. In the literature, there are limited papers focusing on the inference of the Gini correlations and their equality testing. We have developed the jackknife empirical likelihood (JEL) approach for the single Gini correlation, for testing the equality of the two Gini correlations, and for the Gini correlations' differences of two independent samples. The standard limiting chi-square distributions of those jackknife empirical likelihood ratio statistics are established and used to construct confidence intervals and rejection regions, and to calculate p-values of the tests.
  • 21 September 2018
    The Ultimate Antithesis of Asymptotic Theory: Estimation with a Sample of Size 1
    Nabendu Pal
    Abstract: Many statistical results depend heavily on asymptotic theory, which provides guidance by using a 'large sample' approach. This is also the foundation of several widely used tools like the Central Limit Theorem or the Laws of Large Numbers. In this seminar talk we will explore the total opposite of asymptotic theory, which deals with statistical inference from a single observation. We are going to review some interesting existing results and discuss potential open problems.
  • 28 September 2018
    Correlation and regression analyses involving circular variables
    Sungsu Kim
    Abstract: Bivariate data involving circular variables arise in many areas of research. Some examples are: wind directions at 6 am and at noon at an observatory station, dihedral angles in the protein folding problem, positions of homologous genes in two circular RNAs, phase angles between two living tissues, amount of rainfall and wind direction, and orientation of bird’s nest and direction of creek flow. In this talk, I will present correlation and regression analyses of bivariate data involving one or both circular variables.
  • 19 October 2018
    Highest posterior mass prediction intervals for binomial and Poisson distributions
    Shanshan Lv
    Abstract: The problems of constructing prediction intervals (PIs) for the binomial and Poisson distributions are considered. New highest posterior mass (HPM) PIs based on the fiducial approach are proposed. Other fiducial PIs, an exact PI and approximate PIs are reviewed and compared with the HPM-PIs. Exact coverage studies and expected widths of prediction intervals show that the new prediction intervals are less conservative than other fiducial PIs and comparable with the approximate one based on the joint sampling approach for the binomial case. For the Poisson case, the HPM-PIs are better than the other PIs in terms of coverage probabilities and precision. The methods are illustrated using some practical examples.
  • 26 October 2018
    Confidence intervals for the mean and a percentile based on zero-inflated lognormal data
    Md Sazib Hasan
    Abstract: The problems of estimating the mean and an upper percentile of a lognormal population with nonnegative values are considered. For estimating the mean of such a population based on data that include zeros, a simple confidence interval (CI), obtained by modifying Tian’s [Inferences on the mean of zero-inflated lognormal data: the generalized variable approach. Stat Med. 2005;24:3223-3232] generalized CI, is proposed. A fiducial upper confidence limit (UCL) and a closed-form approximate UCL for an upper percentile are developed. Our simulation studies indicate that the proposed methods are very satisfactory in terms of coverage probability and precision, and better than existing methods for maintaining balanced tail error rates. The proposed CI and the UCL are simple and easy to calculate. All the methods considered are illustrated using samples of data involving airborne chlorine concentrations and data on diagnostic test costs.

Spring 2018

During the Spring 2018 semester we will meet on Fridays from 2:00-3:50 in Maxim Doucet Hall room 211.

  • 23 February 2018
    An Introduction to Circular Statistics
    Sungsu Kim
     
    Abstract: In many diverse scientific fields, the measurements are directions. Examples are directions of flight of a bird in Biology, of wind in Meteorology, of protein folding in Bioinformatics, of knee flexion in Medicine, etc. In this first talk of a series on Circular Statistics, I will start with the unique features and challenges of dealing with circular data, then discuss some summary measures in Circular Statistics.
  • 9 March 2018
    Probability Models for Circular Data
    Sungsu Kim
     
    Abstract: In this talk, I will first go over some of the methods by which one can construct a circular probability distribution and summarize some common distributions used in Circular Statistics. Then, I will discuss some properties of the asymmetric generalized von Mises (AGvM) distribution proposed in Kim and SenGupta (2013). A real data example will be provided to illustrate the practical utility of the AGvM distribution.
  • 16 March 2018
    Inferences for a Skew-Normal Distribution
    Phontita Thiuthad
     
    Abstract: The three-parameter Skew-Normal distribution (SND), which is an interesting generalization of the usual two-parameter normal distribution, is getting a lot of attention lately due to its flexibility to accommodate both positively skewed and negatively skewed shapes, apart from the symmetric shape, to model various types of datasets. Though a lot of work has been done on characterizing the SND and deriving many of its distributional properties, relatively little effort has been devoted to inferences on the model parameters due to some intrinsic complexities. In this talk we will focus on estimation of the location parameter along with some other interesting results.
  • 23 March 2018
    Hierarchical Bayesian Models for Continuous and Positively Skewed Data From Small Areas
    Binod Manandhar
    University of Houston
     
    Abstract: The log-transformation is widely used to deal with skewed data; however, it can be problematic due to the back transformation. In this talk, I will present hierarchical Bayesian models for continuous and positively skewed random variables without logarithmic transformation using three distributions: exponential, gamma and generalized gamma. In these models, a second order Taylor series Laplace approximation is used to ease computational difficulties due to complex forms of the posterior and conditional posterior distributions. The utility of the proposed models will be illustrated using the generalized gamma model applied to small area estimations in the Nepal census data.
  • 13 April 2018
    Memory properties of transformations of linear processes
    Yongli Sang
     
    Abstract: We study the memory properties of transformations of linear processes. Dittmann and Granger (2002) studied the polynomial transformations of Gaussian FARIMA(0,d,0) processes by applying the orthonormality of the Hermite polynomials under the measure for the standard normal distribution. Nevertheless, the orthogonality does not hold for transformations of non-Gaussian linear processes. Instead, we use the decomposition developed by Ho and Hsing (1996, 1997) to study the memory properties of nonlinear transformations of linear processes, which include the FARIMA(p,d,q) processes, and obtain results consistent with those in the Gaussian case. In particular, for stationary processes, the transformations of short-memory time series still have short memory, and the transformations of long-memory time series may have different, weaker memory parameters which depend on the power rank of the transformation. On the other hand, the memory properties of transformations of non-stationary time series may not depend on the power ranks of the transformations. This study has applications in econometrics and financial data analysis when the time series observations have non-Gaussian heavy tails.
  • 20 April 2018
    R Markdown
    Thu Nguyen
     
    Abstract: R Markdown provides an authoring framework for data science and statistics. You can use a single R Markdown file to save and execute code and to generate high-quality reports that can be shared with an audience. R Markdown supports multiple languages including R, Python, and SQL, as well as dozens of static and dynamic output formats including HTML, PDF, MS Word, Beamer, HTML5 slides, Tufte-style handouts, books, dashboards, shiny applications, scientific articles, websites, and more.