Workshop - April 2016

Working day: Hypothesis testing and mixtures

April 5, 2016

Institut de Mathématiques de Toulouse, Amphi Schwartz (1R3)

A day of talks on the theme "hypothesis testing and mixture models" will be held on Tuesday, April 5, 2016 at the Institut de Mathématiques de Toulouse, as part of the ANR MixStatSeq project.

Speakers:
Cristina Butucea
Judith Rousseau
Nicolas Verzelen
Clément Marteau

Organisers:
Béatrice Laurent
Clément Marteau
Cathy Maugis-Rabusseau

Programme:

9:00-9:30: Welcome

9:30-10:30: Talk by Cristina Butucea

Title: Mixture models with symmetric errors
Abstract: A semiparametric mixture of two populations with the same probability density and different locations can be identified and estimated under the assumption that the common probability density is symmetric. We use the identifiability results of Bordes et al. (2006) and propose a new estimation algorithm based on techniques from the theory of inverse problems. We then consider a semiparametric mixture of regression models and study its identifiability. We propose a procedure for estimating the mixing proportion and the location functions locally at a fixed point. Our estimation procedure is based on the symmetry of the errors' distribution and does not require finite moments of the errors. Under mild conditions, we establish minimax properties and asymptotic normality of our estimators. We study the finite-sample performance on synthetic data and on the positron emission tomography imaging data from the cancer study of Bowen et al. (2012).
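The two-population location mixture in the abstract can be illustrated with a minimal sketch. The talk's setting is semiparametric (the common symmetric density is unspecified); as a simplification the sketch below takes it to be standard Gaussian and fits the model by plain EM, which is not the inverse-problem algorithm of the talk:

```python
import numpy as np

def em_two_location_mixture(x, n_iter=200):
    """EM for f(x) = (1-eps)*phi(x - mu1) + eps*phi(x - mu2), with phi the
    standard normal density (a parametric simplification: the talk's model
    leaves the common symmetric density unspecified)."""
    phi = lambda z: np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # initialise the locations from the empirical quartiles
    mu1, mu2 = np.percentile(x, [25, 75])
    eps = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point comes from component 2
        w = eps * phi(x - mu2)
        w = w / (w + (1 - eps) * phi(x - mu1))
        # M-step: update the mixing proportion and both locations
        eps = w.mean()
        mu1 = np.sum((1 - w) * x) / np.sum(1 - w)
        mu2 = np.sum(w * x) / np.sum(w)
    return eps, mu1, mu2

rng = np.random.default_rng(0)
# sample from a 30/70 mixture of N(0,1) and N(3,1)
x = np.where(rng.random(5000) < 0.7,
             rng.normal(3.0, 1.0, 5000),
             rng.normal(0.0, 1.0, 5000))
eps, mu1, mu2 = em_two_location_mixture(x)
```

With well-separated components as here, the estimates land close to the true values (eps ≈ 0.7, mu1 ≈ 0, mu2 ≈ 3); the identifiability question addressed in the talk is precisely what guarantees such estimates are meaningful without knowing the error density.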

10:30-10:50: Coffee break

10:50-11:50: Talk by Judith Rousseau [Slides - Rousseau]

Title: Testing hypotheses via a mixture estimation model
Abstract: We consider a novel paradigm for Bayesian testing of hypotheses and Bayesian model comparison. Our alternative to the traditional construction of posterior probabilities that a given hypothesis is true, or that the data originate from a specific model, is to consider the models under comparison as components of a mixture model. We therefore replace the original testing problem with an estimation problem that focuses on the probability weight of a given model within the mixture. We analyze the sensitivity of the resulting posterior distribution of the weights to various prior models on the weights. We stress that a major appeal of this novel perspective is that generic improper priors are acceptable, without putting convergence in jeopardy. Among other features, this allows for a resolution of the Lindley-Jeffreys paradox. When using a reference Beta B(a,a) prior on the mixture weights, we note that the sensitivity of the posterior estimates of the weights to the choice of a vanishes as the sample size increases, and we advocate the default choice a = 0.5, derived from Rousseau and Mengersen (2011). Another feature of this easily implemented alternative to the classical Bayesian solution is that the speeds of convergence of the posterior mean of the weight and of the corresponding posterior probability are quite similar.
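The mixture-estimation view of testing can be sketched on a toy case. Below, two fully specified models f0 = N(0,1) and f1 = N(1,1) (hypothetical choices, not from the talk) are embedded as mixture components, a Beta(0.5, 0.5) prior is placed on the weight of f1 as the abstract advocates, and a simple Gibbs sampler returns the posterior mean weight:

```python
import numpy as np

def posterior_weight(x, n_sweep=2000, burn=500, a=0.5, seed=1):
    """Gibbs sampler for the mixture w*f1 + (1-w)*f0, where
    f0 = N(0,1) and f1 = N(1,1) are two fully specified toy models
    and w has a Beta(a, a) prior (default a = 0.5)."""
    rng = np.random.default_rng(seed)
    phi = lambda z: np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    f0, f1 = phi(x), phi(x - 1.0)
    w, draws = 0.5, []
    for t in range(n_sweep):
        # allocate each observation to f1 with its posterior probability
        p1 = w * f1 / (w * f1 + (1 - w) * f0)
        z = rng.random(x.size) < p1
        # conjugate update: w | z ~ Beta(a + #{f1}, a + #{f0})
        w = rng.beta(a + z.sum(), a + (~z).sum())
        if t >= burn:
            draws.append(w)
    return np.mean(draws)

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, 300)   # data generated under the f1 model
w_hat = posterior_weight(x)     # posterior mean weight of f1
```

When the data come from f1, the posterior weight of f1 concentrates near 1, so the estimated weight plays the role the posterior model probability plays in the classical Bayesian test.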

12:00-14:00: Lunch

14:00-15:00: Talk by Nicolas Verzelen [Slides - Verzelen]

Title: Sparse Gaussian mixture detection and unsupervised classification
Abstract: The aim of this talk is to compare the difficulty of two statistical problems: (i) supervised classification and (ii) unsupervised classification. To do so, we adopt a model-based approach, considering high-dimensional Gaussian mixture models. After a selective review of the literature on mixture models, we focus on optimal classification rates, in order to characterize the regimes in which consistent classification is achievable.

15:00-15:20: Coffee break

15:20-16:20: Talk by Clément Marteau [Slides - Marteau]

Title: Multidimensional two-component Gaussian mixture detection
Abstract: Let (X_1, …, X_n) be a d-dimensional i.i.d. sample from a distribution with density f. We consider the problem of detecting a two-component mixture: our aim is to decide whether f is the density of a standard Gaussian random d-vector (f = ϕ_d), or f is a two-component mixture f = (1−ε)ϕ_d + εϕ_d(· − μ), where (ε, μ) are unknown parameters. Optimal separation conditions on ε, μ, n and the dimension d are established, allowing both hypotheses to be separated with prescribed errors. Several testing procedures are proposed and two alternative subsets are considered.
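One naive detector for this problem (an illustrative baseline, not one of the procedures from the talk) exploits the fact that under the mixture alternative the population mean shifts from 0 to εμ: the statistic T = n‖X̄‖² is χ²_d-distributed under H0, so we reject when it exceeds the (1−α) quantile. A small Monte Carlo run checks level and power for hypothetical values of (n, d, ε, μ):

```python
import numpy as np
from scipy.stats import chi2

def mean_norm_test(x, alpha=0.05):
    """Reject H0: f = phi_d when T = n * ||mean(x)||^2 exceeds the
    chi-square(d) quantile; under the mixture alternative the sample
    mean drifts to eps*mu, which inflates T."""
    n, d = x.shape
    T = n * np.sum(x.mean(axis=0) ** 2)
    return T > chi2.ppf(1 - alpha, df=d)

rng = np.random.default_rng(0)
n, d, eps = 100, 5, 0.2
mu = np.zeros(d)
mu[0] = 2.0                 # hypothetical contamination shift

def rejection_rate(contaminated, reps=300):
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal((n, d))
        if contaminated:
            # move a Bernoulli(eps) fraction of rows to N(mu, I_d)
            x[rng.random(n) < eps] += mu
        hits += mean_norm_test(x)
    return hits / reps

level = rejection_rate(False)   # stays near alpha = 0.05 under H0
power = rejection_rate(True)    # detects the eps*mu mean shift
```

Such a mean-based statistic is blind to contaminations with εμ too small for the shift n‖εμ‖² to dominate the χ²_d noise level, which is exactly the regime the optimal separation conditions of the talk quantify in terms of ε, μ, n and d.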

Registration:

Registration is free. If you would like to attend, please send an email to cathy.maugis-at-insa-toulouse.fr with the subject « Inscription 5 avril », indicating whether you will join for lunch.

Getting to the IMT:

You will find all the information on how to get to the IMT here.