Local organizing committee
- Jérôme Bolte (UT1 and TSE)
- Sonia Cafieri (ENAC)
- Olivier Cots (INP-ENSEEIHT and IRIT)
- Victor Magron (LAAS-CNRS)
- Pierre Maréchal (UPS and IMT)
- Edouard Pauwels (UPS and IRIT)
- Aude Rondepierre (INSA and IMT)
- Emmanuel Soubies (IRIT and CNRS)
SPOT 83 – Monday, June 12, 2023 (Amphi A002, ENSEEIHT)
14h – Zheng Qu (University of Hong Kong)
15h – Hippolyte Labarrière (INSA Toulouse)
SPOT 82 – Monday, May 22, 2023 (ENSEEIHT thesis defense room)
Obtaining guarantees on the convergence of the minimizers of empirical risks to those of the true risk is a fundamental question in statistical learning. Instead of deriving guarantees on the usual estimation error, we will explore concentration inequalities on the distance between the sets of minimizers of the two risks. We will argue that for a broad spectrum of estimation problems, there exists a regime where optimal concentration rates can be proven. The bounds will be showcased on a selection of estimation problems, such as barycenters on metric spaces with positive or negative curvature and subspaces of covariance matrices.
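To give a purely illustrative flavour of such concentration results, here is a small simulation restricted to the Euclidean case, where the barycenter of a distribution is simply its mean and the empirical barycenter is the sample mean; the distribution, dimension and sample sizes are assumptions made for illustration only.

```python
import numpy as np

# Toy illustration (Euclidean case only): the barycenter of a distribution
# in R^d is its mean, the empirical barycenter is the sample mean, and the
# distance between the two shrinks at the parametric rate n^{-1/2}.
rng = np.random.default_rng(0)
d = 5
true_barycenter = np.zeros(d)
for n in (100, 1_000, 10_000):
    dists = [np.linalg.norm(rng.standard_normal((n, d)).mean(axis=0) - true_barycenter)
             for _ in range(100)]
    print(f"n = {n:>6}: average distance to the true barycenter ~ {np.mean(dists):.4f}")
```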
SPOT 81 – Monday, April 3, 2023 (ENSEEIHT thesis defense room)
14h – Samuel Vaiter (Université Côte d'Azur)
A very biased introduction to bilevel optimization in machine learning
Bilevel optimization (i.e., the problem of minimizing a value function that involves the arg-minimum of another function) appears in many areas of machine learning. I will highlight some current ideas on how to solve such problems in the context of large-scale empirical risk minimisation, how to prove their convergence rates, and how they perform in practical applications. Joint work with P. Ablin, M. Dagréou and T. Moreau.
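For readers unfamiliar with the setting, here is a minimal sketch of a bilevel problem and of a hypergradient computed by implicit differentiation. The inner problem (ridge regression) and the outer problem (validation loss over the regularisation parameter) are toy choices of mine, not the examples of the talk.

```python
import numpy as np

# Toy bilevel problem: the inner problem is ridge regression, the outer
# problem is the validation loss, minimised over the regularisation
# parameter lam with the hypergradient given by implicit differentiation.
rng = np.random.default_rng(0)
A_tr, b_tr = rng.standard_normal((50, 10)), rng.standard_normal(50)
A_val, b_val = rng.standard_normal((30, 10)), rng.standard_normal(30)

def inner_solution(lam):
    """x*(lam) = argmin_x 0.5*||A_tr x - b_tr||^2 + 0.5*lam*||x||^2."""
    H = A_tr.T @ A_tr + lam * np.eye(A_tr.shape[1])
    return np.linalg.solve(H, A_tr.T @ b_tr), H

def hypergradient(lam):
    x_star, H = inner_solution(lam)
    dx_dlam = -np.linalg.solve(H, x_star)        # implicit function theorem
    residual = A_val @ x_star - b_val
    return residual @ (A_val @ dx_dlam)          # d/dlam of the outer loss

lam = 1.0
for _ in range(100):                             # plain gradient descent on lam
    lam = max(lam - 0.1 * hypergradient(lam), 1e-6)
print("selected regularisation parameter:", round(float(lam), 4))
```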
15h – Mathieu Serrurier (Institut de Recherche en Informatique de Toulouse)
Building and training 1-Lipschitz Neural Networks by using Optimal Transport
The lack of robustness and explainability in neural networks is directly linked to the arbitrarily high Lipschitz constant of deep models. Although constraining the Lipschitz constant has been shown to improve these properties, it can make it challenging to learn with classical loss functions. In this presentation, we explain how to control this constant, and demonstrate that training such networks requires defining specific loss functions and optimization processes. To this end, we propose a loss function based on optimal transport that not only certifies robustness but also converts adversarial examples into provable counterfactual examples.
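As a point of reference (and not the speakers' exact construction, which relies on dedicated architectures and an optimal-transport-based loss), here is one common way to make a single dense layer 1-Lipschitz: rescale its weight matrix by its spectral norm so that the layer can never expand distances.

```python
import numpy as np

# Minimal sketch: a dense ReLU layer made 1-Lipschitz by spectral
# normalisation of its weight matrix (not the speakers' exact method).
rng = np.random.default_rng(0)

def lipschitz_dense(x, W, b):
    W_1lip = W / np.linalg.norm(W, 2)           # divide by the largest singular value
    return np.maximum(W_1lip @ x + b, 0.0)      # ReLU preserves the 1-Lipschitz bound

W, b = rng.standard_normal((16, 8)), np.zeros(16)
x, y = rng.standard_normal(8), rng.standard_normal(8)
gap_out = np.linalg.norm(lipschitz_dense(x, W, b) - lipschitz_dense(y, W, b))
print(gap_out <= np.linalg.norm(x - y) + 1e-9)  # True: the layer is non-expansive
```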
SPOT 80 – Monday, March 6, 2023 (ENSEEIHT thesis defense room)
14h – Henrique Goulart (INP-ENSEEIHT)
15h – Mathurin Massias (Inria Lyon)
SPOT 79 – Monday, February 6, 2023 (ENSEEIHT thesis defense room)
14h – Luca Calatroni (Laboratoire I3S, Sophia Antipolis)
Scaled, inexact and adaptive FISTA for (strongly) convex optimisation
We design an inexact, scaled and generalised Fast Iterative Soft-Thresholding Algorithm (FISTA) for minimising the sum of two (possibly strongly) convex functions. Inexactness is explicitly taken into account to describe situations where proximal operators cannot be evaluated in closed form. The use of data-dependent scaling allows Newton-type information to be incorporated along the optimisation via suitable variable-metric updates. Finally, non-monotone backtracking is used to improve convergence speed. Linear convergence of the function values is proved, with rates depending on the backtracking/scaling and strong convexity parameters. The performance of the proposed algorithm, named SAGE-FISTA, is validated on exemplar imaging problems where sparsity-promoting (ℓ1, TV) regularisation is combined with standard data-fidelity terms. The results show improved performance and efficiency under a limited computational budget.
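For reference, here is a minimal sketch of the plain FISTA iteration for the ℓ1-regularised least-squares problem (the vanilla scheme only, without the scaling, inexactness and backtracking features of SAGE-FISTA described above).

```python
import numpy as np

# Vanilla FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista(A, b, lam, n_iter=200):
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(z - A.T @ (A @ z - b) / L, lam / L)  # proximal gradient step
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        z = x_new + (t - 1) / t_new * (x_new - x)                   # inertial extrapolation
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 100)), rng.standard_normal(40)
print("nonzeros in the FISTA solution:", np.count_nonzero(np.round(fista(A, b, 0.1), 6)))
```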
15h – Igor Klep (University of Ljubljana)
Noncommutative polynomial optimization: a tutorial
Optimization problems involving polynomial data arise across many areas of modern science and engineering. Due to recent advances, such as the Lasserre sum of squares hierarchy, these can nowadays be efficiently solved using semidefinite programming (SDP).
Several areas of science, for instance control theory, quantum information theory and quantum chemistry, regularly optimize functions involving polynomial data in noncommuting variables, such as matrices or operators.
The talk will present, in a self-contained style with lots of examples, some first steps in noncommutative polynomial optimization. Topics covered will include sums of hermitian squares, Positivstellensätze, and convergent SDP hierarchies.
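As background for the commutative case that the tutorial generalizes, here is a minimal sketch (assuming the cvxpy package) of a sum-of-squares certificate computed by semidefinite programming for a univariate polynomial; the polynomial and the Gram parametrization are illustrative choices of mine.

```python
import cvxpy as cp

# Certify a lower bound on p(x) = x^4 - 2x^2 + 1 = (x^2 - 1)^2 by asking
# that p(x) - t be a sum of squares, i.e. p(x) - t = z^T Q z with
# z = [1, x, x^2] and Q positive semidefinite.
Q = cp.Variable((3, 3), PSD=True)
t = cp.Variable()
constraints = [
    Q[0, 0] == 1 - t,              # constant coefficient
    2 * Q[0, 1] == 0,              # coefficient of x
    Q[1, 1] + 2 * Q[0, 2] == -2,   # coefficient of x^2
    2 * Q[1, 2] == 0,              # coefficient of x^3
    Q[2, 2] == 1,                  # coefficient of x^4
]
prob = cp.Problem(cp.Maximize(t), constraints)
prob.solve()
print("certified lower bound:", round(float(t.value), 4))  # ~0, the global minimum
```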
SPOT 78 – Monday, January 9, 2023 (ENSEEIHT thesis defense room)
14h – Jean-Baptiste Hiriart-Urruty (IMT, Univ. Paul Sabatier Toulouse)
More on second-order differentiability properties of the Moreau regularization-approximation of a convex function
In this work, which constitutes the matter of a paper submitted for publication in Optimization Methods and Software, we revisit and improve second-order (classical) differentiability properties of the so-called Moreau regularized version of an (extended-valued) convex function. We delineate what to expect and what can be obtained. Our results are sharp, as several examples show. See also abstract-jbhu-09-jan-23.
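As a quick numerical reminder of the object in question (my own toy example, not taken from the paper): the Moreau envelope of f(u) = |u| with parameter t is the Huber function, which is differentiable everywhere although |u| is not differentiable at 0.

```python
import numpy as np

# Moreau envelope e_t(x) = min_u |u| + (x - u)^2 / (2 t) of f(u) = |u|.
# The minimiser is the soft-thresholding (proximal) operator of |.|.
def moreau_envelope_abs(x, t):
    u = np.sign(x) * np.maximum(np.abs(x) - t, 0.0)   # prox_{t|.|}(x)
    return np.abs(u) + (x - u) ** 2 / (2 * t)

xs = np.linspace(-2.0, 2.0, 9)
print(np.round(moreau_envelope_abs(xs, t=0.5), 3))
# Near x = 0 the values follow x^2 / (2 t): the kink of |x| has been smoothed out.
```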
15h – Pierre Boudier (lead of computer hardware and software architecture at Intel)
Multi-precision linear algebra acceleration for machine learning
Machine learning has grown considerably over the past ten years, largely thanks to the availability of hardware that accelerates its many linear-algebra computations. In this presentation, we will give an overview of how a GPU works and of how matrix multiplication can be implemented at different levels of precision. Classical techniques such as cache locality, hardware multi-threading, SIMD and SIMT units, systolic arrays, and prefetching will be covered.
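To make the tiling and mixed-precision ideas concrete, here is a toy CPU sketch (my own illustration, not production GPU code): the matrices are stored in half precision, processed tile by tile as if each tile had to fit in fast local memory, and the tile products are accumulated in single precision.

```python
import numpy as np

# Toy tiled matrix multiplication: half-precision storage, tile-by-tile
# processing, single-precision accumulation.
def tiled_matmul(A, B, tile=32):
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=np.float32)            # float32 accumulator
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                a = A[i:i+tile, p:p+tile].astype(np.float32)
                b = B[p:p+tile, j:j+tile].astype(np.float32)
                C[i:i+tile, j:j+tile] += a @ b        # one tile-level product
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((128, 128)).astype(np.float16)   # half-precision storage
B = rng.standard_normal((128, 128)).astype(np.float16)
C = tiled_matmul(A, B)
print(np.allclose(C, A.astype(np.float32) @ B.astype(np.float32), atol=1e-2))
```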
SPOT 77 – Monday, November 28, 2022 (ENSEEIHT thesis defense room)
Organized by Cédric Févotte (IRIT-CNRS Toulouse) in coordination with the SPOT committee.
15h – Rémi Gribonval (INRIA & ENS Lyon)
Rapture of the deep: highs and lows of sparsity in a world of depths
Promoting sparse connections in neural networks is a natural way to control their computational complexity. Moreover, given its thoroughly documented role in inverse problems and variable selection, sparsity also has the potential to give rise to learning mechanisms endowed with certain interpretability guarantees. Through an overview of recent explorations around this theme, I will compare and contrast classical sparse regularization for inverse problems with multilayer sparse regularization. During our journey I will highlight the potential of an invariant path-embedding of the parameters of a deep network, both to learn the parameters and to analyze their identifiability from the function implemented by the network. In the process, we will be reminded that there is life beyond gradient descent, as illustrated by an algorithm that brings speedups of up to two orders of magnitude when learning certain fast transforms via multilayer sparse factorization.
16h – Matthieu Kowalski (Univ. Paris-Sud)
Use a Lasso to grab Champagne by looking at the MAP with Bernoulli and Gauss
Sparse coding is now one of the state-of-the-art approaches for solving inverse problems. We discuss two Maximum a Posteriori (MAP) interpretations of state-of-the-art methods used in sparse inverse problems: the joint-MAP and the marginal-MAP. Canonically rooted in a Bayesian framework, sparsity is modeled by a general spike-and-slab distribution. The focus is on recovering the solution's support rather than the signal amplitude itself. We study the prominent Bernoulli-Gaussian model and show that a well-reparametrized joint-MAP may lead to good local minimizers of the marginal-MAP. Additionally, we explore common convex relaxations of the support onto a continuous space and encompass them under the scope of a parametrized distribution. After describing the behavior of a few such relaxations, equivalences are established between the Bernoulli-Gaussian joint-MAP, the marginal-MAP, and well-studied methods such as the Lasso and Sparse Bayesian Learning. These links allow us to revisit the CHAMPAGNE procedure as a reweighted Lasso procedure. In the second part, we will discuss one of the major drawbacks of these methods, namely the tuning of the so-called hyperparameter. We provide an Expectation-Maximization (EM) algorithm to estimate the parameters of the Bernoulli-Gaussian model for denoising a sparse signal corrupted by white Gaussian noise. Then, building on the Expectation-Maximization interpretation of the Iterative Shrinkage/Thresholding Algorithm (ISTA), we provide a simple iterative algorithm to blindly estimate all the model parameters in the linear inverse problem context.
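As a scalar illustration of the support-recovery viewpoint (my own toy example, not the talk's derivations), consider Bernoulli-Gaussian denoising: the marginal-MAP estimate of the support compares the two marginal likelihoods of each observation, and the amplitude on the detected support is the usual Wiener shrinkage.

```python
import numpy as np

# Bernoulli-Gaussian denoising of y = x + noise, noise ~ N(0, s2), where
# x = 0 with probability 1 - p and x ~ N(0, sx2) with probability p.
def marginal_map_denoise(y, p=0.1, sx2=1.0, s2=0.04):
    def gauss(v, var):
        return np.exp(-v ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    on = p * gauss(y, sx2 + s2)          # marginal likelihood if the spike is "on"
    off = (1 - p) * gauss(y, s2)         # marginal likelihood if the spike is "off"
    support = on > off                   # marginal-MAP estimate of the support
    return support * (sx2 / (sx2 + s2)) * y   # Wiener shrinkage on the support

rng = np.random.default_rng(0)
x = rng.standard_normal(20) * (rng.random(20) < 0.1)   # sparse ground truth
y = x + 0.2 * rng.standard_normal(20)                  # noisy observation
print("true support:     ", np.flatnonzero(x))
print("estimated support:", np.flatnonzero(marginal_map_denoise(y)))
```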
SPOT 76 – Monday, November 14, 2022 (ENSEEIHT thesis defense room)
14h – Jérôme Bolte (TSE and Univ. Toulouse Capitole)
A Glance at Nonsmooth Automatic differentiation
Nonsmooth automatic differentiation is one of the core learning mechanisms in modern AI. I will show how a recent theory we developed, "conservative gradients", helps to understand this process and related fundamental phenomena, such as the convergence of learning phases in deep learning, the optimization of learning parameters, the nonsmooth cheap gradient principle, and the differentiation of algorithms. Joint work with E. Pauwels.
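A hand-coded toy example of the phenomenon at stake (my own illustration, not the talk's formalism): two formulas defining the same function, relu(x) = max(x, 0) and relu(x) = (x + |x|)/2, are differentiated by the usual elementwise conventions into maps that disagree at the kink x = 0; conservative gradients provide a calculus in which both outputs remain valid.

```python
# Two implementations of ReLU with their "automatic" derivatives,
# obtained by applying standard derivative conventions to each formula.
def relu_and_grad_v1(x):
    # convention: d/dx max(x, 0) = 1 if x > 0, else 0
    return max(x, 0.0), (1.0 if x > 0 else 0.0)

def relu_and_grad_v2(x):
    # convention: d/dx |x| = sign(x), with sign(0) = 0
    sign = float(x > 0) - float(x < 0)
    return (x + abs(x)) / 2, (1.0 + sign) / 2

print(relu_and_grad_v1(0.0))   # (0.0, 0.0)
print(relu_and_grad_v2(0.0))   # (0.0, 0.5): a different "derivative" at the kink
```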
15h – Didier Henrion (LAAS-CNRS Toulouse and Czech Tech. Univ. Prague)
The Moment-SOS Hierarchy
The polynomial optimization problem (POP) is a very general problem that seeks to minimize a polynomial in many real variables subject to polynomial inequality constraints. A special case is the problem of finding real solutions of a system of polynomial equalities and inequalities. This NP-hard problem has many applications in fields such as statistics, signal processing, machine learning, computer vision, computational geometry, and control engineering. The Moment-SOS hierarchy is an approach to the POP that allows it to be solved globally at the price of solving a family of convex (semidefinite) optimization problems of increasing size. The lecture will introduce the approach, describe its main milestones over the last two decades, as well as applications in statistics, signal processing and control. The focus will be on the computational features of the Moment-SOS hierarchy, its limitations, and current efforts to overcome them. Mostly joint work with J. B. Lasserre.
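To fix ideas on the "moment" side of the hierarchy, here is a minimal sketch (assuming the cvxpy package; the polynomial is my own example) of the first moment relaxation for minimizing p(x) = x^4 - 2x^2 + 1 over the real line: the entries of a positive semidefinite Hankel moment matrix stand for the pseudo-moments of a candidate minimizing measure.

```python
import cvxpy as cp

# First moment relaxation for min_x p(x) with p(x) = x^4 - 2x^2 + 1:
# the (i, j) entry of the moment matrix M represents the pseudo-moment
# y_{i+j}, so M must be PSD and Hankel, with y_0 = 1.
M = cp.Variable((3, 3), PSD=True)
constraints = [M[0, 0] == 1,          # y_0 = 1: mass-one measure
               M[0, 2] == M[1, 1]]    # Hankel structure: both entries equal y_2
objective = cp.Minimize(M[2, 2] - 2 * M[1, 1] + M[0, 0])   # "expected value" of p
prob = cp.Problem(objective, constraints)
prob.solve()
print("moment relaxation lower bound:", round(float(prob.value), 4))  # ~0
```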