Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
Session: Statistical Learning
Time: Friday, 15/Mar/2024, 9:00am - 10:15am

Session Chair: Johannes Lederer
Location: Theatre Hall (Delft X)

Building 37, Mekelweg 8, NL-2628 CD Delft

Presentations
9:00am - 9:25am

A Wasserstein perspective of Vanilla GANs

Lea Kunkel, Mathias Trabs

Karlsruhe Institute of Technology, Germany

The empirical success of Generative Adversarial Networks (GANs) has caused increasing interest in theoretical research. The statistical literature mainly focuses on Wasserstein GANs and generalizations thereof, which in particular allow for good dimension-reduction properties. Statistical results for Vanilla GANs, the original optimization problem, are still rather limited and require assumptions such as smooth activation functions and equal dimensions of the latent space and the ambient space. To bridge this gap, we draw a connection from Vanilla GANs to the Wasserstein distance. By doing so, problems caused by the Jensen-Shannon divergence can be avoided and existing results for Wasserstein GANs can be extended to Vanilla GANs. In particular, we obtain an oracle inequality for Vanilla GANs in the Wasserstein distance. The assumptions of this oracle inequality are designed to be satisfied by network architectures commonly used in practice, such as feedforward ReLU networks. Using Hölder-continuous ReLU networks, we derive a rate of convergence for estimating probability distributions.
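For orientation, the two objects that the talk connects can be written in generic notation (standard textbook background, not the authors' specific formulation): the Vanilla GAN objective over a generator g and discriminator d, and the Wasserstein-1 distance in its Kantorovich-Rubinstein dual form,

\[
\min_{g}\max_{d}\; \mathbb{E}_{X\sim P}\bigl[\log d(X)\bigr] + \mathbb{E}_{Z\sim P_Z}\bigl[\log\bigl(1-d(g(Z))\bigr)\bigr],
\qquad
W_1(P,Q) \;=\; \sup_{\mathrm{Lip}(f)\le 1}\; \mathbb{E}_{X\sim P}\bigl[f(X)\bigr] - \mathbb{E}_{Y\sim Q}\bigl[f(Y)\bigr],
\]

where P is the data distribution and P_Z the latent distribution. At the optimal discriminator, the Vanilla GAN objective reduces (up to an affine transformation) to the Jensen-Shannon divergence between P and the law of g(Z); this is the quantity whose difficulties the Wasserstein connection is meant to avoid.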



9:25am - 9:50am

Asymptotic Theory for Constant Step Size Stochastic Gradient Descent

Jiaqi Li [1], Zhipeng Lou [2], Stefan Richter [3], Wei Biao Wu [4]

[1] Washington University in St. Louis; [2] University of Pittsburgh; [3] Heidelberg University; [4] University of Chicago

We investigate the statistical behavior of Stochastic Gradient Descent (SGD) with constant step size under the framework of iterated random functions. Unlike previous studies that establish the convergence of SGD in terms of probability measures, e.g., in Wasserstein distance, our approach provides convergence in Euclidean distance by showing the Geometric Moment Contraction (GMC) of SGD. This new convergence result can address the non-stationarity of SGD due to fixed initial points and provides a more refined asymptotic analysis of SGD. Specifically, we prove a quenched central limit theorem and a quenched invariance principle for averaged SGD (ASGD), regardless of the initial points. Furthermore, we provide a novel perspective on the impact of step sizes in SGD by studying its derivative with respect to the step size. The existence of stationary solutions for the first and second derivative processes is shown under mild conditions. Subsequently, we utilize multiple step sizes and show an enhanced Richardson-Romberg extrapolation with an improved bias representation, which brings ASGD estimates closer to the global optimum. Finally, we propose a new online inference method and a bias-reduced variant for the extrapolated ASGD. Numerical experiments show that the resulting empirical confidence intervals have asymptotically correct coverage probabilities.
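As a rough illustration of the extrapolation step mentioned above, the sketch below implements the classical two-step-size Richardson-Romberg idea for averaged SGD: run ASGD with step sizes gamma and 2*gamma and combine the averages so that the leading O(gamma) bias cancels. It is a minimal sketch under assumed toy settings (quadratic objective, noise level, step sizes), not the authors' enhanced extrapolation or their online inference procedure.

import numpy as np

def asgd(grad, theta0, gamma, n_steps, rng):
    """Constant-step-size SGD with Polyak-Ruppert averaging."""
    theta = theta0.astype(float).copy()
    avg = np.zeros_like(theta)
    for t in range(1, n_steps + 1):
        theta -= gamma * grad(theta, rng)   # one stochastic gradient step
        avg += (theta - avg) / t            # running average of the iterates
    return avg

def rr_extrapolated_asgd(grad, theta0, gamma, n_steps, seed=0):
    """Two-step-size Richardson-Romberg extrapolation: the leading O(gamma)
    bias of the ASGD limit cancels in 2 * avg(gamma) - avg(2 * gamma)."""
    avg_small = asgd(grad, theta0, gamma, n_steps, np.random.default_rng(seed))
    avg_large = asgd(grad, theta0, 2 * gamma, n_steps, np.random.default_rng(seed + 1))
    return 2 * avg_small - avg_large

# Toy illustration (hypothetical setup): noisy quadratic with optimum theta* = (1, -2).
if __name__ == "__main__":
    theta_star = np.array([1.0, -2.0])
    grad = lambda theta, rng: (theta - theta_star) + 0.1 * rng.standard_normal(theta.shape)
    print(rr_extrapolated_asgd(grad, np.zeros(2), gamma=0.05, n_steps=20_000))

The combination 2*avg(gamma) - avg(2*gamma) is the standard Richardson-Romberg correction; the talk's contribution concerns a refined bias representation and inference for this type of estimator.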



9:50am - 10:15am

A Continuous-time Stochastic Gradient Descent Method for Continuous Data

Kexin Jin [1], Jonas Latz [2], Chenguang Liu [3], Carola-Bibiane Schönlieb [4]

[1] Princeton University; [2] University of Manchester; [3] TU Delft; [4] University of Cambridge

Optimization problems with continuous data appear in, e.g., robust machine learning, functional data analysis, and variational inference. Here, the target function is an integral of a family of (continuously) indexed target functions, integrated with respect to a probability measure. Such problems can often be solved by stochastic optimization methods: performing optimization steps for the indexed target function with randomly switched indices. In this talk, we will discuss a continuous-time variant of the stochastic gradient descent algorithm for optimization problems with continuous data. The stochastic gradient process consists of a gradient flow minimizing an indexed target function, coupled with a continuous-time process determining the index. Index processes are, e.g., reflected diffusions or pure jump processes on compact spaces. Thus, we study multiple sampling patterns for the continuous data space and allow for data that are simulated or streamed at runtime of the algorithm. We analyze the approximation properties of the stochastic gradient process and study its long-time behavior and ergodicity under constant and decreasing learning rates.
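In generic notation (a sketch of the setting, not the authors' exact formulation), the objective with continuous data and the associated stochastic gradient process can be written as

\[
F(\theta) \;=\; \int_{U} f_u(\theta)\,\pi(\mathrm{d}u),
\qquad
\mathrm{d}\theta_t \;=\; -\,\eta(t)\,\nabla_\theta f_{U_t}(\theta_t)\,\mathrm{d}t,
\]

where \pi is a probability measure on the index (data) space U, (U_t)_{t\ge 0} is a continuous-time index process such as a reflected diffusion or a pure jump process on U, and \eta(t) is a constant or decreasing learning rate.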



 
Conference: SMSA 2024