Publication

Almost surely constrained convex optimization

Publications associées (32)

On the Generalization of Stochastic Gradient Descent with Momentum

While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that th ...

Microtome Publishing2024

Understanding generalization and robustness in modern deep learning

Maksym Andriushchenko

In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...

EPFL2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...

EPFL2024

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Anastasiia Koloskova

Distributed learning is the key for enabling training of modern large-scale machine learning models, through parallelising the learning process. Collaborative learning is essential for learning from privacy-sensitive data that is distributed across various ...

EPFL2024

Scalable constrained optimization

Maria-Luiza Vladarean

Modern optimization is tasked with handling applications of increasingly large scale, chiefly due to the massive amounts of widely available data and the ever-growing reach of Machine Learning. Consequently, this area of research is under steady pressure t ...

EPFL2024

The complexity of quantum support vector machines

Stefan Woerner, Gian Florin Gentinetta

Quantum support vector machines employ quantum circuits to define the kernel function. It has been shown that this approach offers a provable exponential speedup compared to any known classical algorithm for certain data sets. The training of such models c ...

Verein Forderung Open Access Publizierens Quantenwissenschaf2024

Deep Learning Theory Through the Lens of Diagonal Linear Networks

Scott William Pesme

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of

2

-layer diagonal linear networks. This rudimentary architecture, which consists of a two layer feedforward linear network with a diagonal ...

EPFL2024

Universal and adaptive methods for robust stochastic optimization

Ali Kavis

Within the context of contemporary machine learning problems, efficiency of optimization process depends on the properties of the model and the nature of the data available, which poses a significant problem as the complexity of either increases ad infinit ...

EPFL2023

Augmented Lagrangian Methods for Provable and Scalable Machine Learning

Mehmet Fatih Sahin

Non-convex constrained optimization problems have become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large- scale semidefinite programs (SDPs), and various other tasks. As the perfor ...

EPFL2023

Probabilistic methods for neural combinatorial optimization

Nikolaos Karalias

The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards the pursuit of construc ...

EPFL2023

Data-Driven Control and Optimization under Noisy and Uncertain Conditions

Baiwei Guo

Control systems operating in real-world environments often face disturbances arising from measurement noise and model mismatch. These factors can significantly impact the perfor- mance and safety of the system. In this thesis, we aim to leverage data to de ...

EPFL2023

Byzantine Fault-Tolerance in Federated Local SGD Under 2f-Redundancy

Nirupam Gupta

In this article, we study the problem of Byzantine fault-tolerance in a federated optimization setting, where there is a group of agents communicating with a centralized coordinator. We allow up to

f

Byzantine-faulty agents, which may not follow a prescr ...

Ieee-Inst Electrical Electronics Engineers Inc2023

End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Daniel Kuhn, Yves Rychener, Tobias Sutter

We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this an ...

2023

The statistical complexity of early-stopped mirror descent

Varun Kanade, Tomas Vaskevicius

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrain ...

Oxford Univ Press2023

Phenomenological theory of variational quantum ground-state preparation

Ivano Tavernelli, Giuseppe Carleo

The variational approach is a cornerstone of computational physics, considering both conventional and quantum computing computational platforms. The variational quantum eigensolver algorithm aims to prepare the ground state of a Hamiltonian exploiting para ...

Amer Physical Soc2023

Multi-agent Learning with Privacy Guarantees

Elsa Rizk

A multi-agent system consists of a collection of decision-making or learning agents subjected to streaming observations from some real-world phenomenon. The goal of the system is to solve some global learning or optimization problem in a distributed or dec ...

EPFL2023

Novel Ordering-based Approaches for Causal Structure Learning in the Presence of Unobserved Variables

We propose ordering-based approaches for learning the maximal ancestral graph (MAG) of a structural equation model (SEM) up to its Markov equivalence class (MEC) in the presence of unobserved variables. Existing ordering-based methods in the literature rec ...

Association for the Advancement of Artificial Intelligence (AAAI)2023

A Spatial Branch and Bound Algorithm for Continuous Pricing with Advanced Discrete Choice Demand Modeling

Michel Bierlaire

In this paper, we present a spatial branch and bound algorithm to tackle the continuous pricing problem, where demand is captured by an advanced discrete choice model (DCM). Advanced DCMs, like mixed logit or latent class models, are capable of modeling de ...

2023

Deep Learning Generalization with Limited and Noisy Labels

Mahsa Forouzesh

Deep neural networks have become ubiquitous in today's technological landscape, finding their way in a vast array of applications. Deep supervised learning, which relies on large labeled datasets, has been particularly successful in areas such as image cla ...

EPFL2023

Regularization Techniques for Low-Resource Machine Translation

Alejandro Ramírez Atrio

Neural machine translation (MT) and text generation have recently reached very high levels of quality. However, both areas share a problem: in order to reach these levels, they require massive amounts of data. When this is not present, they lack generaliza ...

EPFL2023