While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that th ...
In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...
In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
Distributed learning is the key for enabling training of modern large-scale machine learning models, through parallelising the learning process. Collaborative learning is essential for learning from privacy-sensitive data that is distributed across various ...
Modern optimization is tasked with handling applications of increasingly large scale, chiefly due to the massive amounts of widely available data and the ever-growing reach of Machine Learning. Consequently, this area of research is under steady pressure t ...
Quantum support vector machines employ quantum circuits to define the kernel function. It has been shown that this approach offers a provable exponential speedup compared to any known classical algorithm for certain data sets. The training of such models c ...
Verein Forderung Open Access Publizierens Quantenwissenschaf2024
In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two layer feedforward linear network with a diagonal ...
Within the context of contemporary machine learning problems, efficiency of optimization process depends on the properties of the model and the nature of the data available, which poses a significant problem as the complexity of either increases ad infinit ...
Non-convex constrained optimization problems have become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large- scale semidefinite programs (SDPs), and various other tasks. As the perfor ...
The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards the pursuit of construc ...
Control systems operating in real-world environments often face disturbances arising from measurement noise and model mismatch. These factors can significantly impact the perfor- mance and safety of the system. In this thesis, we aim to leverage data to de ...
In this article, we study the problem of Byzantine fault-tolerance in a federated optimization setting, where there is a group of agents communicating with a centralized coordinator. We allow up to f Byzantine-faulty agents, which may not follow a prescr ...
We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this an ...
Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrain ...
The variational approach is a cornerstone of computational physics, considering both conventional and quantum computing computational platforms. The variational quantum eigensolver algorithm aims to prepare the ground state of a Hamiltonian exploiting para ...
A multi-agent system consists of a collection of decision-making or learning agents subjected to streaming observations from some real-world phenomenon. The goal of the system is to solve some global learning or optimization problem in a distributed or dec ...
We propose ordering-based approaches for learning the maximal ancestral graph (MAG) of a structural equation model (SEM) up to its Markov equivalence class (MEC) in the presence of unobserved variables. Existing ordering-based methods in the literature rec ...
Association for the Advancement of Artificial Intelligence (AAAI)2023
In this paper, we present a spatial branch and bound algorithm to tackle the continuous pricing problem, where demand is captured by an advanced discrete choice model (DCM). Advanced DCMs, like mixed logit or latent class models, are capable of modeling de ...
Deep neural networks have become ubiquitous in today's technological landscape, finding their way in a vast array of applications. Deep supervised learning, which relies on large labeled datasets, has been particularly successful in areas such as image cla ...
Neural machine translation (MT) and text generation have recently reached very high levels of quality. However, both areas share a problem: in order to reach these levels, they require massive amounts of data. When this is not present, they lack generaliza ...