Optimal Trade-Offs in Distributed Optimization for Large-Scale Learning


At a glance

For many tasks critical to autonomous driving, such as object detection and image segmentation from video images, the most successful current approaches involve learning rich prediction rules from large datasets. As the size of these datasets and the complexity of these prediction rules increases, the time spent solving the optimization problems associated with training dominates the development time in these applications. There is a significant challenge in finding efficient methods for parameter optimization that can effectively exploit the availability of multiple processors.

The aims of this research project are to understand the trade-offs between design choices in distributed optimization methods for large-scale convex and non-convex optimization problems (such as those that arise in autonomous driving applications), to develop algorithms that can achieve the best possible tradeoffs, and to develop roof-of-concept implementations of these algorithms.

In this project, we will investigate theoretically the performance of methods for distributed optimization. Our aim is to quantify the relationships between computation, communication, and statistical performance. In particular, we wish to elucidate the optimal trade-offs, and to develop algorithms that achieve this optimal performance. We will mainly investigate a variety of distributed stochastic gradient algorithms, explore the performance guarantees, and develop lower bounds that demonstrate the limits on these trade-offs. We expect that information theoretic tools will be useful here for understanding both communication complexity and convergence rates.

principal investigatorsresearchersthemes
Peter Bartlett
Kannan Ramchandran
 Machine Learning
Distributed Optimization