The “Deep Learning” reading-group is located in the open area of the Center for Data Science of NYU. See directions below. Please make sure to bring an ID and sign up at the front desk.
The talk is at 3:00pm, in the open area of CDS; see instructions above.
Spin-glass energy landscapes: theoretical results
We will cover the following paper:
Random Matrices and complexity of Spin Glasses
The talk is located at 719 Broadway (intersection with Washington Place), 12th floor, in the large conference room. The talk is at 3:00pm.
Scalable Bayesian Optimization Using Deep Neural Networks
Bayesian optimization has been demonstrated as an effective methodology for the global optimization of functions with expensive evaluations. Its strategy relies on querying a distribution over functions defined by a relatively cheap surrogate model. Accurately modeling this distribution over functions is critical to the effectiveness of Bayesian optimization, and the distribution is typically modeled with Gaussian processes (GPs). However, since GP inference scales cubically with the number of observations, it has been challenging to handle objectives whose optimization requires a large number of evaluations and, as a consequence, to massively parallelize the optimization.
In this talk, I will discuss recent work with my collaborators using neural networks as an alternative to Gaussian processes to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of observations rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we use to rapidly search over large spaces of models. We achieve state-of-the-art results on benchmark object recognition tasks using convolutional neural networks, and on image caption generation using multimodal neural language models.
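The linear scaling mentioned in the abstract can be illustrated with a minimal sketch of Bayesian linear regression on a fixed set of basis functions. This is not the speaker's implementation: the radial basis features below are a hypothetical stand-in for the last hidden layer of a trained network, and the names `features`, `fit_blr`, `predict`, as well as the precisions `alpha` and `beta`, are assumptions chosen for illustration.

```python
import numpy as np

def features(x, centers, width=0.3):
    """Hypothetical fixed basis: RBF bumps plus a bias column.
    Stands in for the output of a network's last hidden layer."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    phi = np.exp(-0.5 * ((x - centers.reshape(1, -1)) / width) ** 2)
    return np.hstack([phi, np.ones((x.shape[0], 1))])

def fit_blr(Phi, y, alpha=1.0, beta=100.0):
    """Gaussian posterior over weights for y = Phi @ w + noise.
    alpha: prior precision on w, beta: noise precision (both assumed known)."""
    D = Phi.shape[1]
    # Posterior precision is D x D; building it costs O(N * D^2), linear in N.
    A = alpha * np.eye(D) + beta * Phi.T @ Phi
    # Solving the D x D system costs O(D^3), independent of N.
    mean = beta * np.linalg.solve(A, Phi.T @ y)
    return mean, A

def predict(Phi_new, mean, A, beta=100.0):
    """Predictive mean and variance at new inputs."""
    mu = Phi_new @ mean
    # Per-point quadratic form phi^T A^{-1} phi, plus the noise variance.
    var = 1.0 / beta + np.sum(Phi_new * np.linalg.solve(A, Phi_new.T).T, axis=1)
    return mu, var

# Toy data (assumed for illustration): noisy samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
centers = np.linspace(-1.0, 1.0, 10)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = np.sin(3.0 * x_train) + 0.1 * rng.standard_normal(200)

mean, A = fit_blr(features(x_train, centers), y_train)
x_test = np.array([0.0, 0.5, 3.0])  # the last point lies far outside the data
mu, var = predict(features(x_test, centers), mean, A)
```

With N observations and D basis functions, the cost is dominated by forming the D x D matrix, so it grows linearly in N, whereas exact GP regression requires factorizing an N x N kernel matrix. The predictive variance is what an acquisition function would consume; in this sketch it grows for the test point far from the training data.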
Smart Cars for Safe Pedestrians
One of the most significant large-scale deployments of intelligent systems in our daily life nowadays involves driver assistance in smart cars. Accident statistics show that roughly one quarter of all traffic fatalities world-wide involve vulnerable road users (pedestrians, bicyclists); most accidents occur in an urban setting. Devising an effective driver assistance system for vulnerable road users has long been impeded, however, by the “perception bottleneck”, i.e. not being able to detect and localize vulnerable road users sufficiently accurately. The problem is challenging due to the large variation in object appearance, the dynamic and cluttered urban backgrounds, and the potentially irregular object motion. Topping these off are stringent performance criteria and real-time constraints. I give an overview of the remarkable computer vision progress that has been achieved in this area and discuss the main enablers: the algorithms, the data, the hardware and the tests. Daimler has recently introduced an advanced set of driver assistance functions in its Mercedes-Benz 2013-2014 S-, E-, and C-Class models, termed “Intelligent Drive”, using stereo vision. It includes a pedestrian safety component which facilitates fully automatic emergency braking; the system works day and night. I discuss “Intelligent Drive” and future research directions, on the road towards accident-free driving.
Bio
Dariu M. Gavrila received the PhD degree in computer science from the University of Maryland at College Park, USA, in 1996. Since 1997, he has been with Daimler R&D in Ulm, Germany, where he is currently a Principal Scientist. In 2003, he was further appointed professor at the University of Amsterdam, chairing the area of Intelligent Perception Systems (part time). Over the past 15 years, Prof. Gavrila has focused on visual systems for detecting humans and their activity, with application to intelligent vehicles, smart surveillance and social robotics. He led the multi-year pedestrian detection research effort at Daimler, which materialized in the Mercedes-Benz S-, E-, and C-Class models (2013-2014). He is frequently cited in the scientific literature and he received the I/O 2007 Award from the Netherlands Organization for Scientific Research (NWO) as well as several conference paper awards. His personal Web site is www.gavrila.net.
Part I: Transformation Pursuit for Image Classification
In part I, I present a simple and efficient algorithm – Image Transformation Pursuit (ITP) – that performs automatic selection of relevant transformations for virtual example generation in order to enforce transformation-invariance in visual recognition architectures. We report impressive performance gains on two public visual recognition benchmarks: the CUB dataset of bird images, and the ImageNet2010 challenge dataset.
In part II, I present a new approach for large-displacement optical flow estimation, called DeepFlow. The approach blends a matching algorithm, DeepMatching, inspired by convolutional nets, with variational energy minimization. The resulting algorithm shows competitive performance on public optical flow benchmarks, and sets a new state-of-the-art on the MPI-Sintel benchmark dataset.
Title: Efficient training of structured SVMs via soft constraints.
Abstract: Structured output prediction is a powerful framework for jointly predicting interdependent output labels. Learning the parameters of structured predictors is a central task in machine learning applications. However, training the model from data often becomes computationally expensive. Several methods have been proposed to exploit the model structure, or decomposition, in order to obtain efficient training algorithms. In particular, methods based on linear programming relaxation, or dual decomposition, decompose the prediction task into multiple simpler prediction tasks and enforce agreement between overlapping predictions. In this work we observe that relaxing these agreement constraints and replacing them with soft constraints yields a much easier optimization problem. Based on this insight we propose an alternative training objective, analyze its theoretical properties, and derive an algorithm for its optimization. Our method, based on the Frank-Wolfe algorithm, achieves significant speedups over existing state-of-the-art methods without hurting prediction accuracy.
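For readers unfamiliar with the Frank-Wolfe algorithm mentioned in the abstract, the following is a minimal generic sketch on a toy problem. It is not the training objective or algorithm from the talk: the quadratic objective, the probability-simplex constraint, the target vector `b`, and the function name `frank_wolfe_simplex` are all assumptions chosen for illustration. The point is only that each iteration needs a linear minimization over the feasible set, with no projection step.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_iters=1000):
    """Generic Frank-Wolfe over the probability simplex.
    grad: callable returning the gradient of a smooth convex objective at x."""
    x = np.array(x0, dtype=float)
    for t in range(n_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, a linear function
        # is minimized at the vertex with the smallest gradient coordinate.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard diminishing step size; the iterate stays a convex
        # combination of vertices, so it remains feasible.
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Toy objective (assumed): f(x) = 0.5 * ||x - b||^2 with b inside the simplex,
# so the constrained minimizer is b itself.
b = np.array([0.1, 0.6, 0.3])
x_fw = frank_wolfe_simplex(lambda x: x - b, np.array([1.0, 0.0, 0.0]))
```

In structured prediction, the analogous linear minimization step typically corresponds to a decoding (MAP inference) call, which is one reason Frank-Wolfe variants are attractive for training structured models.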