NeurIPS 2019 - A recap
Fanny Riols
January 14 · 10 min read

Introduction

This year, NeurIPS went big! With 13,000 people registered, the conference took over the Vancouver Convention Center for 7 days, including an industry day (Expo), 9 tutorials, 51 workshops and 4 affinity workshops, 200 paper presentations, 1,428 posters, 16 competitions, 28 demonstrations, 15 social events, and 79 meetups and networking sessions.

We won’t be able to cover everything in this article, but let’s dig deeper into some of the topics that were presented and what has been trending lately.

Diversity, Inclusion and Well-Being

There were 4 affinity workshops: Black in AI (sponsored by EAI), Women in ML (sponsored by EAI), LatinX in AI and Queer in AI. These are technical workshops that promote underrepresented minorities, and everybody interested in machine learning is welcome to attend. They are an opportunity to learn, meet, network and exchange ideas.

Even though Queer in AI speakers’ names were publicly available online, the Queer in AI organizers gave red and blue stickers to poster presenters and attendees to indicate whether they were comfortable being photographed or preferred to keep their identity private.

Queer in AI identity protection options.

Unfortunately, as last year, too many researchers had their visas denied and were not able to attend the conference. This year, the NeurIPS organizers reached out to the Canadian government to try to make the visa process smoother, providing a list of the people invited to attend the conference, but it wasn’t enough. This issue significantly reduced the number of African voices at the conference: more than a third of the people invited from abroad to attend the Black in AI workshop were denied the necessary travel documents.

There were also many social events throughout the week, some of which encouraged minorities to attend, such as {Dis}Ability in AI and Women in AI Ignite.

Finally, a social event Element AI sponsored and co-organized was “Well-Being in ML”. It was an opportunity to make our community mindful of positive well-being practices at the very conference that epitomizes its science. The end goal is to provide the community with tools to maintain their well-being and mental health, and to encourage them to speak openly about these issues. More than 200 people participated!

Well-being in ML

Fairness, Interpretability, Explainability and Ethics

Regulators and legislators, and more broadly the general public, do not understand all the subtle nuances of AI. Even though the existence of several standards organizations, AI ethics frameworks and definitions of fairness might seem reassuring, it actually makes it even harder for the general public to get a good understanding of what AI is and what the risks of AI systems are. One of the first steps to building trust is being able to explain your AI models. This is why the Fairness, Interpretability, Explainability and Ethics research areas matter so much and are becoming more prominent, as seen in workshops such as Minding the Gap: Between Fairness and Ethics, the Workshop on Human-Centric Machine Learning, and Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy.

Several interesting papers addressed these subjects.

Also, Facebook released Captum, a model interpretability and understanding library for PyTorch. The library offers attribution algorithms to interpret AI models.

Captum insights.
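
To give a feel for the library, here is a minimal sketch of computing feature attributions with Captum’s Integrated Gradients; the toy model and input are made up for illustration:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# A toy model standing in for whatever network you want to interpret.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# A single input example (batch of 1, 4 features).
x = torch.rand(1, 4, requires_grad=True)

# Integrated Gradients attributes the prediction for a target class
# back to the input features.
ig = IntegratedGradients(model)
attributions, delta = ig.attribute(x, target=1, return_convergence_delta=True)

print(attributions)  # per-feature attribution scores
print(delta)         # convergence error of the approximation
```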

Data Privacy, Federated Learning and Encryption

Data privacy was a recurring theme, in particular Federated Learning (Facebook/PyTorch, Apple, doc.ai) and differential privacy.

Standard ML approaches require centralizing the training data on a machine or in a datacenter. However, privacy and security have become critical concerns in recent years, particularly as companies and organizations increasingly collect detailed information about their products and users. Federated Learning (FL) is an ML approach that enables edge devices or servers holding local data samples to collaboratively learn a shared prediction model while keeping all the training data locally, decoupling the ability to do ML from the need to centralize the training data.

In this setting, many clients collaboratively train a model under the orchestration of a central server while keeping the training data decentralized. The common technical questions are how to perform general computation over decentralized data and how such computation can be combined with other research areas, such as differential privacy, secure multi-party computation, computational efficiency, coding theory, etc.
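
To make the setting concrete, here is a minimal sketch of the federated averaging pattern in plain PyTorch; the clients, model and number of rounds are made up for illustration, and this is not any particular framework’s API:

```python
import copy
import torch
import torch.nn as nn

def client_update(global_model, data, targets, epochs=1, lr=0.01):
    """Train a copy of the global model on one client's local data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        opt.step()
    return model.state_dict()

def federated_average(states):
    """Average the client model weights on the server (FedAvg-style)."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        for state in states[1:]:
            avg[key] += state[key]
        avg[key] /= len(states)
    return avg

# Toy setup: 3 clients, each with private local data that never leaves them.
global_model = nn.Linear(4, 1)
clients = [(torch.rand(16, 4), torch.rand(16, 1)) for _ in range(3)]

for _ in range(5):  # communication rounds
    client_states = [client_update(global_model, x, y) for x, y in clients]
    global_model.load_state_dict(federated_average(client_states))
```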

Several papers and one workshop were dedicated to this field.

Also, PyTorch has a framework for Privacy Preserving Machine Learning called CrypTen.
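
As a rough idea of what working with CrypTen looks like, here is a tiny sketch of encrypting tensors and computing on them; the tensors and the operation are made up, so see the CrypTen documentation for real usage:

```python
import torch
import crypten

crypten.init()

# Wrap plain tensors into encrypted (secret-shared) CrypTensors.
a = crypten.cryptensor(torch.tensor([1.0, 2.0, 3.0]))
b = crypten.cryptensor(torch.tensor([4.0, 5.0, 6.0]))

# Arithmetic happens on the encrypted values.
c = a * b + 1.0

# Only an explicit decryption step reveals the plaintext result.
print(c.get_plain_text())  # approx. tensor([ 5., 11., 19.]), up to fixed-point precision
```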

Reinforcement Learning

EAI researchers published two papers on reinforcement learning. One of them is Learning Reward Machines for Partially Observable Reinforcement Learning, in which the authors show that the user no longer has to specify the Reward Machine: it can be learned from experience. In the other paper, Real-Time Reinforcement Learning, a new framework is introduced in which states and actions evolve simultaneously, and the authors show how it relates to the classical MDP formulation. Their method outperforms the existing state-of-the-art continuous control algorithm, Soft Actor-Critic, in both real-time and non-real-time settings.

The Successor Representation (SR) was originally introduced by Dayan in 1993 as a representation that defines state generalization by the similarity of successor states. Due to its relation to multi-task learning, it is currently being explored further. Two interesting papers on the topic are A neurally plausible model learns successor representations in partially observable environments and Better Transfer Learning with Inferred Successor Maps.
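
For readers unfamiliar with the idea, here is a minimal sketch of the tabular successor representation and its TD-style update, where M[s, s'] estimates the expected discounted number of future visits to s' starting from s; the toy transitions are made up for illustration:

```python
import numpy as np

n_states, gamma, alpha = 5, 0.95, 0.1

# M[s, s'] estimates the discounted expected number of future visits
# to s' when starting from s (the successor representation).
M = np.zeros((n_states, n_states))

def sr_td_update(s, s_next):
    """One TD update of the successor representation after observing s -> s_next."""
    one_hot = np.zeros(n_states)
    one_hot[s] = 1.0
    target = one_hot + gamma * M[s_next]
    M[s] += alpha * (target - M[s])

# Toy experience: a small loop over states 0 -> 1 -> 2 -> 0 -> ...
for step in range(1000):
    s = step % 3
    s_next = (step + 1) % 3
    sr_td_update(s, s_next)

# Given a reward vector r over states, values follow as V = M @ r,
# which is what makes the SR convenient for transfer across tasks.
r = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
print(M @ r)
```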

On a more applied side, another challenge in RL is to correctly assign credit for the reward received to earlier behaviour. In Hindsight Credit Assignment, the authors directly model the probability that a reward was received, as a distribution estimated from data.

Some papers were published that combine the strengths of planning and RL methods to effectively solve long-horizon, sparse-reward tasks with high-dimensional observations. In Search on the Replay Buffer: Bridging Planning and Reinforcement Learning, the authors use the difference between the Q-values of states stored in a buffer as an estimate of the distance between them, and then run a graph search to come up with a plan. In the figure below, they are planning over images for visual navigation.

Visual navigation.
Visual Navigation: Given an initial state and goal state, their method automatically finds a sequence of intermediate waypoints. The agent then follows those waypoints to reach the goal.
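
Here is a rough sketch of the search-on-the-replay-buffer idea under simplifying assumptions: the distance function standing in for a quantity derived from goal-conditioned Q-values is made up (a toy Euclidean distance), and planning is a shortest-path search over the resulting graph with networkx:

```python
import itertools
import networkx as nx

# Assumed stand-in for a distance derived from a goal-conditioned Q-function;
# here it is just a toy Euclidean distance between 2-D states.
def distance(s1, s2):
    return sum((a - b) ** 2 for a, b in zip(s1, s2)) ** 0.5

buffer = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2)]  # states from the replay buffer
max_edge = 1.5                                      # prune edges that are "too far"

# Build a graph whose nodes are buffered states and whose edges connect
# states the agent is predicted to be able to reach from one another.
graph = nx.Graph()
graph.add_nodes_from(range(len(buffer)))
for i, j in itertools.combinations(range(len(buffer)), 2):
    d = distance(buffer[i], buffer[j])
    if d <= max_edge:
        graph.add_edge(i, j, weight=d)

# Planning = shortest path from the node closest to the start
# to the node closest to the goal; the path is a sequence of waypoints.
waypoints = nx.shortest_path(graph, source=0, target=len(buffer) - 1, weight="weight")
print([buffer[i] for i in waypoints])
```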

Vision

In the Vision research area, there were interesting advances in object recognition. In the paper Brain-like object recognition with high-performing shallow recurrent ANNs, the authors demonstrate that better anatomical alignment to the brain and high performance on ML as well as neuroscience measures do not have to be in contradiction. They propose the Brain-Score benchmark, which assesses the correspondence of CNN representations against monkey visual system recordings. In Unsupervised learning of object keypoints for perception and control, the authors propose a method that transports visual features from frame to frame through a keypoint bottleneck, which lets them track objects much more efficiently. Also, ObjectNet, a large-scale bias-controlled dataset for pushing the limits of object recognition models, has been released. It is a challenging dataset for object classification, with real-world images showing many different objects from new viewpoints on new backgrounds.

One track EAI researchers have been working on is “few-shot learning”. In Adaptive Cross-Modal Few-Shot Learning, they leverage cross-modal information to enhance metric-based few-shot learning methods, with a mechanism that adaptively combines information from both modalities according to the new image categories to be learned. The improvement in performance is particularly large when the number of shots is very small.
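
A very rough sketch of the adaptive combination idea, assuming per-class visual prototypes and semantic (word-embedding) prototypes of the same dimension; the gating network and tensors below are made up for illustration and are not the paper’s exact architecture:

```python
import torch
import torch.nn as nn

dim, n_classes = 64, 5

# Assumed inputs: visual prototypes (e.g. averaged support features) and
# semantic prototypes (e.g. projected label word embeddings), one per class.
visual_protos = torch.randn(n_classes, dim)
semantic_protos = torch.randn(n_classes, dim)

# A small gating network predicts, per class, how much to trust each modality.
gate = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1), nn.Sigmoid())
lam = gate(semantic_protos)                # shape (n_classes, 1), values in (0, 1)

# Convex combination of the two modalities gives the final class prototypes.
protos = lam * visual_protos + (1.0 - lam) * semantic_protos

# Classification of a query feature is then nearest-prototype, as in ProtoNets.
query = torch.randn(1, dim)
logits = -torch.cdist(query, protos)       # negative distances as logits
print(logits.argmax(dim=1))
```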

Another Vision research track that received a few papers was image segmentation. A Region Mutual Information Loss for Semantic Segmentation was developed to model the dependencies among pixels more simply and efficiently; this method uses a pixel and its neighbours to represent that pixel. The authors achieved large improvements in performance on the PASCAL VOC 2012 and CamVid datasets. The code is available here. In Neural Diffusion Distance for Image Segmentation, the authors propose a differentiable deep architecture, consisting of feature extraction and diffusion distance modules, that computes the diffusion distance on an image through end-to-end training. With the learned diffusion distance, they build a hierarchical image segmentation method that outperforms previous segmentation methods.

Hierarchical image segmentation.
Example of the hierarchical image segmentation results.

There were other interesting papers on reconstructing 3D representations from 2D images, making image feature descriptors invariant to scaling and rotation transformations, reconstructing a scene from a given viewpoint, and more. Also, a new dataset for document intelligence was released: CORD: A Consolidated Receipt Dataset for Post-OCR Parsing.

Time-Dependent Data

The well-known challenge with time-dependent data is forecasting. In Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting, the authors better incorporate local context into the attention mechanism with convolutional self-attention, and improve forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget, using a LogSparse Transformer with only O(L(log L)²) memory cost.
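
To illustrate the sparsity pattern, here is a small sketch of a log-sparse self-attention mask in which each position attends to itself and to positions at exponentially growing distances in the past; this is an illustrative approximation of the pattern, not the paper’s implementation:

```python
import numpy as np

def log_sparse_mask(seq_len):
    """Boolean mask: entry [i, j] is True if position i may attend to position j (j <= i)."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        mask[i, i] = True                  # attend to itself
        offset = 1
        while i - offset >= 0:             # and to cells 1, 2, 4, 8, ... steps back
            mask[i, i - offset] = True
            offset *= 2
    return mask

mask = log_sparse_mask(16)
# Each row allows O(log L) keys, so one attention layer costs O(L log L);
# stacking O(log L) such layers (so any two positions can exchange information)
# gives the O(L(log L)²) total cost mentioned above.
print(mask.sum(axis=1))
```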


A field of interest in time series is uncertainty quantification. In Single-Model Uncertainties for Deep Learning, a modification of the pinball loss function is introduced to estimate aleatoric uncertainty for all quantiles simultaneously, along with a method that trains a classifier on top of a pre-trained model to estimate epistemic uncertainty. In Accurate Uncertainty Estimation and Decomposition in Ensemble Learning, the authors model all the different sources of uncertainty using a Bayesian nonparametric ensemble (BNE), which can be used to design distributional forecasts of ensemble models. Finally, Uncertainty on Asynchronous Time Event Prediction tackles the task of predicting the next event. Two new architectures are presented, in which combining RNNs with either a Gaussian process or a function decomposition expresses the rich temporal evolution of the distribution parameters and naturally captures uncertainty. Their model framework is illustrated in the figure below.

Time dependent data
The model framework: given a new sequence of events s, the model generates pseudo points that describe the temporal evolution of the distribution on the simplex and provide a measure of uncertainty in their predictions.
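
As a small aside on the first of these papers, the standard pinball (quantile) loss that underlies quantile estimation can be written compactly; the sketch below shows the vanilla per-quantile version, not the paper’s modified all-quantile variant:

```python
import torch

def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for quantile level tau in (0, 1)."""
    error = y_true - y_pred
    return torch.mean(torch.maximum(tau * error, (tau - 1.0) * error))

# Example: under-prediction is penalized more heavily for the 0.9 quantile.
y_true = torch.tensor([1.0, 2.0, 3.0])
y_pred = torch.tensor([1.5, 1.5, 1.5])
print(pinball_loss(y_true, y_pred, tau=0.9))
```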

Coming years!

In 2020, NeurIPS will also take place in Vancouver, Canada. The destination for 2021 has been announced: Sydney, Australia.