2020 AI Research Highlights: Learning Frameworks (part 7)

2020 has been an exciting time for DL frameworks and the AI stack. We have seen more consolidation of frameworks into domain-specific platforms such as NVIDIA Omniverse and NVIDIA Clara. We have also seen better abstractions in the AI stack, such as PyTorch Lightning, that help democratize AI and enable rapid prototyping and testing.

Below are some frameworks that my team at NVIDIA has been involved in building.

This is part of the blog series on 2020 research highlights. You can read the other posts for research highlights on generalizable AI (part 1), handling distributional shifts (part 2), optimization for deep learning (part 3), AI4science (part 4), controllable generation (part 5), and learning and control (part 6).

Announcing TensorLy-Torch

TensorLy-Torch is a PyTorch-only library that builds on top of TensorLy and provides out-of-the-box tensor layers to replace matrix layers in any neural network. Link

  • Tensorize all layers of a neural network: this includes factorized convolutions, fully-connected layers, and more!
  • Initialization: initializing tensor decompositions can be tricky since the default parameters for matrix layers are not optimal for factorized ones. We provide good defaults for initialization in our tltorch.init module. Alternatively, you can initialize the factorization to fit a pretrained matrix layer.
  • Tensor hooks: you can easily augment your architectures with our built-in hooks. Robustify your network with Tensor Dropout. Automatically select the rank end-to-end with L1 Regularization.
  • Methods and model zoo: we are always adding more methods and models to make it easy to compare the performance of various deep tensor-based methods!
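To make the idea of replacing a matrix layer with a factorized one concrete, here is a minimal plain-Python sketch of the simplest case, a low-rank factorization W = U V of a fully-connected layer. It is not the tltorch API: the helper names (`matmul`, `dense_param_count`, `factorized_param_count`) and the toy matrices are assumptions made up for illustration, and real tensor layers use richer decompositions than this two-factor one.

```python
# Illustrative sketch only: a low-rank factorized linear layer in plain Python.
# This is NOT the tltorch API; every name below is hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def dense_param_count(n_in, n_out):
    """Number of weights in an unfactorized n_in x n_out matrix layer."""
    return n_in * n_out


def factorized_param_count(n_in, n_out, rank):
    """Number of weights when W is stored as U (n_in x rank) times V (rank x n_out)."""
    return rank * (n_in + n_out)


# Rank-2 factors of a 4 x 3 weight matrix.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]  # 4 x 2
V = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]                 # 2 x 3
W = matmul(U, V)                                       # dense equivalent, 4 x 3

x = [[1.0, 2.0, 3.0, 4.0]]                             # one input row, 1 x 4

dense_out = matmul(x, W)                  # forward pass through the dense layer
factorized_out = matmul(matmul(x, U), V)  # same result without ever forming W
```

Both forward passes give the same output, but the factorized layer only stores U and V. The saving depends on the rank being small relative to the layer size: under these assumptions a 512 x 512 layer at rank 16 needs `factorized_param_count(512, 512, 16)` weights instead of `dense_param_count(512, 512)`, whereas the tiny 4 x 3 toy layer above would not save anything.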

Minkowski Engine

Minkowski Engine is an auto-differentiation library for sparse tensors. It supports all standard neural network layers, such as convolution, pooling, and broadcasting operations, for sparse tensors. Popular applications include 3D and higher-dimensional vision problems such as semantic segmentation, reconstruction, and detection. Link

  • Unlimited high-dimensional sparse tensor support
  • All standard neural network layers (Convolution, Pooling, Broadcast, etc.)
  • Dynamic computation graph
  • Custom kernel shapes
  • Multi-GPU training
  • Multi-threaded kernel map
  • Multi-threaded compilation
  • Highly-optimized GPU kernels
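As a rough illustration of why sparse tensors matter for 3D data, the sketch below stores a tensor as a mapping from occupied coordinates to features and applies a convolution-style layer only at those coordinates. This is not the Minkowski Engine API; the function `sparse_conv`, the dictionary representation, and the choice to produce outputs only at occupied input coordinates are simplifying assumptions for illustration.

```python
# Illustrative sketch only; this is NOT the Minkowski Engine API.
# A sparse tensor is stored as {coordinate: feature}; unlisted coordinates are empty.

from itertools import product


def sparse_conv(sparse_in, kernel):
    """Apply a single-channel convolution-style layer to a sparse tensor.

    `kernel` maps integer offsets (tuples) to weights, so any kernel shape
    can be expressed by listing only the offsets it covers.
    Outputs are computed only at coordinates occupied in the input, so the
    cost scales with the number of occupied sites, not the grid volume.
    """
    out = {}
    for coord in sparse_in:
        total = 0.0
        for offset, weight in kernel.items():
            neighbor = tuple(c + o for c, o in zip(coord, offset))
            if neighbor in sparse_in:  # empty space is skipped entirely
                total += weight * sparse_in[neighbor]
        out[coord] = total
    return out


# Three occupied voxels in a 3D grid of arbitrary extent.
points = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (5, 5, 5): 4.0}

# A box filter: every offset in the 3 x 3 x 3 neighborhood has weight 1.
box = {offset: 1.0 for offset in product((-1, 0, 1), repeat=3)}

result = sparse_conv(points, box)
```

The two adjacent voxels see each other and the isolated voxel sees only itself, and no work is done for the empty cells in between. The same function works unchanged for 4D or higher coordinates, since the dimension is implied by the length of the coordinate tuples.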

End-to-end Reinforcement Learning on GPUs with NVIDIA Isaac Gym

We are excited about the preview release of Isaac Gym – NVIDIA’s physics simulation environment for reinforcement learning research that dramatically speeds up training. Its environments are physically valid, allowing for efficient sim-to-real transfer. They include a robotic arm, legged robots, deformable objects, and humanoids. Blog

Stay tuned for more in 2021! Here’s looking forward to exciting developments in AI in the new year.

2020 AI Research Highlights: Learning and Control (part 6)

Embodied AI is the union of “mind” (AI) and “body” (robotics). To achieve this, we need robust learning methods that can be embedded into control systems with safety and stability guarantees. Many of our recent works are advancing these goals on both theoretical and practical fronts. 

This is part of the blog series on 2020 research highlights. You can read the other posts for research highlights on generalizable AI (part 1), handling distributional shifts (part 2), optimization for deep learning (part 3), AI4science (part 4), and controllable generation (part 5).

Safe Exploration and Planning

My journey into this area of learning and control started with the neural lander. We used deep learning to learn the aerodynamic ground effects in drones. This led to improved landing speed without sacrificing stability requirements. In a subsequent work, we aimed to automate the collection of drone data while staying safe. 

Safe landing
Aggressive landing

We employed robust regression methods whose uncertainty bounds guarantee safety even outside of the training domain. This allows the drone to progressively land faster while maintaining safety (i.e., not crashing). Our method trains a density-ratio estimator that accurately predicts the ability to maintain safety at higher speeds. This is based on the principle of adversarial risk minimization, which has also shown gains in sim-to-real generalization in computer vision (part 2).

We evaluated our method on a simulator built with data collected from real drones. Our method outperforms the popular Gaussian process (GP) method for uncertainty quantification and leads to faster exploration while maintaining safety. This is because GPs are brittle in high dimensions due to a poor choice of kernels/priors.

The ability to explore safely can now be combined with downstream trajectory planning methods in control. We propagate the uncertainty bounds from robust regression and pose them as chance constraints for the planning methods. Thus, we can compute a pool of safe and information-rich trajectories.

Learning methods with accurate uncertainty bounds enable safe trajectory planning
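A toy version of turning uncertainty bounds into a pool of safe, information-rich trajectories might look like the sketch below. It is not the planner from the paper: the trajectory names, the numbers, the safety limit, and the use of total uncertainty as a stand-in for information gain are all assumptions for illustration. Each step is assumed to carry a predicted value and a bound such that the true value lies within prediction ± bound with high probability, so checking the upper bound against the limit is one simple way to respect a chance constraint.

```python
# Illustrative sketch; names and numbers are hypothetical, not from the paper.

def is_safe(trajectory, limit):
    """Check the constraint against the upper uncertainty bound at every step.

    Each step is (predicted_value, bound), where the true value is assumed to
    lie within predicted_value +/- bound with high probability.
    """
    return all(pred + bound <= limit for pred, bound in trajectory)


def information(trajectory):
    """Total uncertainty along the trajectory, a crude proxy for information gain."""
    return sum(bound for _, bound in trajectory)


def safe_pool(candidates, limit):
    """Return the names of safe trajectories, most informative first."""
    safe = [(name, traj) for name, traj in candidates.items() if is_safe(traj, limit)]
    safe.sort(key=lambda item: information(item[1]), reverse=True)
    return [name for name, _ in safe]


candidates = {
    "cautious":   [(0.2, 0.1), (0.3, 0.1)],  # well inside the limit, little to learn
    "exploring":  [(0.4, 0.3), (0.5, 0.4)],  # safe even at the upper bound
    "aggressive": [(0.6, 0.3), (0.8, 0.5)],  # upper bound at step 2 exceeds the limit
}

pool = safe_pool(candidates, limit=1.0)
```

Here the aggressive trajectory is rejected because its worst case violates the limit, and the remaining two are ranked so that the one visiting more uncertain states comes first. Tighter uncertainty bounds directly enlarge the pool, which is why the accuracy of the bounds matters for exploration speed.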

We applied the episodic learning framework to a robotic spacecraft model to explore the state space and learn the friction under collision constraints. Using robust regression models, we show a significant reduction in both the variance of the learned model's predictions and the number of collisions.

Reinforcement learning in control systems

Analyzing RL in control systems is challenging for the following reasons: (1) the state and action spaces are continuous, (2) there are safety and stability requirements, and (3) the state is only partially observable.

A canonical setting is linear quadratic Gaussian (LQG) control, which involves linear dynamics and a linear transformation of the hidden state that yields observations with Gaussian noise. LQG appears deceptively simple, but is notoriously challenging to analyze.
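For readers who prefer code to prose, here is a minimal scalar simulation of the LQG setting. It is a sketch under stated assumptions rather than the setup from the papers: the system is one-dimensional, the process and observation noise share one standard deviation, the quadratic cost is placed on the observation and the action, and the function name `simulate_lqg`, its parameters, and the example policy are all hypothetical.

```python
import random


def simulate_lqg(a, b, c, policy, steps, noise_std=0.1, q=1.0, r=1.0, seed=0):
    """Roll out a scalar LQG system and return (observations, actions, total cost).

    Hidden state:  x[t+1] = a * x[t] + b * u[t] + w[t],  w ~ N(0, noise_std^2)
    Observation:   y[t]   = c * x[t] + z[t],             z ~ N(0, noise_std^2)
    Cost per step: q * y[t]^2 + r * u[t]^2   (an assumed form of quadratic cost)
    The controller sees only the observations, never the hidden state x.
    """
    rng = random.Random(seed)
    x = 0.0
    ys, us, cost = [], [], 0.0
    for _ in range(steps):
        y = c * x + rng.gauss(0.0, noise_std)   # noisy linear observation
        u = policy(ys + [y])                    # action from observation history
        cost += q * y * y + r * u * u
        ys.append(y)
        us.append(u)
        x = a * x + b * u + rng.gauss(0.0, noise_std)  # linear dynamics with noise
    return ys, us, cost


# A simple output-feedback policy: push against the latest observation.
def proportional_policy(observations):
    return -0.5 * observations[-1]


ys, us, cost = simulate_lqg(a=0.9, b=1.0, c=1.0,
                            policy=proportional_policy, steps=50)
```

Even in this toy version, the learner's difficulty is visible: the parameters a, b, and c are unknown, the state x is never observed, and every action taken to reduce cost also shapes the data available for estimating the model.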

Previous methods focused on open-loop control, which uses random excitation (i.e., random actions) to collect measurements for model estimation. However, this yields a regret of T^0.66, where T is the number of time steps, which is not optimal. Paper

Our method is the first closed-loop RL method with guaranteed regret bounds. In closed-loop control, the past measurements are all correlated with the control actions, which makes it challenging to estimate the model parameters. We utilize tools from classical control theory (the predictive form) to guarantee consistent estimation of the model parameters. This yields an improved regret bound of T^0.5. Paper

Surprisingly, we can do better in terms of the regret bound. We showed that combining online learning with episodic updates can lead to logarithmic regret. Intuitively, we decouple the adaptive learning of model parameters (episodic updates) from the online learning of the control policy. This combination allows us to achieve fast learning (with low regret) in closed-loop control. Paper
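To get a feel for how much these three regret rates differ, the short sketch below evaluates T^0.66, T^0.5, and log T at a few horizons. It ignores all constants and logarithmic factors hidden in the actual bounds, so the numbers only indicate growth trends; the function name `regret_rates` and the chosen horizons are assumptions for illustration.

```python
import math


def regret_rates(T):
    """Return the three growth rates at horizon T, with all constants set to 1."""
    return {
        "open_loop_T^0.66": T ** 0.66,
        "closed_loop_T^0.5": T ** 0.5,
        "logarithmic": math.log(T),
    }


for T in (10**2, 10**4, 10**6):
    rates = regret_rates(T)
    print(T, {name: round(value, 1) for name, value in rates.items()})
```

The gap widens quickly with the horizon: the polynomial rates keep growing by a constant factor each time T is multiplied, while the logarithmic rate only increases by an additive amount.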