AI4Science in the limelight: We developed AI methods for simulating complex multi-scale phenomena such as weather and materials with orders-of-magnitude speedups. Our core technique, the Fourier Neural Operator (FNO), was recently featured as a highlight of math and computer science advances in 2021 by Quanta Magazine. It was also featured in the GTC Fall keynote and the IamAI video, released by NVIDIA. In addition, we advanced drug discovery through OrbNet, which predicts quantum-mechanical properties thousands of times faster. Both these techniques were part of publications that were finalists for the Gordon Bell special prize for COVID-19 research.
Quantum ML research taking root: Our foray into quantum research was aided by our research on tensor methods for deep learning, since tensor networks represent quantum systems efficiently. We developed a new algorithm for quantum optimization that halved the quantum resources needed to solve classical optimization problems such as MaxCut, which implies an exponential reduction in the cost of simulating them on GPUs. We partnered with the NVIDIA cuQuantum team to establish a world record for large-scale simulation of a successful, nonlocal quantum optimization algorithm, and we open-sourced TensorLy-Quantum.
Trustworthy AI no longer just a wish list: Forbes predicts that trustworthy AI will be operationalized in the next year. Trustworthy AI has many facets: improving uncertainty calibration, auditing AI models and improving robustness. We improved the robustness of AI models through various approaches: certifying robustness, balancing diversity and hardness of data augmentations, and enhancing robustness of 3D vision. We demonstrated that language models can detect different kinds of social biases without any re-training, when supplied with a small number of labeled examples.
No more supervision: 99% of computer vision teams have had an ML project canceled due to insufficient training data, according to a survey. We have made strides in weak and self-supervised learning (DiscoBox) that are competitive with supervised learning methods. We have also developed efficient controllable generation methods that can do zero-shot composition of attributes. Gartner predicts that by 2024, synthetic data will account for 60% of all data used in AI development. We also developed efficient methods for automatic camera calibration using neural radiance field (NeRF) models.
Transformers transforming vision and language: The Cambrian explosion of transformer architectures continued this year, with a focus on harder tasks and multimodal domains. We developed SegFormer for semantic and panoptic segmentation with SOTA performance and efficiency, which is being used by multiple teams across the company. We enabled linear efficiency in self-attention layers using a long-short decomposition and the adaptive Fourier neural operator.
Bridging the gap with biological intelligence: Humans are capable of zero-shot generalization, and can handle long-tailed distributions. We developed efficient controllable generation methods that can do zero-shot composition of attributes. We showed that simple memory-recall strategies enable efficient long-tailed object detection. Such capabilities have been framed as formative AI intellect by Gartner.
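To give a feel for why Fourier-domain mixing is cheaper than pairwise attention, here is a minimal sketch, assuming a single real-valued channel and a sequence length that is a power of two. It is not the adaptive Fourier neural operator itself: `spectral_mix` and its `weights` argument are hypothetical names, and the weights here are fixed numbers rather than learned, input-dependent ones. Every token still influences every other token, but through an FFT rather than an n-by-n attention matrix.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT using the identity ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in y]


def spectral_mix(tokens, weights):
    """Scale the lowest len(weights) frequency modes and drop the rest.

    Assumes real-valued tokens and len(weights) <= len(tokens) // 2.
    """
    n = len(tokens)
    spectrum = fft(tokens)
    mixed = [0j] * n
    for k, w in enumerate(weights):
        mixed[k] = w * spectrum[k]
        if k > 0:
            # Mirror the mode so that the inverse transform stays real.
            mixed[n - k] = mixed[k].conjugate()
    return [v.real for v in ifft(mixed)]


# A sequence made of one low-frequency wave; doubling mode 1 doubles the wave.
tokens = [math.cos(2 * math.pi * t / 8) for t in range(8)]
mixed_tokens = spectral_mix(tokens, weights=[1.0, 2.0])
```

The global mixing costs on the order of n log n operations for n tokens, compared with n squared for a dense attention matrix.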
Hardware efficiency: We co-designed a quantization scheme and an AI training method to obtain both energy efficiency and accuracy. The logarithmic number system (LNS) provides a high dynamic range, but its non-linear quantization gaps make it challenging to train with standard methods such as SGD. Instead, we employed multiplicative updates that can train AI models directly in LNS using just 8 bits, without requiring any full-precision copies. This resulted in no accuracy loss and a 90% reduction in energy.
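The reason multiplicative updates suit LNS is that multiplying a weight by a power of two is just an addition on its stored exponent. Below is a minimal sketch of that idea on a one-weight toy problem, not the training method described above: the update rule w <- w * 2^(-lr * grad * sign(w)), the 3 fractional exponent bits in `FRAC_BITS`, and all function names are assumptions made for illustration.

```python
FRAC_BITS = 3  # hypothetical exponent resolution: steps of 1/8 in log2 space


def quantize_exponent(exponent, frac_bits=FRAC_BITS):
    """Round a log2 exponent to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(exponent * scale) / scale


def lns_value(sign, exponent):
    """Decode an LNS number stored as (sign, log2 magnitude)."""
    return sign * 2.0 ** exponent


def multiplicative_step(sign, exponent, grad, lr):
    """Apply w <- w * 2**(-lr * grad * sign(w)) directly on the exponent.

    The weight is never materialized in full precision between steps:
    only the sign and the quantized exponent are kept.
    """
    return sign, quantize_exponent(exponent - lr * grad * sign)


def train(target=3.0, steps=50, lr=0.2):
    """Minimize (w - target)**2 starting from w = 1, entirely in LNS."""
    sign, exponent = 1, 0.0
    for _ in range(steps):
        grad = 2.0 * (lns_value(sign, exponent) - target)
        sign, exponent = multiplicative_step(sign, exponent, grad, lr)
    return sign, exponent


sign, exponent = train()
final_weight = lns_value(sign, exponent)
```

Two properties of this toy version are worth noting: a multiplicative update can never flip a weight's sign or reach exactly zero, and with deterministic rounding the weight stops moving once a step becomes smaller than half a grid spacing.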
Hub of collaborations: Grateful to be supported by an amazing network of collaborators across multiple institutions in a wide array of domains. We are excited about the announcement of Earth-2 that will enable NVIDIA to partner with researchers in climate science globally.
Personal touch: 2021 has been a greatly fulfilling year on both personal and professional fronts. I spent the beginning of the year in Hawaii, a dream playground, where I got to swim every evening into the sunset after my meetings. I started weight training and was surprised at being able to lift my own body weight! Focusing on my physical and spiritual health has greatly enhanced my creativity and productivity. During the latter half of the year, I got to attend some events in person. A highlight was a trip to CERN where I got to tour the particle accelerator and the antimatter factory; my interview, Stealing theorist’s lunch, was published in the CERN Courier magazine. I got to participate in an unusual documentary that featured our fishing trip at Jackson Hole, where we collected snapshots of fly-fishing casts and trained AI to predict good casts. I also participated in the latenightIT show, hosted by the Emmy-nominated Baratunde Thurston. Here’s looking forward to new adventures in 2022!
2020 has been an exciting time for DL frameworks and the AI stack. We have seen more consolidation of frameworks into domain-specific platforms such as NVIDIA Omniverse and NVIDIA Clara. We have also seen better abstractions in the AI stack, such as PyTorch Lightning, that help democratize AI and enable rapid prototyping and testing.
Below are some frameworks that my team at NVIDIA has been involved in building.
TensorLy-Torch is a PyTorch-only library that builds on top of TensorLy and provides out-of-the-box tensor layers to replace matrix layers in any neural network.
Tensorize all layers of a neural network: this includes factorized convolutions, fully-connected layers, and more!
Initialization: initializing tensor decompositions can be tricky since default parameters for matrix layers are not optimal. We provide good defaults to initialize using our tltorch.init module. Alternatively, you can initialize to fit the pretrained matrix layer.
Tensor hooks: you can easily augment your architectures with our built-in hooks. Robustify your network with Tensor Dropout. Automatically select the rank end-to-end with L1 Regularization.
Methods and model zoo: we are always adding more methods and models to make it easy to compare the performance of various deep tensor-based methods!
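To make the idea of replacing a matrix layer concrete, here is a minimal plain-Python sketch of the simplest case, a two-factor low-rank linear layer. It deliberately does not use TensorLy-Torch: `LowRankLinear` is a hypothetical class, and the initialization rule is only an assumption chosen to show why defaults meant for dense matrices need adjusting when the weight is a product of factors.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LowRankLinear:
    """Hypothetical factorized layer: W (out x in) is stored as U (out x r) @ V (r x in)."""

    def __init__(self, in_features, out_features, rank, seed=0):
        rng = random.Random(seed)
        # Each entry of U @ V is a sum of `rank` products of independent
        # zero-mean entries, so its variance is rank * sigma**4. Solve for
        # sigma so the product has the variance a dense layer would use
        # (1 / in_features is assumed here as that target).
        target_var = 1.0 / in_features
        sigma = (target_var / rank) ** 0.25
        self.V = [[rng.gauss(0.0, sigma) for _ in range(in_features)]
                  for _ in range(rank)]
        self.U = [[rng.gauss(0.0, sigma) for _ in range(rank)]
                  for _ in range(out_features)]

    def forward(self, x):
        # Apply the small factors in sequence; the dense W is never built.
        return matvec(self.U, matvec(self.V, x))

    def num_parameters(self):
        return sum(len(row) for row in self.U) + sum(len(row) for row in self.V)

    def dense_weight(self):
        """Reconstruct W = U @ V, only for checking the factorized forward pass."""
        columns = list(zip(*self.V))
        return [[sum(u * v for u, v in zip(u_row, col)) for col in columns]
                for u_row in self.U]


layer = LowRankLinear(in_features=64, out_features=32, rank=4)
x = [1.0] * 64
y = layer.forward(x)
```

With these sizes the factorized layer stores 4 * (64 + 32) = 384 numbers instead of the 64 * 32 = 2048 of a dense layer. Tensor factorizations generalize this two-factor case to more factors and higher-order structure.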
Minkowski Engine is an auto-differentiation library for sparse tensors. It supports all standard neural network layers such as convolution, pooling, and broadcasting operations for sparse tensors. Popular use cases include 3D and higher-dimensional vision problems such as semantic segmentation, reconstruction, and detection.
Unlimited high-dimensional sparse tensor support
All standard neural network layers (Convolution, Pooling, Broadcast, etc.)
Dynamic computation graph
Custom kernel shapes
Multi-threaded kernel map
Highly-optimized GPU kernels
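The features above revolve around computing only where data exists. As a rough illustration, here is a minimal plain-Python sketch of a sparse convolution driven by a kernel map. It does not use Minkowski Engine, and `build_kernel_map`, `sparse_conv`, and the scalar-feature simplification are all assumptions made for the example.

```python
def build_kernel_map(coords, offsets):
    """For each kernel offset, list the (input_index, output_index) pairs it connects.

    Only occupied coordinates are visited, so empty space costs nothing.
    """
    index_of = {coord: i for i, coord in enumerate(coords)}
    kernel_map = {}
    for offset in offsets:
        pairs = []
        for out_idx, coord in enumerate(coords):
            neighbor = tuple(c + o for c, o in zip(coord, offset))
            if neighbor in index_of:
                pairs.append((index_of[neighbor], out_idx))
        kernel_map[offset] = pairs
    return kernel_map


def sparse_conv(coords, feats, weights):
    """Scalar-feature sparse convolution; outputs live on the input coordinates.

    `weights` maps a kernel offset to its weight, so the kernel can have
    any shape and any number of dimensions.
    """
    kernel_map = build_kernel_map(coords, list(weights))
    out = [0.0] * len(coords)
    for offset, pairs in kernel_map.items():
        for in_idx, out_idx in pairs:
            out[out_idx] += weights[offset] * feats[in_idx]
    return out


# Three occupied points in a 2D grid; the third is far from the other two.
coords = [(0, 0), (0, 1), (5, 5)]
feats = [1.0, 2.0, 3.0]
# A cross-shaped kernel instead of a full 3x3 square.
cross_kernel = {(0, 0): 1.0, (0, 1): 0.5, (0, -1): 0.5, (1, 0): 0.5, (-1, 0): 0.5}
result = sparse_conv(coords, feats, cross_kernel)
```

Because coordinates are plain tuples, the same code runs unchanged in 3D or higher dimensions. The work scales with the number of occupied points times the kernel size, not with the volume of the grid.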
End-to-end Reinforcement Learning on GPUs with NVIDIA Isaac Gym
We are excited about the preview release of Isaac Gym, NVIDIA’s physics simulation environment for reinforcement learning research, which dramatically speeds up training. The environments are physically valid, allowing for efficient sim-to-real transfer. They include a robotic arm, legged robots, deformable objects, and humanoids.
Stay tuned for more in 2021! Here’s looking forward to exciting developments in AI in the new year.
I am very happy to share the news that I am joining NVIDIA as Director of Machine Learning Research. I will be based in the Santa Clara HQ and will be hiring ML researchers and engineers at all levels, along with graduate interns.
I will be continuing my role as Bren professor at Caltech and will be dividing my time between northern and southern California. I look forward to building strong intellectual relationships between NVIDIA and Caltech. There are many synergies with initiatives at Caltech such as the Center for Autonomous Systems (CAST) and AI4science.
I found NVIDIA to be a natural fit and it stood out among other opportunities. I chose NVIDIA because of its track record, its pivotal role in the deep-learning revolution, and the people I have interacted with. I will be reporting to Bill Dally, the chief scientist of NVIDIA. In addition to Bill, there is a rich history of academic researchers at NVIDIA such as Jan Kautz, Steve Keckler, Joel Emer, and recent hires Dieter Fox and Sanja Fidler. They have created a nourishing environment that blends research with strong engineering. I am looking forward to working with CEO Jensen Huang, whose vision for research I find inspiring.
The deep-learning revolution would not have happened without NVIDIA’s GPUs. The latest Volta GPUs pack an impressive 125 teraFLOPS and have fueled developments in diverse areas. The recently released NVIDIA Tesla T4 GPU is the world’s most advanced inference accelerator and NVIDIA GeForce represents the biggest leap in performance for graphics rendering since it is the world’s first real-time ray tracing GPU.
As many of you know, NVIDIA is much more than a hardware company. The development of CUDA libraries at NVIDIA has been a critical component for scaling up deep learning. The CUDA primitives are also relevant to my research on tensors. I worked with NVIDIA researcher Cris Cecka to build extended BLAS kernels for tensor contraction operations a few years ago. I look forward to building more support for tensor algebraic operations in CUDA which can lead to more efficient tensorized neural network architectures.
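For readers unfamiliar with tensor contraction, here is a minimal plain-Python sketch of the connection to BLAS-style matrix multiplication. It is only an illustration of the easy case, not the extended BLAS kernels mentioned above: the function names are hypothetical, and the particular contraction C[m][n][p] = sum over k of A[m][k] * B[k][n][p] was chosen because it flattens into a single matrix multiply without reordering any data.

```python
def contract(A, B):
    """Direct contraction: C[m][n][p] = sum_k A[m][k] * B[k][n][p]."""
    M, K = len(A), len(A[0])
    N, P = len(B[0]), len(B[0][0])
    return [[[sum(A[m][k] * B[k][n][p] for k in range(K))
              for p in range(P)]
             for n in range(N)]
            for m in range(M)]


def matmul(A, B):
    """Plain matrix multiply, standing in for a BLAS GEMM call."""
    columns = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in A]


def contract_via_gemm(A, B):
    """Same contraction: flatten (n, p) into one axis, multiply, reshape back."""
    K, N, P = len(B), len(B[0]), len(B[0][0])
    B_flat = [[B[k][n][p] for n in range(N) for p in range(P)]
              for k in range(K)]
    C_flat = matmul(A, B_flat)
    return [[row[n * P:(n + 1) * P] for n in range(N)] for row in C_flat]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[[1, 0], [0, 1]],
     [[2, 0], [0, 2]],
     [[3, 0], [0, 3]]]
C_direct = contract(A, B)
C_gemm = contract_via_gemm(A, B)
```

Here the contracted index is the leading index of B, so the flattening is just a change of view. Other index orderings would need the data to be rearranged first, a case this sketch does not cover.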