10 things that happened last decade

  1. Got a faculty position at UC Irvine during the recession. Attended N(eur)IPS conference for the first time.
  2. Discovered the power of tensors for machine learning and it became my core research focus.
  3. Received many awards, including the Sloan Fellowship, Microsoft Faculty Fellowship, and NSF CAREER Award.
  4. Moved to AWS as principal scientist to build some of the first cloud AI applications. Got to deploy tensor algorithms in production.  
  5. Became the youngest named chair professor at Caltech, the highest honor bestowed on individual faculty.
  6. Started the ML research group at NVIDIA while continuing my work at Caltech.
  7. Founded AI4science with Yisong Yue to accelerate interdisciplinary AI research at Caltech.
  8. Shared my #meToo experiences and pushed for NeurIPS name change to help improve the climate for women and minorities in AI. 
  9. Having grandkids: academically speaking. Many of my students and mentees became faculty members and formed thriving research groups.
  10. Continue to learn and grow. Lucky to have had amazing opportunities and experiences.


Research Highlights of 2019

2019 was an interesting year in so many ways. I was able to build and solidify research programs, both at NVIDIA and at Caltech. I was able to continue working towards diversity and inclusion in AI, and saw many visible improvements (the recent incident I wrote about in my previous blog post is a notable exception). Overall, there was a lot of positivity, and it was a great way to end an eventful decade!

Before I list research highlights that I was personally involved in, here’s an overall summary for the AI field from my viewpoint. This was published on KDnuggets.

In 2019, researchers aimed to develop a better understanding of deep learning, its generalization properties, and its failure cases. Reducing dependence on labeled data was a key focus, and methods like self-training gained ground. Simulations became more relevant for AI training and more realistic in visual domains such as autonomous driving and robot learning, including on NVIDIA platforms such as DriveSIM and Isaac. Language models went big, e.g. NVIDIA’s 8-billion-parameter Megatron model trained on 512 GPUs, and started producing coherent paragraphs. However, researchers showed that these models pick up spurious correlations and undesirable societal biases. AI regulation went mainstream, with many prominent politicians voicing their support for a ban on face recognition by government agencies. AI conferences started enforcing a code of conduct and increased their efforts to improve diversity and inclusion, starting with the NeurIPS name change last year. In the coming year, I predict that there will be new algorithmic developments and not just superficial application of deep learning. This will especially impact “AI for science” in many areas such as physics, chemistry, material sciences and biology.

New book: Spectral Learning on Matrices and Tensors

The book builds up spectral methods from first principles, with applications to learning latent variable models and deep neural networks. Order your copy here.

Better Optimization Methods

Fixing training in GANs through competitive gradient descent

(Excellent blog post by Florian here). In contrast to standard simultaneous gradient updates, CGD guarantees convergence and is efficient. NeurIPS poster below:

[Poster: Competitive Gradient Descent]
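To build intuition for why anticipating the opponent helps, here is a toy sketch of my own (not the authors’ implementation): on the bilinear game f(x, y) = xy, simultaneous gradient descent-ascent spirals outward, while the CGD update (written in its closed form for this one-dimensional case) contracts toward the equilibrium.

```python
import numpy as np

def gda_step(x, y, eta):
    # Simultaneous gradient descent-ascent on f(x, y) = x * y:
    # x minimizes f, y maximizes f.
    return x - eta * y, y + eta * x

def cgd_step(x, y, eta):
    # Competitive gradient descent: each player anticipates the other's
    # move. For this bilinear game the update has the closed form below,
    # which shrinks the radius by 1/sqrt(1 + eta^2) each step.
    return (x - eta * y) / (1 + eta**2), (y + eta * x) / (1 + eta**2)

x1 = y1 = x2 = y2 = 1.0
for _ in range(500):
    x1, y1 = gda_step(x1, y1, 0.1)
    x2, y2 = cgd_step(x2, y2, 0.1)

print(np.hypot(x1, y1))  # GDA: radius grows
print(np.hypot(x2, y2))  # CGD: radius shrinks toward 0
```

The bilinear game is the standard minimal example of GAN-style cycling; the general CGD update in the paper solves a local bilinear approximation of the game at every step.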

Application of CGD to GAN training, demonstrating its implicit competitive regularization (NeurIPS workshop):

[Poster: Implicit Competitive Regularization]

Guaranteed convergence for SignSGD

SignSGD compresses each gradient coordinate to a single bit, yet suffers no significant loss in accuracy in practice. Theoretically, there are convergence guarantees. Paper. Main theorem:

[Image: main convergence theorem for SignSGD]
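The update rule itself is tiny. Below is a minimal single-machine sketch (an illustration, not the paper’s algorithm verbatim): only the sign of each gradient coordinate is used, so a coordinate of the gradient can be communicated in one bit.

```python
import numpy as np

def signsgd_step(params, grad, lr=0.01):
    """One signSGD update: move each coordinate by lr times the
    sign of its gradient, ignoring the gradient's magnitude."""
    return params - lr * np.sign(grad)

# Minimize f(x) = ||x||^2, whose gradient is 2x.
x = np.array([3.0, -2.0, 1.5])
for _ in range(200):
    x = signsgd_step(x, 2.0 * x, lr=0.02)
# Each coordinate marches toward 0 in fixed-size steps and then
# oscillates within one step size (0.02) of the optimum.
```

The fixed step size is why signSGD ends up hovering near, rather than exactly at, the minimum; learning-rate schedules address this in practice.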

Generative Models

An exciting collaboration between ML and neuroscience with Doris Tsao at Caltech. Adding a feedback generative model to convolutional neural networks significantly improves robustness in tasks such as unsupervised denoising. Short paper here.


Robust Learning in Control Systems

Neural Lander

This is the first work to successfully demonstrate the use of deep learning to land drones with stability guarantees. A collaboration under CAST at Caltech. Paper at ICRA 2019.

Robust regression for safe exploration

We address the following question: how can we extrapolate robustly from training data in real-world control tasks and achieve end-to-end stability guarantees in safe exploration? Paper.

Multi-modal learning for UAV navigation

Multi-modal fusion of vision and IMU improves robustness in navigation and landing. Paper.

Generalization in ML

Detecting hard examples through an angular measure


Watch my GTC-DC talk. Angular alignment is a robust measure of hardness: easier examples align more closely with the target class. We found that the correspondence between the angular measure and human selection frequency was statistically significant. This improves self-training in domain adaptation. Paper.
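As an illustrative stand-in for the idea (the paper defines the precise measure), alignment can be scored as the cosine of the angle between an example’s feature vector and its target-class direction, e.g. the class-mean feature:

```python
import numpy as np

def angular_alignment(feature, class_direction):
    """Cosine of the angle between an example's feature vector and
    the direction of its target class. Larger values (smaller angles)
    suggest an easier, more prototypical example."""
    return feature @ class_direction / (
        np.linalg.norm(feature) * np.linalg.norm(class_direction))

class_dir = np.array([1.0, 0.0])            # e.g. mean feature of the class
easy = angular_alignment(np.array([0.9, 0.1]), class_dir)
hard = angular_alignment(np.array([0.1, 0.9]), class_dir)
# easy > hard: the well-aligned example is the easier one
```

Ranking examples by such a score gives a cheap hardness estimate, which is what makes it useful for selecting pseudo-labels in self-training.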

Regularized learning for domain adaptation

A new domain adaptation algorithm to correct for label shift. Paper.

Our ability to fix bias in deep-learning algorithms

Twitter thread here


Neural Programming

Recursive neural networks with external memory

Recursive networks have compositionality and can extrapolate to unseen instances.

Extrapolation to harder instances (higher tree depth) is challenging.

We show that augmenting with external memory stacks significantly improves extrapolation. Paper

Open Vocabulary Learning on Source Code with a Graph–Structured Cache

Use of syntax trees in program code to handle an unbounded vocabulary. Paper



Reinforcement Learning and Bandits

Robust off-policy evaluation

Robust methods to handle covariate shift in off-policy evaluation. Paper

Stochastic Linear Bandits with Hidden Low Rank Structure

Low regret algorithms for discovering hidden low rank structures in bandits. Paper


Competitive Differentiation for Lagrangian Problems

Many RL problems have constraints, leading to a Lagrangian formulation. Paper

Context-based Meta RL with Structured Latent Space

AI4science and Other Applications

Neural ODEs for Turbulence Forecasting

Turbulence modeling is a notoriously hard problem. Exciting initial results were presented at a NeurIPS workshop.


Tackling Trolling through Dynamic Keyword Selection Methods

Can we find social media trolls and create a more trustworthy environment online? We presented our ongoing study of the #meToo movement, in collaboration with Michael Alvarez from the social sciences at Caltech.



Quantum Supremacy = Racism?

Can a name change provide a quantum leap in diversity and inclusion?

I didn’t expect to be writing this blog post. I had hoped for an internet-free week of vacation with my family in India. I had hoped to disconnect and rejuvenate. I had hoped to write a year-end message with the positive changes I saw this year (which I still plan to do). Alas! My plan to quickly browse Facebook turned into a Twitter+Reddit storm and renewed trolling of the kind I had seen last Christmas (more on that later). So here I am, up at 5AM, writing this post so that my vacation plans remain intact.

Scott Aaronson’s blog has become hugely popular over the years. I started reading it as a graduate student and quickly became a fan! When I started blogging, I aspired to mirror the clarity and accessibility that he brought to a complex scientific field. So thank you, Scott, for providing the public with this valuable resource!

The saga started a few weeks ago when I saw Scott Aaronson’s blog post on quantum supremacy. I commented on Scott’s Facebook feed, having been his Facebook friend for many years. I didn’t take any stance on renaming quantum supremacy. Instead, I focused on my experiences with the renaming of NIPS (Neural Information Processing Systems) to NeurIPS last year. I shared many resources: here and here.

I saw many parallels and hoped that the resources would guide Scott and others towards enlightenment, and that would be the end of the story. Sadly, what followed was a big “punch in the stomach”, and that too has parallels with my experiences around the NeurIPS name change movement.

But before that, some background. I learnt that the term “quantum supremacy” was first coined by John Preskill, my wonderful colleague at Caltech. Coincidentally, even NIPS was started at Caltech, by Yaser Abu-Mostafa and Ed Posner in 1987. Read about its wonderful backstory involving Richard Feynman, John Hopfield and Carver Mead. I have deep admiration for these trailblazers at Caltech, and they inspire me every day to innovate and do pathbreaking research.

When I first heard the term “quantum supremacy”, my brain quickly associated it with “white supremacy”. This is natural, since that is the most common use of the term. To demonstrate that I am not alone in this, I did a quick Google news search. I went into incognito mode to remove my personalized recommendations, and used a VPN to connect to the California region, since that would be my default location. Here’s what came up:

I got to know from Scott’s blog post that Leonie Mueck had first written about the problematic nature of the term. I had to Google to find the article. Scott didn’t bother to link to it, a common courtesy even when you disagree with its contents, as I am doing here. (This seemingly minor slight became part of a consistent pattern of not giving credit to women, as I personally experienced. More on that later.) Updated with a better link to the discussion on quantum supremacy, thanks to a young researcher, which Scott failed to provide on his blog.

Following my Facebook comments, Scott wrote a weak response to the effect of “NIPS became a problem only because of juvenile T-shirts, and the quantum field does not have this problem”. He forgot one important word though: “yet”. In any case, I did not want to get entangled (pun intended) in another name-change issue. I hoped that he would read the articles I sent and would see how a name change can bring about many positive changes for women and minorities.

Instead, in the middle of my vacation, I saw a guest post on Scott’s blog by the notorious Steven Pinker!!! Pinker blabbers on about how the NeurIPS name change was a severe misstep and blames it for the public not believing in climate change! I won’t bother to disentangle this hot mess. Feel free to read it at your own risk. This same Pinker had attempted to get me banned from Twitter at the height of the NeurIPS name change, because I disagreed with him. He had directed his followers to extensively troll me. I even received gun threats and was worried for my personal safety.

Back to Scott’s blog: conspicuously missing from the post was any attribution to me for bringing the NeurIPS name-change issue to his attention in the first place. I proceeded to comment on his Facebook post. Scott came up with: “I don’t mention people who are not public figures!” I found this terribly demeaning. My blog, my Twitter feed, and the extensive news articles revealed by a simple Google search would prove otherwise. But sadly, Scott did not bother to do any of that, or offer to update his blog after I raised this issue.

In my comments, I also brought up his amazingly accomplished wife, Dana Moshkovitz. We share a bond: we were both victims of sexual harassment by Yuval Peres. (See Lior’s brave blog on this here.) I have deeply admired her bravery in talking about her experiences more than a decade ago. Back then, it was a severe taboo and #meToo had not happened yet. Dana had reported Yuval to Microsoft HR and sadly saw no justice. I asked Scott to discuss the NeurIPS name change with Dana, since I was sure that she would provide a different perspective. (Dana has written a detailed, gracious email to me. I will be reaching out to ask her what aspects, if any, could be made public.)

Following my comment, to my dismay, Scott proceeded to unfriend me on Facebook. His justification: we were never friends in the first place! He also did not approve my follow-up comments on his blog for more than a day. Instead, he commented extensively, disparaging me on his own post. It does not deserve to be repeated here. I allude to some of it in my Twitter thread, where there is a thriving discussion. The majority of responses on Twitter have been overwhelmingly supportive, and this reinforces my belief that I am on the right side of history.


Here are the parallels I see between the NIPS name change and quantum supremacy:

  • Initially, I myself didn’t think NIPS was a big deal. I started attending when it was a small, intimate affair: we discussed posters until midnight, skied during workshops and attended the famous “Gatsby” party on the last day. People like myself who are entrenched in the field are the wrong people to make decisions on the name change. I changed my mind after I learned that young women had a problem and did not feel comfortable attending an event with this name. So it is natural that Scott does not see an issue with the term “quantum supremacy”.
  • NIPS especially became a problematic word as AI expanded and the event saw a huge influx of “techbros”. Scott’s defense is that this is not an issue in the quantum field. But this will soon change. There is now immense public interest, and news articles on quantum supremacy appear almost daily, especially following the recent Google announcement. Changing the name pre-emptively, before the community explodes, prevents harassment of minorities and sends a strong message of commitment to inclusion and to creating a welcoming environment for everyone.
  • A Google search is a good indicator for assessing our internal biases and our perception of different words. I used this technique when I started my #protestNIPS petition. It doesn’t matter what kind of linguistic expertise Pinker has (which is debatable); what matters is the common perception of the term. A screenshot of the Google image search I did for the petition. As Scott points out, his candy does show up in the search. Male nipples also show up. But what about the majority? Similarly, my Google news search on “supremacy” that I posted earlier refers mostly to white supremacy.
  • Women and minorities experience an undue emotional burden in our science and tech communities, especially in computer science, where the number of women is minuscule. A name change has no negative impact on the privileged class, and only positive impact on the marginalized classes. So why not just do it? Scott’s attempts to silence me and other women send the negative message that their opinions do not matter and that they are not welcome here.
  • If a name change is a minor, frivolous matter, why are intellectual men like Aaronson and Pinker so vehemently opposed to it? Pinker’s reasoning that the name change has made us a laughing stock is sheer lunacy! Who is laughing at us? The joke is on them.

Here are all the positive changes that came about following the name change.

  • A code of conduct has been established for all participants, and especially for corporate sponsors. The infamous corporate parties of 2017 and before, with misogynistic rappers, sexualized dancers with a stripper vibe, TITS parties, and Elon Musk making juvenile jokes about “nips and tits”, are thankfully a thing of the past. Never again! Read about them here.
  • Celeste Kidd got a standing ovation at the opening session of NeurIPS 2019. She gave an excellent talk on belief formation and idea validation. True to her trailblazing nature, she ended the talk with a discussion of the #meToo movement and the role of men. See here. She has been a brave advocate and was amongst the Silence Breakers who were named Time Person of the Year in 2017. Such a reception would have been inconceivable a few years ago.
  • Diversity and inclusion chairs are now part of the organizing committee of every major AI conference. (Proud to be D&I chair for ICLR 2020, which will be the first international AI conference held on the African continent, in Ethiopia.)
  • Diversity and inclusion efforts have expanded vastly to other groups such as Black in AI, Latinx in AI, and Queer in AI. The joint poster session for the affinity groups this year at NeurIPS was a highlight and was well attended by everyone, beyond the group members. (As an NVIDIAN, I was proud to see that NVIDIA sponsored all four groups and invited their members to our networking lunch. Slides from the lunch here.)
  • NeurIPS socials were organized for the first time by volunteers. In contrast to the toxic corporate events I mentioned earlier, this was a sea of positivity. I attended “BUDS”, where I helped mentor budding researchers. Being among young people with so much passion continues to inspire me. They also gave me a nice personalized gift.

I am the first to concede that it is hard to establish a causal link, since controlled experiments were not carried out (wink). But based on the thousands of responses I have received (both written and oral), the #protestNIPS name change has been a major catalyst in improving the climate for women and minorities in AI. This is beyond dispute.

Last year, a similar attempt to ruin my holidays was carried out by another famous bully, when I asked him to be a better leader and moderate his feed, which included the n-word and jokes about making sex tapes. You can read about this in the blog post by the brave Layla El Asri here. I have a theory on the neuro-sis (!) of these powerful men of science spreading toxicity during the holiday season. “An empty mind is a devil’s workshop”, and free time during the holidays brings out the worst in them. These men have a constant need for attention and deep insecurities. I hope they focus on their own wellness and explore yoga, meditation and other ways to connect with their inner selves. This will make the world a far better place.

Scott’s role as an unofficial spokesperson for an entire scientific discipline comes with great responsibility. I am sad to say that he has failed miserably at this with the recent events. I hope he learns from this and becomes an ally. His daughter would greatly benefit (I know this since he brought her up in his blog post as a defensive shield).

Finally, I want to comment on the previous title I chose for this post. I used the term “neo-Nazis” since it is closely associated with white supremacy. By no means am I equating the violence perpetrated by neo-Nazis with the debate around quantum supremacy. But I want to be inclusive and I am open to feedback. When a group of people felt uncomfortable with the title, I decided to discard it. Unlike Pinker and Aaronson, I am open to changing names and titles if that helps create a better environment! Let us hope we can all find common ground.

Happy holidays everyone! Here’s hoping for a better 2020.

Thank you all!

The Thanksgiving weekend has been a much-needed staycation. Given how much I travel the rest of the year, I feel lucky to have avoided all the weather-related chaos this weekend. I even got to see the rare view of snow-capped mountains juxtaposed with palm trees as I worked from my office at Caltech.


I got time to reflect on the many things I am thankful for. I am lucky to be part of two amazing communities, at Caltech and NVIDIA. This year I have deepened my professional relationships at both places, made many friends, and found wonderful, supportive mentors. I am thankful for amazing colleagues from whom I have learnt so much. I have an amazing team of researchers at both places. Their curiosity and passion continue to inspire me.

I am thankful for all the support and encouragement I have received when I have spoken about the need for better diversity and inclusion in our tech communities. Having allies and raising awareness has been incredibly fulfilling. I am especially thankful to those who had the humility and honesty to tell me that they were wrong or unaware, and that I had changed their mind. I hope you can pass it along. Together we can build a healthy community where everyone can thrive.

I am so thankful for my family. They are my rock and have been so supportive through everything I went through this year. I am thankful for a new addition to my family: my sister-in-law Prashanthi, who displays wisdom and maturity beyond her years.

I should add that it hasn’t always been easy for me to show appreciation and communicate how thankful I am to everyone. I strive to make things better in all walks of my life, both personal and professional. This means my mind is attuned to finding gaps, calling them out, and trying to fix things. I believe that it is possible to strike a balance: being thankful for the present while striving for a better tomorrow.

Have a great Thanksgiving weekend!

#meToo: the story that never ends

Trigger Warning. Description of sexual harassment, physical intimidation and misogynistic comments below.

You will find below a small subset of my nightmarish experiences at the hands of an abusive manager at a previous organization. By no means was my experience isolated. In fact, it was the norm. Every senior woman on my team has since left (as have many men I respected and admired). The ones who have advanced have been primarily men with qualities similar to the one I describe below. This behavior is a feature, not a bug, and you will find it all the way up to the topmost echelons at this place.

He came into the room charging at me with rage in his eyes. I could feel the wind sweep into my face as he waved his arms angrily and hovered over me. He invaded my personal space in a matter of seconds. At the last minute he withdrew, but his anger was magnified even more. He accused me of trying to destroy a billion-dollar business. None of that was true. None of it made any sense. It was just another bout of his mental illness. It was just another day in my life with him as my manager.

His kind of psychopathy is not uncommon in the tech world. What’s worse, it is simply attributed to being “on the spectrum”. Men like him are geniuses; Gods who demand the blood sacrifice of the people working for them. They derive their power from harassment and intimidation. Men like him not only get away with it but also get promoted.

“There is an entire dossier on him. I was brought in with specific instructions to baby-sit him,” his secretary told me. “He routinely stares at my boobs. I can’t do anything about that.” She also shared how he had driven a junior employee to such extreme trauma that she worried the employee had become suicidal.

I later found out that in his previous roles he had driven out co-founders. Every one of them wanted to stay away and was terrified of ever running into him. “I still get nightmares,” shared a guy who was physically far more imposing than my harasser, but that didn’t matter. My harasser knows how to get into someone’s psyche.

He had been even worse to his students. He physically intimidated one of his students and later called the cops on him. He even slapped the poor student with a restraining order.

“People want to work with you only because of your looks. You should use that to your advantage” I could not believe what I had heard. He continued: “Go be a booth babe and take one for the team”.

Men like him have an extreme need for control. He once got angry because I needed to take a bathroom break. “My time is precious. You can go later.” He tried to control every part of my life. He would send me text messages late in the evening and get angry when I didn’t respond. He wanted to know about my dating life. He would give tons of unsolicited advice on who I should date. “It is so easy for you as a woman. There are only men around here. You can have your pick and you should enjoy it.”

Men like him want to break you and tear you apart. “You need to do better. You are not performing to my expectations.” When I pointed out that I was working 24/7 with no other life outside, he continued: “You should not work late. You should sleep well because biology is not on your side. There is a ten-year gap in how biology plays havoc on women’s looks and you know what else…” He flashed a psychopathic grin wide across his face. “You still want to have children, don’t you? You have so much working against you.”

All my complaints led to nothing other than my exit. Many others complained and supported me. He was found guilty by the company’s HR, but nothing else happened. There were so many enablers. There were people I thought were my friends. They broke my heart when they abandoned me and took his side.

I decided that such an organization did not deserve me. I moved on. Fast forward, I am now in two great organizations: Caltech and NVIDIA, that respect me and empower me.

But men like him never want to admit defeat. He continues to intimidate me at conferences. Now, he wants to visit Caltech and recruit personally. Men like him want to own everything. They want to keep conquering and controlling others. For him, visiting Caltech is a great way to grab his power back. But I will not let this happen, because I don’t want others, especially young and powerless students, to go through what I did.

I am not afraid. He may have a large conglomerate behind him, but I have the truth guarding me. That is my courage. That is my strength. And I will keep fighting until my last breath.

One of my wonderful friends shared this poem, “Invictus” by William Ernest Henley, to support me.


Out of the night that covers me,
      Black as the pit from pole to pole,
I thank whatever gods may be
      For my unconquerable soul.
In the fell clutch of circumstance
      I have not winced nor cried aloud.
Under the bludgeonings of chance
      My head is bloody, but unbowed.
Beyond this place of wrath and tears
      Looms but the Horror of the shade,
And yet the menace of the years
      Finds and shall find me unafraid.
It matters not how strait the gate,
      How charged with punishments the scroll,
I am the master of my fate,
      I am the captain of my soul.

Call for accountability of deployed AI services

Edit: I want to clarify that during my tenure at AWS, I did not work on face recognition, nor was I involved in any of the decisions to sell it to law enforcement. I added this after a journalist asked me about it.
About two weeks ago, Timnit Gebru and Margaret (Meg) Mitchell approached me with a request to sign a letter outlining scientific arguments to counter the claims made by Amazon representatives regarding their face recognition service, and calling on them to stop selling it to law enforcement.
I am one of 26 signatories, including many veteran leaders in the community such as Yoshua Bengio, one of this year’s Turing Award winners. So I am in good company 😉 In addition, numerous other groups have called for Amazon to stop selling it to the police.
Joy Buolamwini and Inioluwa Deborah Raji have done amazingly in-depth research on this topic and you can check it out at the gendershades website. So all the credit goes to them for laying this strong foundation.
When I read the letter I was happy to see careful factual arguments being made that are grounded in science. My hope is that the letter opens up a public dialogue on how we can evaluate face recognition (and other AI services), both in terms of metrics, but also the social context in which it is being deployed.
I am a former member of the AWS AI group, and I want to clarify that I have the utmost admiration for how AWS has transformed the developer ecosystem. AWS services have removed a lot of “heavy lifting” in DevOps and democratized software development. I am hoping that this letter leads to productive dialogue and that we can collectively work towards enhancing the beneficial uses of AI.
Government regulation can only come about once we have laid out technical frameworks to evaluate these systems. The gendershades paper shows how our current evaluation metrics are broken, starting with imbalanced training data. So we need a variety of ways to evaluate these systems, and we need accountability from currently deployed AI services. In short, regulation is only part of the answer, but it is badly needed.
Update: AWS released a FAQ outlining guidelines of how face recognition should be used.  Unfortunately, this does not solve anything. https://aws.amazon.com/rekognition/the-facts-on-facial-recognition-with-artificial-intelligence/

An open and shut case on OpenAI

The views expressed here are solely my own and do not in any way reflect those of my employers.

This blog post is meant to clarify my position on the recent OpenAI controversy. A few days ago, I engaged with Jack Clark, who manages communications and policy for OpenAI, on Twitter. It is hard to have a nuanced discussion on Twitter and I am writing this blog post to better summarize my thoughts. For a longer and more thorough discussion on this topic, see excellent blog posts by Rob Munro and Zack Lipton.

The controversy: OpenAI released their language model a few days ago with a huge media blitz.

My Twitter comments:

[Tweet screenshots]

When OpenAI was started a few years ago with much fanfare, its core mission was to foster openness in AI. As a non-profit, it was meant to freely collaborate with other institutions and researchers by making its patents and research open to the public. I find this goal highly admirable and important.

I also have a deep admiration for Jack Clark. His newsletter has been a great resource for the community to keep up with the latest updates in machine learning. In the past, he has pushed for more openness from the ML community. When the NeurIPS conference banned journalists from attending the workshops, he protested on Twitter and I supported his stance.


On the other hand, OpenAI seems to be making a conscious effort to move away from this open model and from its core founding principles. A few months ago, Jack Clark wrote this on Twitter:

[Tweet screenshot]

So why do I feel so strongly about this? Because I think that OpenAI is using its clout to make ML research more closed and inaccessible. I have always been a strong proponent of open source and of increasing reproducibility and accountability in our ML community. I am pushing to make these compulsory at our machine-learning conferences. See my recent blog post on this topic.

I am certainly not dismissive of AI risks. It is important to have a conversation about them, and it is important to involve experts working on this topic. But for several reasons, I believe that OpenAI is squandering an opportunity to have a real conversation and is distorting the public’s view. Some of the reasons are:

  • OpenAI is severely playing up the risks of releasing a language model. This is an active area of research with numerous groups working on very similar ideas. Even if OpenAI kept the whole thing locked up in a vault, another team would certainly release a similar model.
  • In this whole equation, it is academia that loses out the most. I have previously spoken about the severe disadvantage that academic researchers face due to lack of reproducibility and open source code. They do not have the luxury of a large amount of compute and engineering resources for replication.
  • This kind of fear-mongering about AI risks distorts science to the public. OpenAI followed a planned media strategy, providing limited access to their model to a few journalists and feeding them a story about AI risks without any concrete proof. This is not science and does not serve humanity well.

A better approach would be to:

  • Go back to the founding mission and foster openness and collaboration. Engage with researchers, especially academic researchers; collaborate with them, provide them the resources and engage in the peer-review process. This is the time-tested way to advance science.
  • Engage with experts on risk management to study the impacts of AI. Engage with economists to study the right incentive mechanisms to design for deployment of AI. Publish those studies in peer-reviewed venues.

Numerous other scientists have expressed a similar opinion. I hope OpenAI takes this feedback and acts on it.

Reproducibility Debate: Should Code Release be Compulsory for Conference Publications?

Update: Added discussion at the end based on Twitter conversations.

Yesterday, I was on the debate team at the DALI conference in gorgeous George, South Africa. The topic was:

“DALI believes it is justified for industry researchers not to release code for reproducibility because of proprietary code and dependencies.”

I was opposing the motion, and this matched my personal beliefs. I am happy to talk about my own stance, but I cannot disclose the arguments of others, since the debate was off the record (and their arguments were not necessarily their personal opinions).

Edit: Uri Shalit and I formed the team opposing the motion. I checked with him to make sure he is fine with me mentioning it. We collaboratively came up with the points below.

This topic is timely since ICML 2019 has added reproducibility as one of the factors to be considered by the reviewers. When it first came up, it seemed natural to set standards for reproducibility: the same way we set standards for a publication at our top-tier conferences. However, I was disheartened to see vocal opposition, especially from many “big-name” industry researchers. So with that background, DALI decided to focus the reproducibility debate on industry researchers.

My main reasons for opposing the motion:

  • Pseudo-code is just not enough: Anyone who has tried to implement an algorithm from another paper knows how terribly frustrating and time-consuming it can be. With complex DL algorithms, every tiny detail matters: from hyperparameters to the randomness of the machine. (It is another matter that this brittleness of DL is itself a huge cause for concern.) See the excellent talk by Joelle Pineau on reproducibility issues in reinforcement learning. In the current peer-review environment, it is nearly impossible to get a paper accepted unless all comparisons are made. I have personally had papers rejected even after we clearly stated that we could not reproduce the results of another paper.
  • Unfair to academic researchers: The cards are already stacked against academic researchers: they do not have access to vast compute and engineering resources. This is exacerbated by the lack of reproducibility. It is grossly unfair to expect a graduate student to reproduce the results of a 100-person engineering team. It is critical to keep academia competitive: we are training the next generation, and much of basic research still happens only in academia.
  • Accountability and fostering a healthy environment: As AI gets deployed in the real world, we need to be responsible and accountable. We would not allow new medical drugs into the market without careful trials. The same standards should apply to AI, especially in safety-critical applications. It starts with setting rigorous standards for our research publications. Having accessible code allows the research community to extensively test the claims of a paper. Only then can it be called legitimate science.
  • No incentives for voluntary release of code: Jessica Forde gave me some depressing statistics: currently only one third of papers voluntarily release code. Many argue that making it compulsory is Draconian. I will take Draconian any day if it ensures a fair environment that promotes honest progress. There is also the broader issue that the current review system is broken: fair credit assignment is not ensured, and false hype is unfairly rewarded. I am proud of how the AI field, industry in particular, has embraced the culture of open sourcing. This is arguably the single most important factor behind its rapid progress. Industries have an incentive to open-source large frameworks, since doing so captures a user base, but these incentives have a much weaker effect on the release of code for individual papers. It is therefore necessary to enforce standards.
  • To increase the synergistic impact of the field: Counter-intuitively, code release will move the field away from leaderboard chasing. When code is readily available, barriers to entry for incremental research are lowered, and researchers are incentivized to do “deeper” investigations of the algorithms. Without this, we are surely headed for the next AI winter.

Countering the arguments that support the motion:

  • Cannot separate code from internal infrastructure: There exist (admittedly imperfect) solutions such as containerization. But this is a technical problem, and we are good at coming up with solutions for such well-defined problems.
  • Will drive away industry researchers and slow down the progress of AI: First of all, the progress of AI does not depend on industry researchers alone. Let us not have an “us vs. them” mentality: we need both industry and academia to make progress. I am personally happy to drive away researchers who are not ready to provide evidence for their claims. This will create a much healthier environment and will speed up progress.
  • Reproducibility is not enough: Certainly! But it is a great first step. As next steps, we need to ensure usable and modular code. We need abstractions that allow for easy repurposing of parts of the code. These are great technical challenges: ones our community is very well equipped to tackle.

Update from Twitter conversations

There was enthusiastic participation on Twitter. A summary below:


Useful tools for reproducibility:

[Tweet screenshots]

Lessons from other communities:

[Tweet screenshots]

It is not just about code, but also data, replication, etc.:

[Tweet screenshots]

Disagreements:

[Tweet screenshots]

I assume that the tweet above does not represent the official position of DeepMind, but I am not surprised.

I do not agree with the premise that it is a worthwhile exercise for others to reinvent the wheel, only to find out it is just vaporware. It is unfair to academia and unfair to graduate students whose careers depend on this.

I also find it ironic that the comment states that if an algorithm is so brittle to hyperparameters, we should not trust its results. YES! That describes the majority of deep RL results that are hyped up (and we know who the main culprit is).

What happens behind closed doors: Even though there is overwhelming public support, I know that such efforts get thwarted in the committee meetings of popular conferences like ICML and NeurIPS. We need to apply more pressure to ensure better accountability.

It is time to burst the bubble on hyped up AI vaporware with no supporting evidence. Let the true science begin!


2018 in Review

This post reviews my experiences in 2018. I welcomed the year in the gorgeous beaches of Goa and am now ending it in the wilderness of South Africa. My highlights of 2018 are the following:

Joining NVIDIA: I joined NVIDIA in September and started a new research group on core AI/ML. I am hiring at full pace and have started many new projects. I am also excited about many new launches from NVIDIA over the last few months:

  1. RAPIDS: Apache-licensed open-source multi-GPU ML libraries.
  2. Clara: Platform for medical imaging.
  3. PhysX: Open-source 3D physics simulation engine.

Honor of being the youngest named chair professor at Caltech: I was one of six faculty members that Caltech recognized during the 2017–18 academic year. This is the Institute’s most distinguished award for individual faculty.

Launching AWS Ground Truth: Before leaving AWS, I was working on the Ground Truth service, which launched at the re:Invent conference in November. Data is a big barrier to the adoption of AI. The availability of a private workforce, and not just the public crowd on MTurk, will be a game changer in many applications. My team did the prototyping, along with many research projects on active learning, crowdsourcing, and building intelligence into the data-collection process.

Exciting research directions:

  1. Autonomous Systems: CAST at Caltech was launched in October 2017 to develop the foundations of autonomy. This has been an exciting new area of research for me. We got a DARPA Physics of AI project funded that infuses physics into AI algorithms. The first paper to come out of this project is the Neural Lander, which uses neural networks to improve the landing of drones while guaranteeing stability. Check out its videos here.
  2. AI4Science at Caltech: Along with Yuxin Chen and Yisong Yue, I launched the AI4Science initiative at Caltech. The goal is to do truly integrated research that brings about new advances in many scientific domains. Some great use cases are high-energy physics, earthquake detection, and spinal cord therapy.
  3. Core ML research: We have pushed for a holistic view of AI as data + algorithms + systems.
    • Active learning and crowdsourcing for intelligent data gathering that significantly reduces data requirements.
    • Neural rendering model combines generation and prediction in a single model for semi-supervised learning of images.
    • SignSGD yields drastic gradient compression with almost no loss in accuracy.
    • Symbols + Numbers: Instead of indulging in pointless Twitter debates over which is better, can we just unify both? We combine symbolic expressions and numerical data in a common framework for neural programming.
    • Principled approaches in reinforcement learning: We develop efficient Bayesian DQN that improves exploration in high dimensions.  We derive new trust-region policy optimization for partially observable models with guaranteed monotonic improvement. We show negative results for combining model-based and model-free RL frameworks.
    • Domain adaptation: We derive generalization bounds when there are shifts in label distribution between source and target. This is applicable for AI cloud services where training distribution can have different proportions of categories from the serving distribution.
    • Tensorly: The open-source framework that lets you write tensor algorithms in Python and choose any of several backends: PyTorch, TensorFlow, NumPy, or MXNet. It has many new features and is now part of the PyTorch ecosystem.
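The signSGD scheme listed above is easy to convey in code. Below is a minimal NumPy sketch of its majority-vote aggregation idea — each worker transmits only the sign of its gradient (one bit per coordinate), and the server takes a coordinate-wise majority vote. The toy gradients and learning rate are made up for illustration; this is not the full algorithm from the paper.

```python
import numpy as np

def signsgd_aggregate(worker_grads):
    """Majority-vote aggregation: each worker sends only the sign of its
    gradient (1 bit per coordinate); the server returns the coordinate-wise
    majority sign."""
    signs = np.sign(worker_grads)       # shape: (n_workers, dim)
    return np.sign(signs.sum(axis=0))   # majority vote per coordinate

# Toy example: three workers with noisy gradients of the same objective.
grads = np.array([[ 0.9, -0.2,  0.4],
                  [ 1.1,  0.1,  0.3],
                  [ 0.8, -0.3, -0.1]])
update_direction = signsgd_aggregate(grads)   # -> [ 1., -1.,  1.]

# The parameter update uses only the aggregated signs.
lr = 0.01
params = np.zeros(3)
params -= lr * update_direction
```

Note how the second coordinate, where the workers disagree, is resolved by the majority (two negatives beat one positive) — this vote is what makes the one-bit compression robust to noise across workers.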

On the academic job market: My graduating student Kamyar Azizzadenesheli has done ground-breaking work in reinforcement learning (some of which I outlined above). Hire him!

Having grandkids: academically speaking 😉 It is great to see my former student Furong Huang and my former postdoc Rose Yu thrive in their faculty careers.

Outreach and Democratization of AI: It has been very fulfilling to educate the public about AI around the world. I gave my first TEDx talk, where I shared the stage with luminaries such as His Holiness the Dalai Lama. It was special to speak to a large crowd of Chinese women entrepreneurs at the Mulan event.

2018 NYTimes GoodTech award: for raising awareness about diversity and inclusion. 2018 has been a defining year for me and for many #womeninTech. A large part of my energy went into fighting vicious sexism in our research communities. It is impossible to distill this into a few sentences. I have had to fend off numerous pushbacks, trolls, and threats. But the positive part has been truly uplifting: countless women have hugged me and said that I am speaking on their behalf, and I have found numerous male allies who have pledged to fight sexism and racism.

I want to end the year in a positive light. I hope for a great 2019! I know it is not going to be easy, but I won’t give up. Stay strong and fight for what you truly believe in!

New beginnings @NVIDIA

I am very happy to share the news that I am joining NVIDIA as Director of Machine Learning Research. I will be based in the Santa Clara HQ and will be hiring ML researchers and engineers at all levels, along with graduate interns.

I will be continuing my role as Bren professor at Caltech and will be dividing my time between northern and southern California. I look forward to building strong intellectual relationships between NVIDIA and Caltech. There are many synergies with initiatives at Caltech such as the Center for Autonomous Systems (CAST) and AI4science.

I found NVIDIA to be a natural fit, and it stood out among other opportunities. I chose NVIDIA because of its track record, its pivotal role in the deep-learning revolution, and the people I have interacted with. I will be reporting to Bill Dally, the chief scientist of NVIDIA. In addition to Bill, there is a rich history of academic researchers at NVIDIA, such as Jan Kautz, Steve Keckler, Joel Emer, and recent hires Dieter Fox and Sanja Fidler. They have created a nurturing environment that blends research with strong engineering. I am also looking forward to working with CEO Jensen Huang, whose vision for research I find inspiring.

The deep-learning revolution would not have happened without NVIDIA’s GPUs. The latest Volta GPUs pack an impressive 125 teraFLOPS and have fueled developments in diverse areas. The recently released NVIDIA Tesla T4 GPU is the world’s most advanced inference accelerator, and the new GeForce RTX cards represent the biggest leap in graphics-rendering performance in years: they are the world’s first real-time ray-tracing GPUs.

As many of you know, NVIDIA is much more than a hardware company. The development of CUDA libraries at NVIDIA has been a critical component for scaling up deep learning. The CUDA primitives are also relevant to my research on tensors. I worked with NVIDIA researcher Cris Cecka to build extended BLAS kernels for tensor contraction operations a few years ago. I look forward to building more support for tensor algebraic operations in CUDA which can lead to more efficient tensorized neural network architectures.
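To give a flavor of why tensor contractions map so naturally onto BLAS: a single-mode contraction can be lowered to one matrix multiply (GEMM) by flattening the non-contracted modes. The NumPy sketch below illustrates the idea with made-up dimensions and random data; it is not the extended kernels from that work.

```python
import numpy as np

# Contract a 3-way tensor T (I x J x K) with a matrix M (R x I)
# along the first mode, producing an R x J x K tensor.
I, J, K, R = 4, 5, 6, 3
rng = np.random.default_rng(0)
T = rng.standard_normal((I, J, K))
M = rng.standard_normal((R, I))

# Direct expression of the mode-1 contraction.
out_einsum = np.einsum('ri,ijk->rjk', M, T)

# The same contraction lowered to a single GEMM:
# flatten the non-contracted modes (J, K) of T into one dimension.
out_gemm = (M @ T.reshape(I, J * K)).reshape(R, J, K)

assert np.allclose(out_einsum, out_gemm)
```

More general contractions need transposes before the flattening step, which is exactly the memory traffic that specialized tensor kernels try to avoid.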

I admire the recent ML research that has come out of NVIDIA, including state-of-the-art generative models for images and video, image denoising, and more. The convergence of ML research with state-of-the-art hardware is happening at a rapid pace at NVIDIA. In addition, I am thrilled about developments in design and visualization, self-driving, IoT/autonomous systems, and data-center solutions at NVIDIA.

I hope to continue building bridges between academia and industry, and between theory and practice in my new role.