Reinforcement learning, a form of artificial intelligence, adaptively learns how turbulent wind can change over time and then uses that knowledge to control a UAV based on what it is experiencing in real time. (Image: The researchers)
Tech Briefs: What inspired this research?

Animashree (Anima) Anandkumar: Our research was inspired by nature: Birds, fish, and other biological organisms can sense the fluids (air or water) around them, predict how that is likely to evolve, and change their movements accordingly. This gives them agility and fluidity even under challenging conditions. That is not seen in artificial flight — we wanted to change that.

Tech Briefs: What gave you the idea of using Fourier methods?

Anandkumar: My background is in Fourier methods, and I used them in my undergraduate thesis more than two decades ago. As deep learning took off, focus shifted away from traditional methods such as Fourier transforms. Instead, learning features directly from data yielded impressive results in areas such as computer vision. However, flight control is different: There isn’t a massive amount of data available, and control needs to occur in real time. Moreover, the underlying fluid dynamics has an efficient representation in the spectral, or Fourier, domain. This inspired us to use Fourier methods in this work and combine them with reinforcement learning for online learning of turbulent conditions. We used the AI to learn the underlying model of turbulence so that it can take action based on how it thinks the wind will change.

Tech Briefs: What do you mean by “the underlying model?”

Anandkumar: I mean the underlying model of turbulence in the real world. In other words, we want to model how the wind gusts will change over time, and with that prediction, take the appropriate corrective action to stabilize the wing.

Tech Briefs: How does your method determine in advance when turbulence will occur?

Anandkumar: The Fourier transform offers an efficient, compact representation of the turbulence model, and, hence, can be learned quickly without needing a lot of data. We show this in our paper, where we compare our method, FALCON (Fourier Adaptive Learning and CONtrol), with model-free approaches, which require much more data to learn. Fourier is a good representation for learning since we can express wind motions as a combination of different frequencies. When extreme turbulence occurs, there is a change of frequency, but there is also a buildup before it occurs. So, by learning, we can predict that change.
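The idea of expressing wind motion as a combination of frequencies can be illustrated with a minimal sketch (this is an illustration of the general principle, not the FALCON implementation): take a window of wind measurements, keep only the dominant Fourier modes, and use that compact spectral model to extrapolate the signal forward in time.

```python
import numpy as np

def dominant_modes(signal, k):
    """Return the indices and values of the k largest-magnitude Fourier modes."""
    coeffs = np.fft.rfft(signal)
    idx = np.argsort(np.abs(coeffs))[-k:]  # indices of the k strongest modes
    return idx, coeffs[idx]

def predict(signal, k, steps):
    """Extrapolate the signal forward by evaluating only its dominant modes."""
    n = len(signal)
    idx, coeffs = dominant_modes(signal, k)
    t_future = np.arange(n, n + steps)
    pred = np.zeros(steps)
    # Rebuild the truncated Fourier series at future time points.
    # Interior modes of the real FFT count twice (positive + negative frequency).
    for j, c in zip(idx, coeffs):
        scale = 1.0 if j == 0 or (n % 2 == 0 and j == n // 2) else 2.0
        pred += scale * np.real(c * np.exp(2j * np.pi * j * t_future / n)) / n
    return pred

# A synthetic "wind" signal built from two frequencies.
t = np.arange(256)
wind = np.sin(2 * np.pi * 3 * t / 256) + 0.5 * np.sin(2 * np.pi * 7 * t / 256)
forecast = predict(wind, k=4, steps=16)
```

The compactness is the point: a handful of coefficients summarizes the whole window, which is why such a representation can be learned quickly from limited data.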

Tech Briefs: How is reinforcement learning used?

Anandkumar: Reinforcement learning is used in a model-based setting, meaning the model of how turbulence will evolve over time is learned online, and it is used to derive the best actions to stabilize the wing.
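A minimal sketch of this model-based pattern (an illustration with a simple linear model, not the method from the paper): update a dynamics model online from each observed transition, then choose the action whose predicted next state is closest to the stabilized setpoint.

```python
import numpy as np

class ModelBasedController:
    """Illustrative model-based control loop: learn a linear dynamics model
    s' = A s + B a online via recursive least squares, then pick the action
    that minimizes the predicted deviation of the next state from zero."""

    def __init__(self, dim_s, dim_a, lam=1e-2):
        self.theta = np.zeros((dim_s + dim_a, dim_s))  # stacked [A; B]
        self.P = np.eye(dim_s + dim_a) / lam           # RLS covariance

    def update(self, s, a, s_next):
        """Recursive least-squares update of the dynamics model."""
        x = np.concatenate([s, a])
        Px = self.P @ x
        gain = Px / (1.0 + x @ Px)
        err = s_next - self.theta.T @ x
        self.theta += np.outer(gain, err)
        self.P -= np.outer(gain, Px)

    def act(self, s):
        """Choose the action whose predicted next state is closest to zero."""
        A = self.theta[: len(s)].T   # learned state dynamics
        B = self.theta[len(s):].T    # learned action effect
        # Least squares: minimize || A s + B a ||^2 over the action a.
        a, *_ = np.linalg.lstsq(B, -A @ s, rcond=None)
        return a
```

The key contrast with model-free reinforcement learning is that the transition model is estimated explicitly and actions are derived from it, which is far more sample-efficient.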

Tech Briefs: What are some of the challenges when training a reinforcement learning algorithm in a physical turbulent environment?

Anandkumar: Reinforcement learning is mostly known for enabling AI to play games such as Atari or Go. However, unlike games, where the number of possible moves in each step is a small discrete set, in flight control the action space is continuous. Moreover, turbulent winds present a sudden and adverse change in environmental conditions. Without proper stabilization, they can lead, in the worst case, to flight crashes. Hence, this setting is safety-critical, which is not the typical setting for reinforcement learning in game-playing applications.

Tech Briefs: What are your next steps?

Anandkumar: In a recent work, we have developed a physics-informed reinforcement learning framework for drag reduction in turbulent flows. Unlike FALCON, where no physics constraints are employed, here we use knowledge of turbulent flows in the form of partial differential equation (PDE) loss functions during learning. With this, we see the ability to generalize to new, higher Reynolds numbers (i.e., stronger turbulence), whereas in the FALCON work we were limited to a fixed turbulent flow and did the learning on that. This new work shows the ability to adapt to new turbulent environments.
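A PDE loss penalizes a learned field for violating the governing equation. As a minimal illustration (using 1D Burgers' equation as a common stand-in for a turbulent-flow PDE; the specific equations from the work are not given here), one can compute a finite-difference residual of the equation over a predicted space-time field and use its mean square as an extra loss term:

```python
import numpy as np

def burgers_residual(u, dt, dx, nu=0.01):
    """Finite-difference residual of 1D Burgers' equation
    u_t + u * u_x - nu * u_xx = 0, for a space-time field u[t, x]."""
    u_t = (u[1:, 1:-1] - u[:-1, 1:-1]) / dt                         # forward in time
    u_x = (u[:-1, 2:] - u[:-1, :-2]) / (2 * dx)                     # centered in space
    u_xx = (u[:-1, 2:] - 2 * u[:-1, 1:-1] + u[:-1, :-2]) / dx**2    # second derivative
    return u_t + u[:-1, 1:-1] * u_x - nu * u_xx

def pde_loss(u, dt, dx, nu=0.01):
    """Mean squared PDE residual; zero when u exactly satisfies the equation."""
    return np.mean(burgers_residual(u, dt, dx, nu) ** 2)
```

During training, this term is added to the usual learning objective, so the learned model is pulled toward physically consistent solutions even in regimes where little data is available.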

Tech Briefs: Do you have any sense of when you might try this in an actual UAV or passenger plane?

Anandkumar: In various other works, we have tested machine learning methods on UAVs. We will soon follow up by incorporating FALCON and physics-informed approaches into UAVs.

Tech Briefs: Is there anything you’d like to add?

Anandkumar: More broadly, I have been working on building AI for physical understanding. We developed Neural Operators, a general framework for learning multi-scale physical phenomena. In addition to learning fluid dynamics, it has been successful in a wide range of domains such as weather forecasting and plasma prediction in nuclear fusion, where it is 4-6 orders of magnitude faster than traditional approaches. We have also used it to design a medical catheter that led to a hundredfold reduction in bacterial contamination, as seen in physical experiments. My recent TED talk gives an overview.