Caltech’s humanoid robot takes a stroll around campus. (Image: Academic Media Technologies/Caltech)

Introducing X1: The world's first multirobot system that integrates a humanoid robot with a transforming drone that can launch off the humanoid's back and, later, drive away. The versatile team can adjust its combined trajectories for the quickest route to the destination.

The new multimodal system is one product of a three-year collaboration between Caltech's Center for Autonomous Systems and Technologies (CAST) and the Technology Innovation Institute (TII) in Abu Dhabi, United Arab Emirates.

"Right now, robots can fly, robots can drive, and robots can walk. Those are all great in certain scenarios," said Aaron Ames, the director and Booth-Kresa Leadership Chair of CAST and the Bren Professor of Mechanical and Civil Engineering, Control and Dynamical Systems, and Aerospace at Caltech. "But how do we take those different locomotion modalities and put them together into a single package, so we can excel from the benefits of all these while mitigating the downfalls that each of them have?"

Testing the capability of the X1 system, the team recently conducted a demonstration on Caltech's campus. The demo was based on the following premise: Imagine that there is an emergency somewhere on campus, creating the need to quickly get autonomous agents to the scene. For the test, the team modified an off-the-shelf Unitree G1 humanoid such that it could carry M4, Caltech's multimodal robot that can both fly and drive, as if it were a backpack.

The demo started with the humanoid in Gates-Thomas Laboratory. It walked through Sherman Fairchild Library and went outside to an elevated spot where it could safely deploy M4. The humanoid then bent forward at the waist, allowing M4 to launch in its drone mode. M4 then landed and transformed into driving mode to continue efficiently on wheels toward its destination. Before reaching that destination, however, M4 encountered the Turtle Pond, so it switched back to drone mode, quickly flew over the obstacle, and made its way to the site of the "emergency" near Caltech Hall. The humanoid and a second M4 eventually met up with this first responder at the scene.
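Underneath that narrative is a quickest-route decision made across locomotion modes: every leg of the trip could in principle be walked, driven, or flown, at different speeds and with different obstacles ruling some modes out. The snippet below is a purely illustrative shortest-path sketch over a made-up campus graph; the waypoints, modes, and times are hypothetical, and the article does not describe X1's actual planner.

```python
# Illustrative only: choosing the quickest route when each leg can be walked,
# driven, or flown. The campus graph, modes, and times below are made up;
# the article does not describe X1's actual planning software.
import heapq

# Edges: (from, to, mode, traversal time in seconds) -- hypothetical numbers.
EDGES = [
    ("Gates-Thomas", "library exit", "walk", 120),
    ("library exit", "pond edge", "drive", 40),
    ("library exit", "pond edge", "walk", 150),
    ("pond edge", "Caltech Hall", "fly", 20),     # hop over the Turtle Pond
    ("pond edge", "Caltech Hall", "walk", 300),   # go the long way around
]

def quickest_route(start, goal):
    """Dijkstra over the multimodal graph; returns (total_time, legs) or None."""
    graph = {}
    for a, b, mode, t in EDGES:
        graph.setdefault(a, []).append((b, mode, t))
    frontier = [(0, start, [])]          # (elapsed time, node, legs so far)
    settled = set()
    while frontier:
        time, node, legs = heapq.heappop(frontier)
        if node == goal:
            return time, legs
        if node in settled:
            continue
        settled.add(node)
        for nxt, mode, t in graph.get(node, []):
            if nxt not in settled:
                heapq.heappush(frontier, (time + t, nxt, legs + [(mode, nxt)]))
    return None

print(quickest_route("Gates-Thomas", "Caltech Hall"))
# -> (180, [('walk', 'library exit'), ('drive', 'pond edge'), ('fly', 'Caltech Hall')])
```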

"The challenge is how to bring different robots to work together so, basically, they become one system providing different functionalities. With this collaboration, we found the perfect match to solve this," said Mory Gharib (PhD '83), the Hans W. Liepmann Professor of Aeronautics and Medical Engineering at Caltech and CAST's founding director.

Gharib's group, which originally built the M4 robot, focuses on building flying and driving robots as well as advanced control systems. The Ames lab, for its part, brings expertise in locomotion and developing algorithms for the safe use of humanoid robots. Meanwhile, TII brings a wealth of knowledge about autonomy and sensing with robotic systems in urban environments. A Northeastern University team led by engineer Alireza Ramezani assists in the area of morphing robot design.

"The overall collaboration atmosphere was great. We had different researchers with different skill sets looking at really challenging robotics problems spanning from perception and sensor data fusion to locomotion modeling and controls, to hardware design," said

Ramezani, an associate professor at Northeastern.

When TII engineers visited Caltech in July 2025, the partners built a new version of M4 that takes advantage of Saluki, a secure flight controller and computer technology developed by TII for onboard computing. In a future phase of work, the collaboration aims to give the entire system sensors, model-based algorithms, and machine learning-driven autonomy to navigate and adapt to its surroundings in real time.

"We install different kinds of sensors — lidar, cameras, range finders — and we combine all that data to understand where the robot is. The robot understands where it is in order to go from one point to another," said Claudio Tortorici, director of TII. "So, we provide the capability of the robots to move around with autonomy."

Ames explained that even more was on display in the demo than meets the eye. For example, he said, the humanoid robot did more than simply walk around campus. Currently, most humanoid robots are given data originally captured from human movements to achieve a particular action, such as walking or kicking, with that motion then scaled to the robot. If all goes well, the robot can imitate the action repeatedly. But, Ames argues, "If we want to really deploy robots in complicated scenarios in the real world, we need to be able to generate these actions without necessarily having human references."

His group builds mathematical models that more broadly describe the physics governing a robot and its task. When these are fused with machine learning techniques, the models imbue robots with more general abilities to navigate any situation they might encounter. "The robot learns to walk as the physics dictate," Ames said. "So X1 can walk; it can walk on different terrain types; it can walk up and down stairs, and importantly, it can walk with things like M4 on its back."
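One common way to fuse a physics model with learning, in the spirit of what Ames describes (though not necessarily his lab's actual pipeline), is to let a reduced-order model supply a nominal command and have a learned component add a correction on top. In the illustrative sketch below, the pendulum-style stepping rule, the gains, and the stand-in "learned" weights are all assumptions.

```python
# Illustrative sketch: a physics-based nominal controller plus a learned
# residual correction. The reduced-order model, the observation, and the
# stand-in "policy" weights are assumptions, not the Ames lab's method.
import numpy as np

GRAVITY, COM_HEIGHT = 9.81, 0.9          # simple pendulum-style walking model
OMEGA = np.sqrt(GRAVITY / COM_HEIGHT)    # natural frequency of that model

def model_based_step(com_pos, com_vel):
    """Nominal foot placement from a linear-inverted-pendulum-style rule."""
    return com_pos + com_vel / OMEGA     # capture-point-style target

def learned_residual(observation, weights):
    """Stand-in for a learned policy: a linear map on the observation."""
    return float(weights @ observation)

# Hypothetical learned weights (in practice these would come from training).
weights = np.array([0.02, -0.05, 0.1])

com_pos, com_vel, terrain_slope = 0.10, 0.35, 0.05
obs = np.array([com_pos, com_vel, terrain_slope])

foot_target = model_based_step(com_pos, com_vel) + learned_residual(obs, weights)
print(f"commanded foot placement: {foot_target:.3f} m")
```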

An overarching goal of the collaboration is to make such autonomous systems safer and more reliable. "I believe we are at a stage where people are starting to accept these robots," Tortorici said. "In order to have robots all around us, we need them to be reliable."

That is ongoing work for the team. "We're thinking about safety-critical control, making sure we can trust our systems, making sure they're secure," Ames said. "We have multiple projects that extend beyond this one that study all the different facets of autonomy, and those problems are really big. By having these different projects and facets of our collaboration, we are able to take on the bigger problems and really move autonomy forward in a substantial and concerted way."
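The article does not say which mathematical tools back up this notion of trust, but one standard way to make safety-critical control precise is through a certificate: a function whose conditions, when enforced by the controller, guarantee the robot never leaves a safe set. The control barrier function condition below is a textbook example offered for context, not a statement of the team's specific methods.

```latex
% One textbook form of a safety certificate: a control barrier function h.
% For a control-affine system  \dot{x} = f(x) + g(x)u  with safe set
% C = { x : h(x) >= 0 }, safety is guaranteed if the controller always
% satisfies the constraint below.
\[
  \sup_{u}\Big[\nabla h(x)^{\top}\big(f(x) + g(x)\,u\big)\Big] \;\ge\; -\alpha\big(h(x)\big),
\]
% where \alpha is an extended class-K function. Any controller that enforces
% this inequality keeps the state inside C for all time (forward invariance).
```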

Transcript

00:00:00 [Music] So in this demo we had three key components. We had locomotion, flight and driving. When is the sum of the system greater than its parts? [Music] Historically my lab has done a lot of work with model and physics based methods for controlling robots: things like walking and locomotion purely using

00:00:30 physics based models to encode controllers. And the really important part about models is they're deterministic. You can get a certificate and in particular a proof, a mathematical proof, a theorem and a proof that your robot will do the right thing always. So how do we bring together different mobility types that robots have? So

00:00:49 right now robots can fly, robots can drive, and robots can walk. Those are all great in certain scenarios, but how do we take those different locomotion modalities and put them together into a single package so we can excel from all the benefits of these while mitigating the downfalls that each of them have. So the context is we want to get

00:01:12 somewhere on Caltech. Let's say there is an emergency and we want to get there quickly with an autonomous agent. So how are we going to do that? Well, we have M4 and we have the humanoid. So the humanoid needs to carry M4 to a deployment zone and the humanoid is very good at walking indoors and in buildings. So it first leaves Gates Thomas,

00:01:34 walks through the library, gets to an elevated area, the launching pad on the other side of the library. And at this point, we can now separate the team. And so the M4 takes off of the humanoid, switches to driving mode, and then starts to head towards its goal. Now, when it's driving there, it

00:02:02 realizes there's a pond in the way. You can't drive over the pond. So, you transform to flight mode and head over the pond in flight mode. [Music] [Applause] [Music] So, we had two M4s. We'll call them red M4 and white M4. The red one already existed and that's

00:02:34 what we had been sort of practicing on. The white one is the one that was built collaboratively with TII in the couple of weeks that they were here. So it was a brand new M4, and that one has special things about it. For example, the compute that's happening is on something called the Saluki that TII developed from scratch themselves. So the Saluki is sitting on the M4 and the

00:02:55 M4 of course was designed at Caltech. So it really was a fusion of Caltech and TII together. [Applause] [Music] At the same time, the humanoid ultimately wants to end up at the same location. It's a little bit slower outside than a flying drone, but it will walk there and it will get there

00:03:13 eventually. So, it needs to handle some things, like walking around the long way, maybe taking some stairs along the way. We had to realize a whole new form of locomotion on a humanoid robot that's never been realized before. If we want to really deploy robots in complicated scenarios, we need to be able to generate these actions without

00:03:39 necessarily having human references. And instead of using human data, take these models and imbue the robots with the intelligence of the models using modern machine learning methods coupled with the models and fuse them together. And then we learn to walk as the physics dictate. So these are the areas where a humanoid can really excel, handling terrain like

00:04:01 that. But ultimately it will get to that final location. They'll meet, and together they can solve the problem. So the goal was to show all those pieces working in totality in the demo. When you have an idea of something new, you push it not just in your lab, but in a demo. So we push it outdoors into real world

00:04:21 environments and stress test it and see how well it works. And when it actually works, it's pretty amazing. [Music]