Motion Control

How Time-Aware Control Makes Robots Faster, Gentler, and Smarter

Treating time as an explicit dimension of control enables robots to dynamically balance speed, precision, and robustness. By modulating how quickly time "elapses" internally, a single learned policy adapts seamlessly from fast, aggressive actions to slow, careful manipulation—no retraining required. The result is more efficient motion, improved sim-to-real performance, resilient recovery from disturbances, and intuitive high-level control over robot behavior, pointing toward a new generation of time-aware motion control.



Transcript

00:00:02 Time is the most fundamental dimension of intelligent behavior. When time is urgent, we prefer fast actions like throwing, sliding, or grasping while moving. When more time is available, we can perform deliberate actions like pouring, stacking, or closing quietly. Being punctual is essential in our daily activities and time serves as a crucial

00:00:30 signal that controls our behavior. [bell]
>> Oh my god, I'm late.
>> We propose time policy learning, a reinforcement learning framework that enables robots to explicitly reason and act with time. Traditionally, a time unaware robot policy only takes in task observations and optimizes solely for the task objective.

00:00:57 Our timeaware policy makes robots explicitly aware of time, ensuring they complete tasks both efficiently and punctually. We also introduce a time ratio parameter that controls how fast the robot internally perceives time elapsing. When a large time ratio is assigned internal time elapses faster,

00:01:23 encouraging the robot to act more efficiently. When a small time ratio is assigned internal time elapses slower, enabling more cautious robot behavior. Therefore, a single policy can dynamically adapt between rapid and precise execution, balancing efficiency, punctuality, and robustness. Time unaware policies typically optimize

00:01:48 only for accumulated reward. These policies often waste time on meaningless actions introduced by the complexity of reward shaping in the long horizon task. In contrast, timeaware policies dramatically improve efficiency by performing dynamic skills like throwing and sliding or executing multiple actions in parallel. In this pouring example, the time aware

00:02:17 policy initiates flow by swinging the cup during transport, shortening the overall pouring time. In this drawer opening task, it simultaneously closes the gripper and begins pulling, eliminating idle waiting after contact and improving manipulation efficiency. Another challenge in deploying time unaware policies to the real world is

00:02:46 the sim tore gap. In this example, the policy learned in simulation can stack cubes easily, but real world contact dynamics cause it to fail. Although we can slow the policy's motion through joint interpolation, this doesn't reduce the instability arising from object dynamics and the policy still doesn't know how to stack carefully.

00:03:09 In contrast, we can simply provide more time for the time policy, enabling more cautious behaviors. With more time available, the policy achieves more careful movements and generates lower manipulation noise. Timeaware policies also generalize to unseen scenarios. In the same pouring task, we increased the number of beans to 40.

00:03:39 While during training, the policy only saw scenarios with one bean. Both the time unaware policy and its joint interpolation variant fail to adapt to this change and pour aggressively remaining biased toward the training scenario. However, the time aware policy easily completes the task by pouring beans sequentially and gently.

00:04:05 In this drawer opening task, we added more weight in the drawer than during training. The time unaware policy always pulls aggressively causing controller oscillation and frequent interruptions. The time-aware policy in contrast generates gentle pulling actions while remaining punctual. Similarly, when given more time, the actions become even gentler.

00:04:40 Time awareness is also essential for multi- aent collaboration. Since time unaware policies cannot adjust their behavior based on another agents changes, they often fail by delivering either too early or too late. Timeare policies provide high adaptability and punctuality across various scheduled time changes. They can also report remaining

00:05:05 completion time to other agents for better planning. Moreover timeaware policies demonstrate strong resilience to random disturbances. They quickly recover and move faster to compensate for time loss during recovery. Here we show different human interventions during the delivery.

00:05:33 Finally, timeaware policies enable real-time behavior control. Instead of always performing manipulations at a constant speed, users can adjust the policy with huristic stage-wise control, acting more cautiously during contact witch stages like stacking while acting fast during other stages like approaching.

00:05:56 Multiple stage control can be easily achieved through this interface. In this example, the robot is precommanded to quickly grasp the handle while slowly pulling the drawer out to avoid controller oscillation and reduce manipulation noise. This control strategy introduces higher stability while maintaining punctuality. In some cases, the policy might exhibit

00:06:32 suboptimal behaviors. For example, at this moment, the beans got stuck due to overly gentle actions. Our online control interface allows users to intervene, making the policy pour harder to complete the task. This interface enables users to align the policy with their preferences or correct its behavior for new environments.

00:06:59 Another example here is we can command the robot to approach quickly while gently pulling at the beginning, then gradually increase the pulling speed until the task is completed. All of this control is achieved through a single time ratio command during test time without any fine-tuning. Users can adjust the robot's behavior through high-level commands. No

00:07:25 low-level teley operation needed. Overall, we believe time awareness isn't just a feature. It's a fundamental dimension of robot intelligence that brings us closer to truly adaptive collaborative machines. Thanks for watching.