Smarter Motion Starts with Time Awareness

Time has long been embedded in trajectory generation and control loops, but typically remains implicit at the decision-making level. This work introduces time-aware policies that make timing an explicit part of high-level behavior, allowing robots to adjust how motions are executed—from fast and efficient to slow and precise—without changing the underlying controller. The result is improved coordination, robustness, and task efficiency. For motion control practitioners, it highlights an emerging layer where timing is no longer just scheduled, but actively reasoned about.



Transcript

00:00:02 Time is the most fundamental dimension of intelligent behavior. When time is urgent, we prefer fast actions like throwing, sliding, or grasping while moving. When more time is available, we can perform deliberate actions like pouring, stacking, or closing quietly. Being punctual is essential in our daily activities and time serves as a crucial

00:00:30 signal that controls our behavior.
>> Oh my god, I'm late.
>> We propose time policy learning, a reinforcement learning framework that enables robots to explicitly reason and act with time. Traditionally, a time unaware robot policy only takes in task observations and optimizes solely for the task objective. Our time aware policy makes robots

00:01:03 explicitly aware of time, ensuring they complete tasks both efficiently and punctually. We also introduce a time ratio parameter that controls how fast the robot internally perceives time elapsing. When a large time ratio is assigned, internal time elapses faster, encouraging the robot to act more efficiently. When a small time ratio is assigned, internal time elapses slower,

00:01:31 enabling more cautious robot behavior. Therefore, a single policy can dynamically adapt between rapid and precise execution, balancing efficiency, punctuality, and robustness. Time unaware policies typically optimize only for accumulated reward. These policies often waste time on meaningless actions introduced by the complexity of reward shaping in the long horizon task.

00:02:00 In contrast, timeaware policies dramatically improve efficiency by performing dynamic skills like throwing and sliding or executing multiple actions in parallel. In this pouring example, the time aware policy initiates flow by swinging the cup during transport, shortening the overall pouring time. In this drawer opening task, it

00:02:32 simultaneously closes the gripper and begins pulling, eliminating idle waiting after contact and improving manipulation efficiency. Another challenge in deploying time unaware policies to the real world is the sim tore gap. In this example, the policy learned in simulation can stack cubes easily, but real world contact dynamics cause it to fail. Although we

00:02:57 can slow the policy's motion through joint interpolation, this doesn't reduce the instability arising from object dynamics and the policy still doesn't know how to stack carefully. In contrast, we can simply provide more time for the time policy enabling more cautious behaviors. With more time available, the policy achieves more careful movements and

00:03:21 generates lower manipulation noise. Timeaware policies also generalize to unseen scenarios. In the same pouring task, we increased the number of beans to 40. While during training, the policy only saw scenarios with one bean. Both the time unaware policy and its joint interpolation variant fail to adapt to this change and pour aggressively remaining biased

00:03:52 toward the training scenario. However, the time policy easily completes the task by pouring beans sequentially and gently. In this drawer opening task, we added more weight in the drawer than during training. The time unaware policy always pulls aggressively causing controller oscillation and frequent interruptions. The time policy in contrast generates

00:04:21 gentle pulling actions while remaining punctual. Similarly, when given more time, the actions become even gentler. Time awareness is also essential for multi- aent collaboration. Since time unaware policies cannot adjust their behavior based on another agents changes, they often fail by delivering either too early or too late.

00:04:56 Timeare policies provide high adaptability and punctuality across various scheduled time changes. They can also report remaining completion time to other agents for better planning. Moreover, timeaware policies demonstrate strong resilience to random disturbances. They quickly recover and move faster to compensate for time loss

00:05:18 during recovery. Here we show different human interventions during the delivery. Finally, timeaware policies enable real-time behavior control. Instead of always performing manipulations at a constant speed, users can adjust the policy with huristic stage-wise control, acting more cautiously during contact witch stages like stacking while acting

00:05:50 fast during other stages like approaching. Multiple stage control can be easily achieved through this interface. In this example, the robot is precommanded to quickly grasp the handle while slowly pulling the drawer out to avoid controller oscillation and reduce manipulation noise. This control strategy introduces higher stability

00:06:19 while maintaining punctuality. In some cases, the policy might exhibit suboptimal behaviors. For example, at this moment, the beans got stuck due to overly gentle actions. Our online control interface allows users to intervene, making the policy pour harder to complete the task. This interface enables users to align the policy with their preferences or correct

00:06:54 its behavior for new environments. Another example here is we can command the robot to approach quickly while gently pulling at the beginning, then gradually increase the pulling speed until the task is completed. All of this control is achieved through a single time ratio command during test time without any fine-tuning. Users can adjust the robot's behavior through

00:07:23 high-level commands. No low-level teley operation needed. Overall, we believe time awareness isn't just a feature. It's a fundamental dimension of robot intelligence that brings us closer to truly adaptive, collaborative machines. Thanks for watching.