In today's video, Shantanu reviews the paper "Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation," by researchers from UC Berkeley and Carnegie Mellon. The paper introduces CrossFormer, a transformer-based policy that can control a wide variety of robots across different tasks, including manipulation, navigation, locomotion, and aviation.
Traditionally, robot learning has required a separate policy for each robot and task. CrossFormer breaks the mold by training on what the authors describe as the largest and most diverse dataset to date, 900,000 trajectories across 20 robot embodiments, allowing a single policy to handle everything from bimanual robots to quadcopters.
Key Highlights:
CrossFormer accepts whatever sensor inputs a robot provides and predicts actions in that robot's own action space, with no manual alignment of observation or action spaces.
The transformer-based architecture adapts to different robot embodiments and control frequencies, so a single set of weights serves all of them.
Tested on platforms including the WidowX BridgeV2, Unitree Go1, and Tello quadcopter, CrossFormer averaged a 73% success rate, matching or exceeding specialist policies trained for a single robot.
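To make the idea of one policy for many robots concrete, here is a minimal Python sketch of how mixed sensor inputs might be flattened into one token sequence and routed to an embodiment-specific action head. It is an illustration under assumptions, not the authors' implementation: the names `Token`, `tokenize`, `build_sequence`, `action_shape`, and `ACTION_DIMS`, and the action dimensions listed, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical action dimensions per embodiment class (illustrative values only).
ACTION_DIMS = {"single_arm": 7, "bimanual": 14, "navigation": 2, "quadruped": 12}


@dataclass
class Token:
    modality: str    # "image", "proprio", or a readout tag
    payload: object  # raw sensor value; a real model would embed this


def tokenize(observation: dict) -> list[Token]:
    """Flatten whatever sensors a robot provides into one token sequence."""
    tokens = []
    for name in sorted(observation):
        value = observation[name]
        if value is None:  # a missing sensor simply contributes no tokens
            continue
        modality = "image" if name.startswith("image") else "proprio"
        tokens.append(Token(modality, value))
    return tokens


def build_sequence(observation: dict, embodiment: str) -> list[Token]:
    """Append a readout token that the embodiment's action head would decode."""
    return tokenize(observation) + [Token(f"readout_{embodiment}", None)]


def action_shape(embodiment: str, chunk_length: int) -> tuple[int, int]:
    """Shape of the action chunk the matching head would emit."""
    return (chunk_length, ACTION_DIMS[embodiment])
```

In this sketch, a bimanual robot with no wrist camera and a quadruped with only joint readings both produce valid sequences for the same transformer trunk. Only the readout tag and output shape differ.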
Check out the full review to learn how CrossFormer is shaping the future of generalist robot policies. Watch now and dive into cutting-edge robot learning!
Start making your own machine learning models with an Aloha Kit
References:
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation