Robotics
Pre-training Auto-regressive Robotic Models with 4D Representations
·2752 words·13 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
ARM4R pre-trains autoregressive robotic models using low-level 4D representations from human videos, achieving efficient transfer learning and improved task performance across various environments.
Learning Getting-Up Policies for Real-World Humanoid Robots
·4423 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 University of Illinois Urbana-Champaign
HUMANUP: A novel two-stage reinforcement learning framework enables real-world humanoid robots to autonomously recover from falls on various terrains.
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
·4451 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Tsinghua University
DexTrack achieves highly generalizable neural tracking control for dexterous robot manipulation by iteratively training a controller using high-quality demonstrations refined via homotopy optimization…
Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression
·3466 words·17 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 MIT
HMA: a novel approach for generating high-quality robotic videos 15x faster, enabling real-time policy evaluation and data augmentation for scaling robot learning.
FAST: Efficient Action Tokenization for Vision-Language-Action Models
·4290 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
FAST: A novel action tokenization method using discrete cosine transform drastically improves autoregressive vision-language-action models’ training and performance, enabling dexterous and high-freque…
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
·3400 words·16 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 AgiBot
EnerVerse: A novel framework seamlessly integrates convolutional and attention mechanisms to generate embodied future spaces for enhanced robotic manipulation, mitigating data scarcity with a generati…
Training Software Engineering Agents and Verifiers with SWE-Gym
·3604 words·17 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
SWE-Gym is a publicly available environment for training real-world software engineering agents, built from 2,438 real-world Python task instances; agents trained in it achieve new state-of-the-art performance.
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
·5162 words·25 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 MIT
MoDE uses a mixture of expert denoisers to make diffusion transformer policies for multitask robot control faster and more efficient.
Large Action Models: From Inception to Implementation
·2938 words·14 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Microsoft
From language models to action models: building AI that does things.
TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning
·1675 words·8 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Princeton University
TidyBot++: Low-cost, open-source holonomic mobile base makes robot learning easier.
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
·3880 words·19 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Westlake University
CARP: A novel visuomotor policy learning paradigm achieves high accuracy and 10x faster inference than state-of-the-art by combining autoregressive efficiency and diffusion model precision through a c…
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment
·2984 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Robotics
🏢 UC Berkeley
RAPL efficiently aligns robots with human preferences using minimal feedback by aligning visual representations before reward learning.
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
·3555 words·17 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 University of Hong Kong
Moto: Bridging language for robot manipulation using latent motion tokens, achieving superior performance with limited data.
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection
·6193 words·30 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Peking University
Code-as-Monitor (CaM) uses vision-language models and constraint-aware visual programming to achieve both reactive and proactive robotic failure detection in real time, improving success rates and red…
WildLMa: Long Horizon Loco-Manipulation in the Wild
·2396 words·12 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC San Diego
WildLMa enables robots to perform complex, long-horizon manipulation tasks in unstructured environments by combining language-conditioned imitation learning, a whole-body controller for efficient tele…
Soft Robotic Dynamic In-Hand Pen Spinning
·2419 words·12 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Carnegie Mellon University
SWIFT, a new system, enables a soft robotic hand to learn dynamic pen spinning via real-world trial-and-error, achieving 100% success across diverse pen properties without explicit object modeling.
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation
·2203 words·11 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 New York University
DynaMem empowers robots with online dynamic spatio-semantic memory, achieving a 2x improvement in pick-and-drop success rate on non-stationary objects compared to static systems.
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution
·3111 words·15 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 Tsinghua University
DeeR-VLA dynamically adjusts the size of a multimodal large language model based on task difficulty, significantly reducing computational cost and memory usage in robotic control without compromising …