AI Applications

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
·2823 words·14 mins
AI Generated 🤗 Daily Papers AI Applications Autonomous Vehicles 🏢 Huazhong University of Science & Technology
RAD: 3DGS-based RL advances autonomous driving, achieving a 3x lower collision rate!
Pre-training Auto-regressive Robotic Models with 4D Representations
·2752 words·13 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 UC Berkeley
ARM4R pre-trains autoregressive robotic models using low-level 4D representations from human videos, achieving efficient transfer learning and improved task performance across various environments.
Learning Getting-Up Policies for Real-World Humanoid Robots
·4423 words·21 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 University of Illinois Urbana-Champaign
HUMANUP: A novel two-stage reinforcement learning framework enables real-world humanoid robots to autonomously recover from falls on various terrains.
FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
·2535 words·12 mins
AI Generated 🤗 Daily Papers AI Applications Finance 🏢 Harvard University
FLAG-TRADER fuses LLMs & RL for enhanced financial trading, achieving superior performance compared to traditional methods by efficiently integrating multimodal data and adapting to market dynamics.
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models
·6984 words·33 mins
AI Generated 🤗 Daily Papers AI Applications Autonomous Vehicles 🏢 NVIDIA
V2V-LLM leverages multi-modal LLMs for safer cooperative autonomous driving by fusing perception data from multiple vehicles, answering driving-related questions, and improving trajectory planning.
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
·4451 words·21 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Tsinghua University
DexTrack achieves highly generalizable neural tracking control for dexterous robot manipulation by iteratively training a controller using high-quality demonstrations refined via homotopy optimization…
Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression
·3466 words·17 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 MIT
HMA: a novel approach for generating high-quality robotic videos 15x faster, enabling real-time policy evaluation and data augmentation for scaling robot learning.
Current Pathology Foundation Models are unrobust to Medical Center Differences
·2920 words·14 mins
AI Generated 🤗 Daily Papers AI Applications Healthcare 🏢 Netherlands Cancer Institute Amsterdam
Current pathology foundation models struggle with center variations; this paper introduces a robustness index to quantify this, revealing model biases and advancing robust model development.
FAST: Efficient Action Tokenization for Vision-Language-Action Models
·4290 words·21 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 UC Berkeley
FAST: A novel action tokenization method using discrete cosine transform drastically improves autoregressive vision-language-action models’ training and performance, enabling dexterous and high-freque…
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
·3400 words·16 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 AgiBot
EnerVerse: A novel framework seamlessly integrates convolutional and attention mechanisms to generate embodied future spaces for enhanced robotic manipulation, mitigating data scarcity with a generati…
A3: Android Agent Arena for Mobile GUI Agents
·2276 words·11 mins
AI Generated 🤗 Daily Papers AI Applications Human-AI Interaction 🏢 Hong Kong University of Science and Technology
Android Agent Arena (A3): A novel evaluation platform for mobile GUI agents offering diverse tasks, flexible action space, and automated LLM-based evaluation, advancing real-world AI agent research.
Training Software Engineering Agents and Verifiers with SWE-Gym
·3604 words·17 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 UC Berkeley
SWE-Gym, a novel environment for training real-world software engineering agents using 2,438 real-world Python task instances, achieves new state-of-the-art performance and is publicly available.
LearnLM: Improving Gemini for Learning
·4335 words·21 mins
AI Generated 🤗 Daily Papers AI Applications Education 🏢 Google DeepMind
LearnLM enhances Gemini for education by training it to follow pedagogical instructions, leading to significant preference improvements over GPT-4o, Claude 3.5, and Gemini 1.5 Pro in diverse learning …
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
·5162 words·25 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 MIT
MoDE makes AI for robot control faster and more efficient.
Large Action Models: From Inception to Implementation
·2938 words·14 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Microsoft
From language models to action models: building AI that does things.
TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning
·1675 words·8 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Princeton University
TidyBot++: Low-cost, open-source holonomic mobile base makes robot learning easier.
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
·3880 words·19 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Westlake University
CARP: A novel visuomotor policy learning paradigm achieves high accuracy and 10x faster inference than state-of-the-art by combining autoregressive efficiency and diffusion model precision through a c…
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
·3555 words·17 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 University of Hong Kong
Moto: Bridging language for robot manipulation using latent motion tokens, achieving superior performance with limited data.
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection
·6193 words·30 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Peking University
Code-as-Monitor (CaM) uses vision-language models and constraint-aware visual programming to achieve both reactive and proactive robotic failure detection in real-time, improving success rates and red…
WildLMa: Long Horizon Loco-Manipulation in the Wild
·2396 words·12 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 UC San Diego
WildLMa enables robots to perform complex, long-horizon manipulation tasks in unstructured environments by combining language-conditioned imitation learning, a whole-body controller for efficient tele…