🏢 UC Berkeley
FAST: Efficient Action Tokenization for Vision-Language-Action Models
·4290 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
FAST: A novel action tokenization method using discrete cosine transform drastically improves autoregressive vision-language-action models’ training and performance, enabling dexterous and high-freque…
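The blurb above says FAST tokenizes robot actions with a discrete cosine transform. As a rough illustration of that general idea only (compress a smooth action chunk with a DCT, quantize the low-frequency coefficients into integer tokens), here is a minimal sketch; the function names, truncation length, and quantization scale are hypothetical and do not reflect the paper's actual implementation:

```python
import numpy as np
from scipy.fft import dct, idct

def tokenize_actions(chunk, keep=8, scale=64):
    """Compress a (horizon, dims) action chunk into integer tokens.

    Hypothetical sketch: DCT along the time axis concentrates smooth
    motion into a few low-frequency coefficients, which we quantize.
    """
    coeffs = dct(chunk, axis=0, norm="ortho")
    kept = coeffs[:keep]                      # drop high frequencies
    return np.round(kept * scale).astype(np.int32)

def detokenize_actions(tokens, horizon, scale=64):
    """Invert the sketch above: dequantize, zero-pad, inverse DCT."""
    coeffs = np.zeros((horizon, tokens.shape[1]))
    coeffs[: tokens.shape[0]] = tokens / scale
    return idct(coeffs, axis=0, norm="ortho")

# A smooth 50-step, 2-DoF action chunk round-trips with small error.
t = np.linspace(0, 1, 50)
chunk = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
tokens = tokenize_actions(chunk)              # shape (8, 2), far fewer values
recon = detokenize_actions(tokens, horizon=50)
err = np.abs(chunk - recon).max()
```

The payoff the summary alludes to is the compression: an autoregressive model predicts 8 tokens per dimension instead of 50 raw action steps, which is what makes high-frequency control tractable for token-by-token decoding.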
An Empirical Study of Autoregressive Pre-training from Videos
·5733 words·27 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 UC Berkeley
Toto, a new autoregressive video model, achieves competitive performance across various benchmarks by pre-training on over 1 trillion visual tokens, demonstrating the effectiveness of scaling video mo…
Training Software Engineering Agents and Verifiers with SWE-Gym
·3604 words·17 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
SWE-Gym, a novel environment for training real-world software engineering agents using 2,438 real-world Python task instances, achieves new state-of-the-art performance and is publicly available.
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment
·2984 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Robotics
🏢 UC Berkeley
RAPL efficiently aligns robots with human preferences using minimal feedback by aligning visual representations before reward learning.
Predicting Emergent Capabilities by Finetuning
·6002 words·29 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 UC Berkeley
Predicting emergent LLM capabilities is now possible by finetuning smaller models; this approach shifts the emergence point, enabling accurate predictions of future model performance, even with up to …