🏢 UC Los Angeles

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
·2337 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 UC Los Angeles
This paper introduces PointVid, a 3D-aware video generation framework that uses 3D point regularization to improve video realism and reduce common artifacts such as object morphing.
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
·2983 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Los Angeles
QLASS boosts language agent inference by using Q-values to guide a stepwise search, improving efficiency and performance even with limited data.
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
·5107 words·24 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Generation 🏢 UC Los Angeles
OmniFlow is a generative model for any-to-any multi-modal generation that outperforms existing models and offers flexible control.