
🏢 UC Los Angeles

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

5 February 2025·2337 words·11 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 UC Los Angeles

This paper introduces PointVid, a 3D-aware video generation framework using 3D point regularization to enhance video realism and address common issues like object morphing.

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

4 February 2025·2983 words·15 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Los Angeles

QLASS boosts language agent inference by using Q-values to guide a stepwise search, improving efficiency and performance even with limited data.

OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

2 December 2024·5107 words·24 mins

AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Generation 🏢 UC Los Angeles

OmniFlow is a generative model for any-to-any multi-modal generation, built on multi-modal rectified flows; it is reported to outperform existing models while offering flexible control.