🏢 Zhejiang University
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
·308 words·2 mins·
AI Theory
Optimization
🏢 Zhejiang University
The Adam optimizer achieves near-optimal convergence in non-convex settings with unbounded gradients and relaxed noise assumptions, strengthening both its theoretical understanding and its practical application.
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
·2462 words·12 mins·
Multimodal Learning
Multimodal Generation
🏢 Zhejiang University
MoMu-Diffusion: a novel framework that learns long-term motion-music synchronization, generating realistic and beat-matched sequences surpassing existing methods.
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
·2425 words·12 mins·
Machine Learning
Deep Learning
🏢 Zhejiang University
Model LEGO (MDA) creates new models by disassembling pre-trained models into task-aware components and assembling those components, eliminating the need for retraining.
MKGL: Mastery of a Three-Word Language
·2110 words·10 mins·
Large Language Models
🏢 Zhejiang University
Researchers taught a large language model (LLM) a three-word ‘Knowledge Graph Language’ (KGL) to improve knowledge graph (KG) completion, drastically reducing errors compared to other methods.
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
·1847 words·9 mins·
Computer Vision
Image Generation
🏢 Zhejiang University
MimicTalk generates realistic, expressive talking videos in minutes using a pre-trained model adapted to individual identities.
MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
·1960 words·10 mins·
Computer Vision
Image Segmentation
🏢 Zhejiang University
MaskFactory generates high-quality synthetic data for dichotomous image segmentation, improving model training efficiency and accuracy.
Locating What You Need: Towards Adapting Diffusion Models to OOD Concepts In-the-Wild
·3829 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 Zhejiang University
CATOD framework improves text-to-image generation by actively learning high-quality training data to accurately depict out-of-distribution concepts.
LG-CAV: Train Any Concept Activation Vector with Language Guidance
·3860 words·19 mins·
AI Generated
Computer Vision
Vision-Language Models
🏢 Zhejiang University
LG-CAV (Train Any Concept Activation Vector with Language Guidance) leverages vision-language models to train CAVs without labeled data, achieving superior accuracy and enabling state-of-the-art model…
Learning-Augmented Algorithms for the Bahncard Problem
·3280 words·16 mins·
AI Theory
Optimization
🏢 Zhejiang University
PFSUM, a novel learning-augmented algorithm, leverages short-term predictions to achieve superior performance in solving the Bahncard problem, outperforming existing methods with improved consistency …
Learning Complete Protein Representation by Dynamically Coupling of Sequence and Structure
·2792 words·14 mins·
AI Generated
Natural Language Processing
Representation Learning
🏢 Zhejiang University
CoupleNet dynamically links protein sequences and structures for improved representations, surpassing state-of-the-art methods in function prediction, particularly for uncommon proteins.
Knowledge Circuits in Pretrained Transformers
·3083 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Zhejiang University
Researchers unveil ‘knowledge circuits’ within LLMs, revealing how knowledge is collaboratively encoded and utilized, leading to improved LLM design and interpretations of model behavior.
Information Re-Organization Improves Reasoning in Large Language Models
·2018 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Zhejiang University
InfoRE: A novel method improving large language models’ reasoning by reorganizing information to highlight logical relationships, resulting in a 4% average accuracy boost across various tasks.
Improved Regret for Bandit Convex Optimization with Delayed Feedback
·324 words·2 mins·
AI Theory
Optimization
🏢 Zhejiang University
A novel algorithm, D-FTBL, achieves improved regret bounds for bandit convex optimization with delayed feedback, tightly matching existing lower bounds in worst-case scenarios.
Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction
·2828 words·14 mins·
AI Generated
Computer Vision
3D Vision
🏢 Zhejiang University
DiMoP3D: Predicting diverse, physically realistic human motions in 3D scenes by harmonizing stochasticity and determinism.
Graph Diffusion Policy Optimization
·2821 words·14 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Zhejiang University
GDPO: A novel method that optimizes graph diffusion models for any objective using reinforcement learning, achieving state-of-the-art performance in diverse graph generation tasks.
Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching
·2541 words·12 mins·
AI Generated
Multimodal Learning
Audio-Visual Learning
🏢 Zhejiang University
Frieren: a novel video-to-audio generation network that uses rectified flow matching to achieve state-of-the-art performance, improving audio quality, temporal alignment, and generation efficiency.
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
·1815 words·9 mins·
Computer Vision
Video Understanding
🏢 Zhejiang University
FreeLong: Generate high-fidelity long videos without retraining using spectral blending of global and local video features!
FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection
·3370 words·16 mins·
Machine Learning
Federated Learning
🏢 Zhejiang University
FOOGD: A novel federated learning framework that simultaneously tackles out-of-distribution generalization and detection by estimating probability density for reliable global distribution guidance.
FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models
·2387 words·12 mins·
Computer Vision
Image Generation
🏢 Zhejiang University
FashionR2R leverages diffusion models to realistically translate rendered fashion images into photorealistic counterparts, enhancing realism and preserving fine-grained clothing textures.
Extracting Training Data from Molecular Pre-trained Models
·2322 words·11 mins·
AI Generated
AI Theory
Privacy
🏢 Zhejiang University
Researchers reveal a high risk of training data extraction from molecular pre-trained models, challenging the assumption that model sharing alone adequately protects against data theft.