🏢 Zhejiang University

On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
·308 words·2 mins
AI Theory Optimization 🏢 Zhejiang University
The Adam optimizer achieves near-optimal convergence in non-convex settings with unbounded gradients and relaxed noise assumptions, strengthening the theoretical basis for its practical use.
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
·2462 words·12 mins
Multimodal Learning Multimodal Generation 🏢 Zhejiang University
MoMu-Diffusion: a novel framework that learns long-term motion-music synchronization, generating realistic and beat-matched sequences surpassing existing methods.
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
·2425 words·12 mins
Machine Learning Deep Learning 🏢 Zhejiang University
Model LEGO (MDA) creates new deep learning models by disassembling pre-trained models into task-aware components and reassembling them, eliminating the need for retraining.
MKGL: Mastery of a Three-Word Language
·2110 words·10 mins
Large Language Models 🏢 Zhejiang University
Researchers taught a large language model (LLM) a three-word ‘Knowledge Graph Language’ (KGL) to improve knowledge graph (KG) completion, drastically reducing errors compared to other methods.
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
·1847 words·9 mins
Computer Vision Image Generation 🏢 Zhejiang University
MimicTalk generates realistic, expressive talking videos in minutes using a pre-trained model adapted to individual identities.
MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
·1960 words·10 mins
Computer Vision Image Segmentation 🏢 Zhejiang University
MaskFactory generates high-quality synthetic data for dichotomous image segmentation, improving model training efficiency and accuracy.
Locating What You Need: Towards Adapting Diffusion Models to OOD Concepts In-the-Wild
·3829 words·18 mins
AI Generated Computer Vision Image Generation 🏢 Zhejiang University
CATOD framework improves text-to-image generation by actively learning high-quality training data to accurately depict out-of-distribution concepts.
LG-CAV: Train Any Concept Activation Vector with Language Guidance
·3860 words·19 mins
AI Generated Computer Vision Vision-Language Models 🏢 Zhejiang University
LG-CAV leverages vision-language models to train Concept Activation Vectors (CAVs) without labeled data, achieving superior accuracy and enabling state-of-the-art model…
Learning-Augmented Algorithms for the Bahncard Problem
·3280 words·16 mins
AI Theory Optimization 🏢 Zhejiang University
PFSUM, a novel learning-augmented algorithm, leverages short-term predictions to achieve superior performance in solving the Bahncard problem, outperforming existing methods with improved consistency …
Learning Complete Protein Representation by Dynamically Coupling of Sequence and Structure
·2792 words·14 mins
AI Generated Natural Language Processing Representation Learning 🏢 Zhejiang University
CoupleNet dynamically links protein sequences and structures for improved representations, surpassing state-of-the-art methods in function prediction, particularly for uncommon proteins.
Knowledge Circuits in Pretrained Transformers
·3083 words·15 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
Researchers unveil ‘knowledge circuits’ within LLMs, revealing how knowledge is collaboratively encoded and utilized, leading to improved LLM design and interpretations of model behavior.
Information Re-Organization Improves Reasoning in Large Language Models
·2018 words·10 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
InfoRE: A novel method improving large language models’ reasoning by reorganizing information to highlight logical relationships, resulting in a 4% average accuracy boost across various tasks.
Improved Regret for Bandit Convex Optimization with Delayed Feedback
·324 words·2 mins
AI Theory Optimization 🏢 Zhejiang University
A novel algorithm, D-FTBL, achieves improved regret bounds for bandit convex optimization with delayed feedback, tightly matching existing lower bounds in worst-case scenarios.
Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction
·2828 words·14 mins
AI Generated Computer Vision 3D Vision 🏢 Zhejiang University
DiMoP3D: Predicting diverse, physically realistic human motions in 3D scenes by harmonizing stochasticity and determinism.
Graph Diffusion Policy Optimization
·2821 words·14 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Zhejiang University
GDPO: a novel method that uses reinforcement learning to optimize graph diffusion models for arbitrary objectives, achieving state-of-the-art performance in diverse graph generation tasks.
Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching
·2541 words·12 mins
AI Generated Multimodal Learning Audio-Visual Learning 🏢 Zhejiang University
FRIEREN: a novel video-to-audio generation network using rectified flow matching achieves state-of-the-art performance by improving audio quality, temporal alignment, and generation efficiency.
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
·1815 words·9 mins
Computer Vision Video Understanding 🏢 Zhejiang University
FreeLong: Generate high-fidelity long videos without retraining using spectral blending of global and local video features!
FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection
·3370 words·16 mins
Machine Learning Federated Learning 🏢 Zhejiang University
FOOGD: A novel federated learning framework that simultaneously tackles out-of-distribution generalization and detection by estimating probability density for reliable global distribution guidance.
FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models
·2387 words·12 mins
Computer Vision Image Generation 🏢 Zhejiang University
FashionR2R uses diffusion models to translate rendered fashion images into photorealistic counterparts while preserving fine-grained clothing textures.
Extracting Training Data from Molecular Pre-trained Models
·2322 words·11 mins
AI Generated AI Theory Privacy 🏢 Zhejiang University
Researchers reveal a high risk of training data extraction from molecular pre-trained models, challenging the assumption that model sharing alone adequately protects against data theft.