Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models
·2992 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Ant Group
PaRO boosts LLM training speed by up to 266% through refined model state partitioning and optimized communication.
On provable privacy vulnerabilities of graph representations
·4156 words·20 mins·
AI Theory
Privacy
🏢 Ant Group
Graph representation learning’s structural vulnerabilities are proven and mitigated via noisy aggregation, revealing crucial privacy-utility trade-offs.
Identify Then Recommend: Towards Unsupervised Group Recommendation
·1520 words·8 mins·
Machine Learning
Self-Supervised Learning
🏢 Ant Group
ITR, an unsupervised group recommendation model, achieves superior user and group recommendation accuracy by dynamically identifying user groups and employing self-supervised learning, eliminating the ne…
End-to-end Learnable Clustering for Intent Learning in Recommendation
·2462 words·12 mins·
Machine Learning
Recommendation Systems
🏢 Ant Group
ELCRec, a novel intent learning model for recommendation, unites behavior representation learning with end-to-end learnable clustering, achieving superior performance and scalability.
DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation
·2107 words·10 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Ant Group
DeepITE, a novel variational graph autoencoder, efficiently estimates intervention targets from both labeled and unlabeled data, surpassing existing methods in recall and inference speed.
Collaborative Refining for Learning from Inaccurate Labels
·1752 words·9 mins·
Machine Learning
Deep Learning
🏢 Ant Group
Collaborative Refining for Learning from Inaccurate Labels (CRL) refines data using annotator agreement, improving model accuracy with noisy labels.
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
·2216 words·11 mins·
Multimodal Learning
Vision-Language Models
🏢 Ant Group
Chain-of-Sight accelerates multimodal LLM pre-training by ~73% using a multi-scale visual resampling technique and a novel post-pretrain token scaling strategy, achieving comparable or superior perfor…
A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
·2008 words·10 mins·
Machine Learning
Deep Learning
🏢 Ant Group
LNGD, a layer-wise natural gradient optimizer, drastically cuts deep neural network training time without sacrificing accuracy.