🏢 Ant Group
Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models
·2992 words·15 mins
Natural Language Processing
Large Language Models
🏢 Ant Group
PaRO boosts LLM training speed by up to 266% through refined model state partitioning and optimized communication.
On provable privacy vulnerabilities of graph representations
·4156 words·20 mins
AI Theory
Privacy
🏢 Ant Group
Graph representation learning's structural vulnerabilities are proven and mitigated via noisy aggregation, revealing crucial privacy-utility trade-offs.
Identify Then Recommend: Towards Unsupervised Group Recommendation
·1520 words·8 mins
Machine Learning
Self-Supervised Learning
🏢 Ant Group
ITR, an unsupervised group recommendation model, achieves superior user and group recommendation accuracy by dynamically identifying user groups and employing self-supervised learning, eliminating the ne…
End-to-end Learnable Clustering for Intent Learning in Recommendation
·2462 words·12 mins
Machine Learning
Recommendation Systems
🏢 Ant Group
ELCRec, a novel intent learning model for recommendation, unites behavior representation learning with end-to-end learnable clustering, achieving superior performance and scalability.
DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation
·2107 words·10 mins
AI Generated
Machine Learning
Deep Learning
🏢 Ant Group
DeepITE, a novel variational graph autoencoder, efficiently estimates intervention targets from both labeled and unlabeled data, surpassing existing methods in recall and inference speed.
Collaborative Refining for Learning from Inaccurate Labels
·1752 words·9 mins
Machine Learning
Deep Learning
🏢 Ant Group
Collaborative Refining for Learning from Inaccurate Labels (CRL) refines data using annotator agreement, improving model accuracy with noisy labels.
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
·2216 words·11 mins
Multimodal Learning
Vision-Language Models
🏢 Ant Group
Chain-of-Sight accelerates multimodal LLM pre-training by ~73% using a multi-scale visual resampling technique and a novel post-pretrain token scaling strategy, achieving comparable or superior perfor…
A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
·2008 words·10 mins
Machine Learning
Deep Learning
🏢 Ant Group
LNGD, a layer-wise natural gradient optimizer, drastically cuts deep neural network training time without sacrificing accuracy.