🏢 Ant Group
Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models
·2992 words·15 mins
Natural Language Processing
Large Language Models
🏢 Ant Group
PaRO boosts LLM training speed by up to 266% through refined model state partitioning and optimized communication.
On provable privacy vulnerabilities of graph representations
·4156 words·20 mins
AI Theory
Privacy
🏢 Ant Group
Graph representation learning's structural vulnerabilities are proven and mitigated via noisy aggregation, revealing crucial privacy-utility trade-offs.
Identify Then Recommend: Towards Unsupervised Group Recommendation
·1520 words·8 mins
Machine Learning
Self-Supervised Learning
🏢 Ant Group
ITR, an unsupervised group recommendation model, achieves superior user and group recommendation accuracy by dynamically identifying user groups and employing self-supervised learning, eliminating the ne…
End-to-end Learnable Clustering for Intent Learning in Recommendation
·2462 words·12 mins
Machine Learning
Recommendation Systems
🏢 Ant Group
ELCRec, a novel intent learning model for recommendation, unites behavior representation learning with end-to-end learnable clustering, achieving superior performance and scalability.
DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation
·2107 words·10 mins
AI Generated
Machine Learning
Deep Learning
🏢 Ant Group
DeepITE, a novel variational graph autoencoder, efficiently estimates intervention targets from both labeled and unlabeled data, surpassing existing methods in recall and inference speed.
Collaborative Refining for Learning from Inaccurate Labels
·1752 words·9 mins
Machine Learning
Deep Learning
🏢 Ant Group
Collaborative Refining for Learning from Inaccurate Labels (CRL) refines data using annotator agreement, improving model accuracy with noisy labels.
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
·2216 words·11 mins
Multimodal Learning
Vision-Language Models
🏢 Ant Group
Chain-of-Sight accelerates multimodal LLM pre-training by ~73% using a multi-scale visual resampling technique and a novel post-pretrain token scaling strategy, achieving comparable or superior perfor…
A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
·2008 words·10 mins
Machine Learning
Deep Learning
🏢 Ant Group
LNGD, a layer-wise natural gradient optimizer, drastically cuts deep neural network training time without sacrificing accuracy.