MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
·2116 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
MUDDFormer boosts Transformer performance by dynamically generating connection weights, improving cross-layer information flow and surpassing models trained with significantly more compute.
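A minimal PyTorch sketch of the dynamic dense connection idea described above: a layer's input is a per-token weighted sum of all previous layers' outputs, with the mixing weights generated on the fly from the current hidden state. The names here (`DynamicDenseAggregator`, `weight_gen`) are illustrative assumptions, not the paper's implementation; MUDDFormer's "multiway" variant would apply a separate aggregator per stream (e.g., query, key, value, and residual).

```python
import torch
import torch.nn as nn

class DynamicDenseAggregator(nn.Module):
    """Illustrative sketch (not the paper's code): mixes the outputs of all
    previous layers with weights generated dynamically per token."""

    def __init__(self, d_model: int, num_prev: int):
        super().__init__()
        # Maps the current hidden state to one mixing weight per previous layer.
        self.weight_gen = nn.Linear(d_model, num_prev)

    def forward(self, hidden: torch.Tensor, prev_outputs: list[torch.Tensor]) -> torch.Tensor:
        # hidden: (batch, seq, d_model); prev_outputs: num_prev tensors of that shape.
        weights = self.weight_gen(hidden)              # (batch, seq, num_prev)
        stacked = torch.stack(prev_outputs, dim=-1)    # (batch, seq, d_model, num_prev)
        # Per-token weighted sum over the previous layers' outputs.
        return torch.einsum("bsn,bsdn->bsd", weights, stacked)

# Single-stream usage; one aggregator per stream would give the multiway behaviour.
agg = DynamicDenseAggregator(d_model=64, num_prev=3)
outs = [torch.randn(2, 5, 64) for _ in range(3)]   # outputs of 3 earlier layers
mixed = agg(outs[-1], outs)                        # (2, 5, 64) input for the next layer
```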
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models
·5510 words·26 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Beijing University of Posts and Telecommunications
A new benchmark evaluates how well large multimodal models understand and meet real-world, personalized human needs.
Smaller Language Models Are Better Instruction Evolvers
·5507 words·26 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
Smaller is better: SLMs outperform LLMs at evolving complex and diverse instructions for AI training.
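A minimal sketch of one instruction-evolution loop in the Evol-Instruct style the paper builds on. The `generate` callable and the prompt template are hypothetical placeholders, not the paper's code or API; the point is that each round feeds the previous variant back through the (small) language model.

```python
# Hypothetical prompt template for one evolution step; wording is illustrative.
EVOLVE_TEMPLATE = (
    "Rewrite the instruction below to be more complex and more diverse in "
    "form, while keeping it clear and answerable.\n\nInstruction: {instruction}"
)

def evolve_instruction(generate, instruction: str, rounds: int = 3) -> list[str]:
    """Iteratively evolve an instruction: each round rewrites the latest
    variant via `generate(prompt) -> str`, a user-supplied model wrapper."""
    variants = [instruction]
    for _ in range(rounds):
        prompt = EVOLVE_TEMPLATE.format(instruction=variants[-1])
        variants.append(generate(prompt))
    return variants

# Example usage with any SLM wrapped as `my_slm_generate(prompt) -> str`:
# variants = evolve_instruction(my_slm_generate, "Summarize this article.")
```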