MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
·2116 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
MUDDFormer boosts Transformer performance by dynamically generating connection weights, improving cross-layer information flow and surpassing models trained with significantly more compute.
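A minimal PyTorch sketch of the dynamic dense connection idea described above: a layer's input is a per-token weighted sum of all previous layers' outputs, with the mixing weights generated on the fly from the current hidden state. The names here (`DynamicDenseAggregator`, `weight_gen`) are illustrative assumptions, not the paper's implementation; MUDDFormer's "multiway" variant would apply a separate aggregator per stream (e.g., query, key, value, and residual).

```python
import torch
import torch.nn as nn

class DynamicDenseAggregator(nn.Module):
    """Illustrative sketch (not the paper's code): mixes the outputs of all
    previous layers with weights generated dynamically per token."""

    def __init__(self, d_model: int, num_prev: int):
        super().__init__()
        # Maps the current hidden state to one mixing weight per previous layer.
        self.weight_gen = nn.Linear(d_model, num_prev)

    def forward(self, hidden: torch.Tensor, prev_outputs: list[torch.Tensor]) -> torch.Tensor:
        # hidden: (batch, seq, d_model); prev_outputs: num_prev tensors of that shape.
        weights = self.weight_gen(hidden)              # (batch, seq, num_prev)
        stacked = torch.stack(prev_outputs, dim=-1)    # (batch, seq, d_model, num_prev)
        # Per-token weighted sum over the previous layers' outputs.
        return torch.einsum("bsn,bsdn->bsd", weights, stacked)

# Single-stream usage; one aggregator per stream would give the multiway behaviour.
agg = DynamicDenseAggregator(d_model=64, num_prev=3)
outs = [torch.randn(2, 5, 64) for _ in range(3)]   # outputs of 3 earlier layers
mixed = agg(outs[-1], outs)                        # (2, 5, 64) input for the next layer
```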
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models
·5510 words·26 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Beijing University of Posts and Telecommunications
A new benchmark evaluates how well large multimodal models understand and meet real-world, personalized human needs.
Smaller Language Models Are Better Instruction Evolvers
·5507 words·26 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
Smaller is better: SLMs outperform LLMs at evolving complex and diverse instructions for AI training.
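A minimal sketch of one instruction-evolution loop in the Evol-Instruct style the paper builds on. The `generate` callable and the prompt template are hypothetical placeholders, not the paper's code or API; the point is that each round feeds the previous variant back through the (small) language model.

```python
# Hypothetical prompt template for one evolution step; wording is illustrative.
EVOLVE_TEMPLATE = (
    "Rewrite the instruction below to be more complex and more diverse in "
    "form, while keeping it clear and answerable.\n\nInstruction: {instruction}"
)

def evolve_instruction(generate, instruction: str, rounds: int = 3) -> list[str]:
    """Iteratively evolve an instruction: each round rewrites the latest
    variant via `generate(prompt) -> str`, a user-supplied model wrapper."""
    variants = [instruction]
    for _ in range(rounds):
        prompt = EVOLVE_TEMPLATE.format(instruction=variants[-1])
        variants.append(generate(prompt))
    return variants

# Example usage with any SLM wrapped as `my_slm_generate(prompt) -> str`:
# variants = evolve_instruction(my_slm_generate, "Summarize this article.")
```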