🏢 NVIDIA

FFN Fusion: Rethinking Sequential Computation in Large Language Models
·3776 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA
FFN Fusion: Parallelizing sequential computation in large language models for significant speedups!
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
·4257 words·20 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA
Cosmos-Transfer1: An adaptable conditional world generation model using multimodal control.
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
·4040 words·19 mins
AI Generated 🤗 Daily Papers Multimodal Learning Embodied AI 🏢 NVIDIA
Cosmos-Reason1: Physical AI models that reason and act in the real world, bridging the gap between perception and embodied decision-making.
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
·3137 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA
SANA-Sprint: An efficient diffusion model for ultra-fast text-to-image generation with continuous-time consistency distillation, achieving state-of-the-art performance in speed and quality.
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models
·2982 words·14 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 NVIDIA
Difix3D+ improves 3D reconstructions by reducing artifacts via single-step diffusion models, enhancing novel-view synthesis quality and consistency.
One-step Diffusion Models with $f$-Divergence Distribution Matching
·6126 words·29 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 NVIDIA
f-distill: One-step diffusion models through f-divergence minimization, outperforming reverse-KL with better mode coverage and lower variance.
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models
·6984 words·33 mins
AI Generated 🤗 Daily Papers AI Applications Autonomous Vehicles 🏢 NVIDIA
V2V-LLM leverages multi-modal LLMs for safer cooperative autonomous driving by fusing perception data from multiple vehicles, answering driving-related questions, and improving trajectory planning.
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
·4300 words·21 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 NVIDIA
VLsI: Verbalized Layers-to-Interactions efficiently transfers knowledge from large to small VLMs using layer-wise natural language distillation, achieving significant performance gains without scaling…
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
·4724 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA
Puzzle: a novel framework accelerates large language model inference by using neural architecture search and knowledge distillation, achieving a 2.17x speedup on a single GPU while preserving 98.4% ac…
Star Attention: Efficient LLM Inference over Long Sequences
·5535 words·26 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA
Star Attention: up to 11x faster LLM inference on long sequences while preserving 95-100% of accuracy!
Hymba: A Hybrid-head Architecture for Small Language Models
·4219 words·20 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA
Hymba: A hybrid-head architecture that improves small language models, with an 11.67x cache size reduction and 3.49x throughput, surpassing existing models.