🏢 Northwestern Polytechnical University

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
·3716 words·18 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Northwestern Polytechnical University
FiCoCo, a unified paradigm for training-free token reduction, accelerates Multimodal Large Language Model (MLLM) inference by up to 82.4% with minimal performance loss, surpassing state-of-the-art training-free methods.
Material Anything: Generating Materials for Any 3D Object via Diffusion
·4056 words·20 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Northwestern Polytechnical University
Material Anything: Generate realistic materials for ANY 3D object via diffusion!