Posters

2024

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model
·1886 words·9 mins
Multimodal Learning Vision-Language Models 🏢 Peking University
MaVEn: A novel multi-granularity hybrid visual encoding framework significantly boosts MLLM’s multi-image reasoning capabilities by combining discrete and continuous visual representations.
Matryoshka Query Transformer for Large Vision-Language Models
·1913 words·9 mins
Multimodal Learning Vision-Language Models 🏢 UC Los Angeles
Matryoshka Query Transformer (MQT) empowers large vision-language models with flexible visual token encoding, drastically reducing inference costs while maintaining high accuracy across multiple benchmarks.
MatrixNet: Learning over symmetry groups using learned group representations
·1841 words·9 mins
AI Theory Representation Learning 🏢 Northeastern University
MatrixNet learns efficient group representations for improved deep learning on symmetry groups, achieving higher sample efficiency and generalization than existing methods.
Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods
·1782 words·9 mins
AI Generated AI Theory Optimization 🏢 Institute of Science and Technology Austria
Optimal spectral methods achieve the fundamental limits of matrix denoising under doubly heteroscedastic noise.
MatFormer: Nested Transformer for Elastic Inference
·3341 words·16 mins
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
MatFormer: Train one universal model, extract hundreds of accurate submodels for elastic inference!
Matching the Statistical Query Lower Bound for $k$-Sparse Parity Problems with Sign Stochastic Gradient Descent
·2323 words·11 mins
AI Generated AI Theory Optimization 🏢 UC Los Angeles
Sign Stochastic Gradient Descent (SGD) achieves optimal sample complexity for solving k-sparse parity problems, matching Statistical Query lower bounds.
MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
·1960 words·10 mins
Computer Vision Image Segmentation 🏢 Zhejiang University
MaskFactory generates high-quality synthetic data for dichotomous image segmentation, improving model training efficiency and accuracy.
Masked Pre-training Enables Universal Zero-shot Denoiser
·4914 words·24 mins
Computer Vision Image Generation 🏢 University of Science and Technology of China
Masked Pre-training empowers a universal, fast zero-shot image denoiser!
Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages
·1697 words·8 mins
AI Generated Natural Language Processing AI Theory 🏢 University of Notre Dame
Masked hard-attention transformers, with strict masking, precisely capture star-free languages, matching the expressive power of linear temporal logic.
Marrying Causal Representation Learning with Dynamical Systems for Science
·3100 words·15 mins
AI Generated AI Theory Representation Learning 🏢 Institute of Science and Technology Austria
This study marries causal representation learning with dynamical systems to enable parameter identification in real-world scientific data, unlocking downstream causal analysis for various applications.
Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows
·2056 words·10 mins
Machine Learning Deep Learning 🏢 Lancaster University
Markovian Flow Matching accelerates probabilistic inference by combining local MCMC kernels with flow-informed transition kernels built from continuous normalizing flows, achieving state-of-the-art results efficiently.
Markov Equivalence and Consistency in Differentiable Structure Learning
·2350 words·12 mins
AI Theory Causality 🏢 Carnegie Mellon University
Researchers developed a new, differentiable score function for learning causal relationships from data that reliably recovers the simplest causal model, even with complex data.
Marginal Causal Flows for Validation and Inference
·1827 words·9 mins
AI Theory Causality 🏢 University of Oxford
Frugal Flows: Generate realistic causal benchmarks with exact marginal causal effects, enabling robust causal method validation.
Many-shot Jailbreaking
·5721 words·27 mins
AI Generated Natural Language Processing Large Language Models 🏢 Anthropic
Long-context attacks manipulate LLMs by feeding them hundreds of harmful in-context examples, exposing a critical vulnerability that grows with larger context windows.
ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation
·2484 words·12 mins
Computer Vision 3D Vision 🏢 Valeo.ai
ManiPose: a manifold-constrained multi-hypothesis model resolves the depth ambiguity in 3D human pose estimation, outperforming state-of-the-art models in pose consistency.
MAmmoTH2: Scaling Instructions from the Web
·2418 words·12 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
MAmmoTH2: harvesting 10 million instruction pairs from the web to enhance LLM reasoning.
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
·2671 words·13 mins
AI Generated Multimodal Learning Human-AI Interaction 🏢 Tsinghua University
MambaTalk: efficient holistic gesture synthesis that uses selective state space models to reduce computational complexity while improving gesture quality.
MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
·3150 words·15 mins
AI Generated Computer Vision Video Understanding 🏢 Harbin Institute of Technology (Shenzhen)
MambaSCI: an efficient Mamba-UNet that reconstructs high-quality quad-Bayer video from compressed snapshots, surpassing existing methods.
MambaLRP: Explaining Selective State Space Sequence Models
·3148 words·15 mins
AI Theory Interpretability 🏢 Google DeepMind
MambaLRP enhances explainability of Mamba sequence models by ensuring faithful relevance propagation, achieving state-of-the-art explanation performance, and uncovering model biases.
MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space
·2330 words·11 mins
Computer Vision Image Enhancement 🏢 Nanjing University of Science and Technology
MambaLLIE: a novel implicit Retinex-aware low-light enhancer with a global-then-local state space that significantly outperforms existing CNN- and Transformer-based methods.