Posters

2024

Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
·2129 words·10 mins
Speech and Audio Speaker Recognition 🏒 Telecom Paris
Annealed Multiple Choice Learning (aMCL) overcomes limitations of Winner-takes-all in multiple choice learning by using annealing, improving robustness and performance.
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
·2384 words·12 mins
Computer Vision 3D Vision 🏒 DAMO Academy, Alibaba Group
Animate3D animates any 3D model using multi-view video diffusion, achieving superior spatiotemporal consistency and straightforward mesh animation.
Animal-Bench: Benchmarking Multimodal Video Models for Animal-centric Video Understanding
·2713 words·13 mins
Multimodal Learning Multimodal Understanding 🏒 Beijing University of Posts and Telecommunications
Animal-Bench, a new benchmark, comprehensively evaluates multimodal video models for animal-centric video understanding, featuring 13 diverse tasks across 7 animal categories and 819 species.
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
·2347 words·12 mins
Natural Language Processing Large Language Models 🏒 EPFL, Switzerland
This study reveals that modifying optimizers to normalize updates based on angular changes and gradient signal-to-noise ratio significantly reduces the need for learning rate warmup in GPT training.
Analytically deriving Partial Information Decomposition for affine systems of stable and convolution-closed distributions
·1956 words·10 mins
AI Generated AI Theory Causality 🏒 Carnegie Mellon University
This paper presents novel theoretical results enabling the analytical calculation of Partial Information Decomposition for various probability distributions, including those relevant to neuroscience, …
Analysis of Corrected Graph Convolutions
·1907 words·9 mins
AI Generated Machine Learning Semi-Supervised Learning 🏒 Cheriton School of Computer Science, University of Waterloo
Corrected graph convolutions prevent oversmoothing and exponentially improve GNN classification accuracy.
Analysing the Generalisation and Reliability of Steering Vectors
·2935 words·14 mins
Natural Language Processing Large Language Models 🏒 Department of Computer Science, University College London
Steering vectors, while promising for controlling LLMs, exhibit unreliable in-distribution and out-of-distribution performance, highlighting crucial limitations for real-world applications.
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
·2583 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏒 Hong Kong University of Science and Technology
ANAH-v2 tackles LLM hallucination by introducing a self-training framework that iteratively scales annotation datasets and improves annotator accuracy, achieving state-of-the-art results.
An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning
·2513 words·12 mins
Machine Learning Reinforcement Learning 🏒 Sun Yat-Sen University
This work introduces PDOA, an offline adaptation framework for constrained multi-objective RL, using demonstrations instead of manually designed preferences to infer optimal policies while satisfying …
An Information Theoretic Perspective on Conformal Prediction
·3559 words·17 mins
AI Generated Machine Learning Federated Learning 🏒 Qualcomm AI Research
This paper applies information theory to conformal prediction, deriving new bounds on uncertainty and yielding better training methods and ways to incorporate side information.
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
·2521 words·12 mins
AI Theory Representation Learning 🏒 School of Computing and Data Science, University of Hong Kong
Sparse Rate Reduction (SRR) improves the interpretability of deep learning models, yielding better generalization and offering principled model design.
An Improved Empirical Fisher Approximation for Natural Gradient Descent
·2632 words·13 mins
Machine Learning Optimization 🏒 University of Cambridge
Improved Empirical Fisher (iEF) approximation significantly boosts the performance of Natural Gradient Descent (NGD) optimizers, offering superior convergence and generalization.
An Image is Worth 32 Tokens for Reconstruction and Generation
·2076 words·10 mins
Computer Vision Image Generation 🏒 ByteDance
Image generation gets a speed boost with TiTok, a novel 1D image tokenizer that uses just 32 tokens for high-quality image reconstruction and generation, achieving up to 410x faster processing than st…
An eye for an ear: zero-shot audio description leveraging an image captioner with audio-visual token distribution matching
·3208 words·16 mins
AI Generated Multimodal Learning Vision-Language Models 🏒 LTCI, Télécom Paris, Institut Polytechnique De Paris
Leveraging vision-language models, this research introduces a novel unsupervised zero-shot audio captioning method that achieves state-of-the-art performance by aligning audio and image token distribu…
An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations
·3657 words·18 mins
AI Generated Computer Vision Image Generation 🏒 Peking University
EMDiffusion trains clean diffusion models from corrupted data using an expectation-maximization algorithm, achieving state-of-the-art results on diverse imaging tasks.
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
·2679 words·13 mins
Machine Learning Deep Learning 🏒 University of Oxford
A novel multilinear model analytically explains the emergence and scaling laws of skills in the multitask sparse parity problem, accurately predicting skill emergence in neural networks.
An Equivalence Between Static and Dynamic Regret Minimization
·321 words·2 mins
AI Generated AI Theory Optimization 🏒 Università Degli Studi Di Milano
Dynamic regret minimization equals static regret in an extended space; this equivalence reveals a trade-off between loss variance and comparator variability, leading to a new algorithm achieving impro…
An engine not a camera: Measuring performative power of online search
·2609 words·13 mins
AI Generated AI Theory Causality 🏒 Max Planck Institute for Intelligent Systems
New research quantifies how search engines steer web traffic by subtly changing results, offering a powerful method for antitrust investigations and digital market analysis.
An End-To-End Graph Attention Network Hashing for Cross-Modal Retrieval
·1722 words·9 mins
Multimodal Learning Cross-Modal Retrieval 🏒 Hebei Normal University
EGATH, an End-to-End Graph Attention Network Hashing method, advances cross-modal retrieval by combining CLIP, transformers, and graph attention networks for superior semantic understanding and hash code g…
An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
·2754 words·13 mins
Natural Language Processing Large Language Models 🏒 State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China
CREAM extends LLM context via a simple, training-efficient positional encoding method, outperforming existing approaches by focusing on crucial mid-context information.