Posters

2024

Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
·2129 words·10 mins
Speech and Audio Speaker Recognition 🏒 Telecom Paris
Annealed Multiple Choice Learning (aMCL) overcomes limitations of Winner-takes-all in multiple choice learning by using annealing, improving robustness and performance.
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
·2384 words·12 mins
Computer Vision 3D Vision 🏒 DAMO Academy, Alibaba Group
Animate3D animates any 3D model using multi-view video diffusion, achieving superior spatiotemporal consistency and straightforward mesh animation.
Animal-Bench: Benchmarking Multimodal Video Models for Animal-centric Video Understanding
·2713 words·13 mins
Multimodal Learning Multimodal Understanding 🏒 Beijing University of Posts and Telecommunications
Animal-Bench, a new benchmark, comprehensively evaluates multimodal video models for animal-centric video understanding, featuring 13 diverse tasks across 7 animal categories and 819 species.
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
·2347 words·12 mins
Natural Language Processing Large Language Models 🏒 EPFL, Switzerland
This study reveals that modifying optimizers to normalize updates based on angular changes and gradient signal-to-noise ratio significantly reduces the need for learning rate warmup in GPT training.
Analytically deriving Partial Information Decomposition for affine systems of stable and convolution-closed distributions
·1956 words·10 mins
AI Generated AI Theory Causality 🏒 Carnegie Mellon University
This paper presents novel theoretical results enabling the analytical calculation of Partial Information Decomposition for various probability distributions, including those relevant to neuroscience, …
Analysis of Corrected Graph Convolutions
·1907 words·9 mins
AI Generated Machine Learning Semi-Supervised Learning 🏒 Cheriton School of Computer Science, University of Waterloo
Corrected graph convolutions prevent oversmoothing and exponentially improve GNN classification accuracy.
Analysing the Generalisation and Reliability of Steering Vectors
·2935 words·14 mins
Natural Language Processing Large Language Models 🏒 Department of Computer Science, University College London
Steering vectors, while promising for controlling LLMs, exhibit unreliable in-distribution and out-of-distribution performance, highlighting crucial limitations for real-world applications.
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
·2583 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏒 Hong Kong University of Science and Technology
ANAH-v2 tackles LLM hallucination by introducing a self-training framework that iteratively scales annotation datasets and improves annotator accuracy, achieving state-of-the-art results.
An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning
·2513 words·12 mins
Machine Learning Reinforcement Learning 🏒 Sun Yat-Sen University
This work introduces PDOA, an offline adaptation framework for constrained multi-objective RL, using demonstrations instead of manually designed preferences to infer optimal policies while satisfying …
An Information Theoretic Perspective on Conformal Prediction
·3559 words·17 mins
AI Generated Machine Learning Federated Learning 🏒 Qualcomm AI Research
This paper applies information theory to conformal prediction, deriving new bounds on uncertainty and yielding better training methods and ways to incorporate side information.
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
·2521 words·12 mins
AI Theory Representation Learning 🏒 School of Computing and Data Science, University of Hong Kong
Sparse Rate Reduction (SRR) improves the interpretability of deep learning models, yielding better generalization and offering principled model design.
An Improved Empirical Fisher Approximation for Natural Gradient Descent
·2632 words·13 mins
Machine Learning Optimization 🏒 University of Cambridge
Improved Empirical Fisher (iEF) approximation significantly boosts the performance of Natural Gradient Descent (NGD) optimizers, offering superior convergence and generalization.
An Image is Worth 32 Tokens for Reconstruction and Generation
·2076 words·10 mins
Computer Vision Image Generation 🏒 ByteDance
Image generation gets a speed boost with TiTok, a novel 1D image tokenizer that uses just 32 tokens for high-quality image reconstruction and generation, achieving up to 410x faster processing than st…
An eye for an ear: zero-shot audio description leveraging an image captioner with audio-visual token distribution matching
·3208 words·16 mins
AI Generated Multimodal Learning Vision-Language Models 🏒 LTCI, Télécom Paris, Institut Polytechnique De Paris
Leveraging vision-language models, this research introduces a novel unsupervised zero-shot audio captioning method that achieves state-of-the-art performance by aligning audio and image token distribu…
An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations
·3657 words·18 mins
AI Generated Computer Vision Image Generation 🏒 Peking University
EMDiffusion trains clean diffusion models from corrupted data using an expectation-maximization algorithm, achieving state-of-the-art results on diverse imaging tasks.
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
·2679 words·13 mins
Machine Learning Deep Learning 🏒 University of Oxford
A novel multilinear model analytically explains the emergence and scaling laws of skills in the multitask sparse parity problem, accurately predicting skill emergence in neural networks.
An Equivalence Between Static and Dynamic Regret Minimization
·321 words·2 mins
AI Generated AI Theory Optimization 🏒 Università Degli Studi Di Milano
Dynamic regret minimization equals static regret in an extended space; this equivalence reveals a trade-off between loss variance and comparator variability, leading to a new algorithm achieving impro…
An engine not a camera: Measuring performative power of online search
·2609 words·13 mins
AI Generated AI Theory Causality 🏒 Max Planck Institute for Intelligent Systems
New research quantifies how search engines steer web traffic by subtly changing results, offering a powerful method for antitrust investigations and digital market analysis.
An End-To-End Graph Attention Network Hashing for Cross-Modal Retrieval
·1722 words·9 mins
Multimodal Learning Cross-Modal Retrieval 🏒 Hebei Normal University
EGATH, an End-to-End Graph Attention Network Hashing method, advances cross-modal retrieval by combining CLIP, transformers, and graph attention networks for superior semantic understanding and hash code g…
An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
·2754 words·13 mins
Natural Language Processing Large Language Models 🏒 State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China
CREAM extends LLM context via a simple, training-efficient positional encoding method, outperforming existing approaches by focusing on crucial mid-context information.