
Posters

2024

Information-theoretic Limits of Online Classification with Noisy Labels
·481 words·3 mins
AI Theory Optimization 🏢 CSOI, Purdue University
This paper unveils the information-theoretic limits of online classification with noisy labels, showing that the minimax risk is tightly characterized by the Hellinger gap of noisy label distributions…
Information-theoretic Generalization Analysis for Expected Calibration Error
·1937 words·10 mins
AI Theory Generalization 🏢 Osaka University
New theoretical analysis reveals optimal binning strategies for minimizing bias in expected calibration error (ECE), improving machine learning model calibration evaluation.
Information Re-Organization Improves Reasoning in Large Language Models
·2018 words·10 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
InfoRE: A novel method improving large language models’ reasoning by reorganizing information to highlight logical relationships, resulting in a 4% average accuracy boost across various tasks.
InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling
·5629 words·27 mins
Natural Language Processing Large Language Models 🏢 Wuhan University
InfoRM tackles reward hacking in RLHF using an information-theoretic approach, enhancing generalizability and enabling overoptimization detection.
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
·2045 words·10 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
InfLLM: Training-free long-context extrapolation for LLMs via efficient context memory.
Inflationary Flows: Calibrated Bayesian Inference with Diffusion-Based Models
·3134 words·15 mins
Machine Learning Deep Learning 🏢 Duke University
Calibrated Bayesian inference achieved via novel diffusion models uniquely mapping high-dimensional data to lower-dimensional Gaussian distributions.
Infinite-Dimensional Feature Interaction
·1877 words·9 mins
Computer Vision Image Classification 🏢 Peking University
InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.
Infinite Limits of Multi-head Transformer Dynamics
·4731 words·23 mins
AI Generated Machine Learning Deep Learning 🏢 Harvard University
Researchers reveal how the training dynamics of transformer models behave at infinite width, depth, and head count, providing key insights for scaling up these models.
Inferring stochastic low-rank recurrent neural networks from neural data
·3178 words·15 mins
Machine Learning Deep Learning 🏢 University of Tübingen, Germany
Researchers developed a method using variational sequential Monte Carlo to fit stochastic low-rank recurrent neural networks to neural data, enabling efficient analysis and generation of realistic neu…
Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors
·3586 words·17 mins
Computer Vision 3D Vision 🏢 Tsinghua University
This research presents LocalN2NM, a novel method for inferring neural signed distance functions (SDFs) from single, noisy point clouds by finetuning data-driven priors, achieving faster inference and b…
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
·1693 words·8 mins
AI Theory Representation Learning 🏢 Princeton University
Contrastive learning enables efficient probabilistic inference in high-dimensional time series by creating Gaussian representations that form a Gauss-Markov chain, allowing for closed-form solutions t…
Inference of Neural Dynamics Using Switching Recurrent Neural Networks
·2472 words·12 mins
Machine Learning Deep Learning 🏢 Yale University
Switching recurrent neural networks (SRNNs) reveal behaviorally relevant switches in neural dynamics.
Inexact Augmented Lagrangian Methods for Conic Optimization: Quadratic Growth and Linear Convergence
·1589 words·8 mins
AI Theory Optimization 🏢 UC San Diego
This paper proves that inexact ALMs applied to SDPs achieve linear convergence for both primal and dual iterates, contingent solely on strict complementarity and a bounded solution set, thus resol…
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
It is impossible to inject strong watermarks into LLM outputs while also accelerating generation via speculative sampling; this paper proves the trade-off and offers methods that prioritize either watermark strength or sampling efficiency.
Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
·3248 words·16 mins
AI Generated Machine Learning Transfer Learning 🏢 Columbia University
Multi-task learning and finetuning show surprising feature reuse biases, including a novel ’nested feature selection’ regime where finetuning prioritizes a sparse subset of pretrained features, signif…
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
·3011 words·15 mins
Natural Language Processing Large Language Models 🏢 Salesforce Research
INDICT, a novel framework, equips LLMs with internal dialogues of critiques to enhance code generation, prioritizing both safety and helpfulness and yielding a +10% absolute improvement across variou…
Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation
·2821 words·14 mins
Machine Learning Reinforcement Learning 🏢 Carnegie Mellon University
IsCiL: a novel adapter-based continual imitation learning framework that efficiently adapts to new tasks by incrementally learning and retrieving reusable skills.
Incorporating Test-Time Optimization into Training with Dual Networks for Human Mesh Recovery
·2718 words·13 mins
Computer Vision 3D Vision 🏢 South China University of Technology
Meta-learning enhances human mesh recovery by unifying training and test-time objectives, significantly improving accuracy and generalization.
Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques
·2087 words·10 mins
AI Theory Optimization 🏢 Washington State University
IGNITE improves offline optimization by incorporating a surrogate gradient norm to reduce model sharpness, boosting performance by up to 9.6%.
Incentivizing Quality Text Generation via Statistical Contracts
·1392 words·7 mins
Natural Language Processing Text Generation 🏢 Technion - Israel Institute of Technology
Cost-robust contracts, inspired by statistical hypothesis tests, incentivize quality in LLM text generation, overcoming the moral hazard of pay-per-token models.