
Posters

2024

Information-theoretic Limits of Online Classification with Noisy Labels
·481 words·3 mins
AI Theory Optimization 🏢 CSOI, Purdue University
This paper unveils the information-theoretic limits of online classification with noisy labels, showing that the minimax risk is tightly characterized by the Hellinger gap of noisy label distributions…
Information-theoretic Generalization Analysis for Expected Calibration Error
·1937 words·10 mins
AI Theory Generalization 🏢 Osaka University
New theoretical analysis reveals optimal binning strategies for minimizing bias in expected calibration error (ECE), improving machine learning model calibration evaluation.
Information Re-Organization Improves Reasoning in Large Language Models
·2018 words·10 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
InfoRE: A novel method improving large language models’ reasoning by reorganizing information to highlight logical relationships, resulting in a 4% average accuracy boost across various tasks.
InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling
·5629 words·27 mins
Natural Language Processing Large Language Models 🏢 Wuhan University
InfoRM tackles reward hacking in RLHF using an information-theoretic approach, enhancing generalizability and enabling overoptimization detection.
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
·2045 words·10 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
InfLLM: Training-free long-context extrapolation for LLMs via efficient context memory.
Inflationary Flows: Calibrated Bayesian Inference with Diffusion-Based Models
·3134 words·15 mins
Machine Learning Deep Learning 🏢 Duke University
Calibrated Bayesian inference achieved via novel diffusion models uniquely mapping high-dimensional data to lower-dimensional Gaussian distributions.
Infinite-Dimensional Feature Interaction
·1877 words·9 mins
Computer Vision Image Classification 🏢 Peking University
InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.
Infinite Limits of Multi-head Transformer Dynamics
·4731 words·23 mins
AI Generated Machine Learning Deep Learning 🏢 Harvard University
Researchers reveal how the training dynamics of transformer models behave at infinite width, depth, and head count, providing key insights for scaling up these models.
Inferring stochastic low-rank recurrent neural networks from neural data
·3178 words·15 mins
Machine Learning Deep Learning 🏢 University of Tübingen, Germany
Researchers developed a method using variational sequential Monte Carlo to fit stochastic low-rank recurrent neural networks to neural data, enabling efficient analysis and generation of realistic neu…
Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors
·3586 words·17 mins
Computer Vision 3D Vision 🏢 Tsinghua University
This research presents LocalN2NM, a novel method for inferring neural signed distance functions (SDFs) from single, noisy point clouds by finetuning data-driven priors, achieving faster inference and b…
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
·1693 words·8 mins
AI Theory Representation Learning 🏢 Princeton University
Contrastive learning enables efficient probabilistic inference in high-dimensional time series by creating Gaussian representations that form a Gauss-Markov chain, allowing for closed-form solutions t…
Inference of Neural Dynamics Using Switching Recurrent Neural Networks
·2472 words·12 mins
Machine Learning Deep Learning 🏢 Yale University
Switching recurrent neural networks (SRNNs) reveal behaviorally relevant switches in neural dynamics.
Inexact Augmented Lagrangian Methods for Conic Optimization: Quadratic Growth and Linear Convergence
·1589 words·8 mins
AI Theory Optimization 🏢 UC San Diego
This paper proves that inexact ALMs applied to SDPs achieve linear convergence for both primal and dual iterates, contingent solely on strict complementarity and a bounded solution set, thus resol…
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
It is impossible to inject strong watermarks into LLM outputs while also accelerating generation via speculative sampling; this paper proves the trade-off and offers methods that prioritize either watermark strength or sampling efficiency.
Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
·3248 words·16 mins
AI Generated Machine Learning Transfer Learning 🏢 Columbia University
Multi-task learning and finetuning show surprising feature reuse biases, including a novel ’nested feature selection’ regime where finetuning prioritizes a sparse subset of pretrained features, signif…
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
·3011 words·15 mins
Natural Language Processing Large Language Models 🏢 Salesforce Research
INDICT, a novel framework, equips LLMs with internal dialogues of critiques to enhance code generation, prioritizing both safety and helpfulness and yielding a +10% absolute improvement across variou…
Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation
·2821 words·14 mins
Machine Learning Reinforcement Learning 🏢 Carnegie Mellon University
IsCiL: a novel adapter-based continual imitation learning framework that efficiently adapts to new tasks by incrementally learning and retrieving reusable skills.
Incorporating Test-Time Optimization into Training with Dual Networks for Human Mesh Recovery
·2718 words·13 mins
Computer Vision 3D Vision 🏢 South China University of Technology
Meta-learning enhances human mesh recovery by unifying training and test-time objectives, significantly improving accuracy and generalization.
Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques
·2087 words·10 mins
AI Theory Optimization 🏢 Washington State University
IGNITE improves offline optimization by incorporating a surrogate gradient norm to reduce model sharpness, boosting performance by up to 9.6%.
Incentivizing Quality Text Generation via Statistical Contracts
·1392 words·7 mins
Natural Language Processing Text Generation 🏢 Technion - Israel Institute of Technology
Cost-robust contracts, inspired by statistical hypothesis tests, incentivize quality in LLM text generation, overcoming the moral hazard of pay-per-token models.