Posters
2024
Information-theoretic Limits of Online Classification with Noisy Labels
·481 words·3 mins
AI Theory
Optimization
🏢 CSOI, Purdue University
This paper unveils the information-theoretic limits of online classification with noisy labels, showing that the minimax risk is tightly characterized by the Hellinger gap of noisy label distributions…
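The characterization rests on the Hellinger distance between the noisy label distributions; a minimal sketch of that quantity for discrete distributions (the paper's exact "Hellinger gap" is defined there; array values are illustrative):

```python
import numpy as np

def hellinger(p: np.ndarray, q: np.ndarray) -> float:
    """Hellinger distance between discrete distributions p and q.

    H(p, q) = (1 / sqrt(2)) * || sqrt(p) - sqrt(q) ||_2, so H lies in [0, 1].
    """
    return float(np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2))

# Two noisy label distributions over {0, 1}: flip probability 0.1 vs 0.3.
p = np.array([0.9, 0.1])
q = np.array([0.7, 0.3])
print(hellinger(p, q))  # ~0.18; a larger gap makes the noise easier to distinguish
```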
Information-theoretic Generalization Analysis for Expected Calibration Error
·1937 words·10 mins
AI Theory
Generalization
🏢 Osaka University
New theoretical analysis reveals optimal binning strategies for minimizing bias in expected calibration error (ECE), improving machine learning model calibration evaluation.
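For context, a minimal sketch of the standard binned ECE estimator whose binning-induced bias the analysis studies (equal-width bins; names are illustrative):

```python
import numpy as np

def binned_ece(confidences: np.ndarray, correct: np.ndarray, n_bins: int = 15) -> float:
    """Expected calibration error with equal-width confidence bins.

    ECE = sum_b (|B_b| / n) * |acc(B_b) - conf(B_b)|; the choice of bin
    count and placement is exactly what biases this estimator.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()    # empirical accuracy in the bin
            conf = confidences[mask].mean()  # mean confidence in the bin
            ece += (mask.sum() / n) * abs(acc - conf)
    return float(ece)
```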
Information Re-Organization Improves Reasoning in Large Language Models
·2018 words·10 mins
Natural Language Processing
Large Language Models
🏢 Zhejiang University
InfoRE: A novel method improving large language models’ reasoning by reorganizing information to highlight logical relationships, resulting in a 4% average accuracy boost across various tasks.
InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling
·5629 words·27 mins
Natural Language Processing
Large Language Models
🏢 Wuhan University
InfoRM tackles reward hacking in RLHF using an information-theoretic approach, enhancing generalizability and enabling overoptimization detection.
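InfoRM's exact objective is given in the paper; as a rough illustration of an information-bottleneck-style reward model, a stochastic latent can be penalized toward a standard normal prior (PyTorch-flavored sketch; all names are hypothetical):

```python
import torch
import torch.nn as nn

class IBRewardModel(nn.Module):
    """Reward model with a variational information-bottleneck latent.

    The KL term limits how much response-specific information the latent
    retains, the style of regularizer the summary describes (sketch only).
    """
    def __init__(self, d_hidden: int, d_latent: int):
        super().__init__()
        self.mu = nn.Linear(d_hidden, d_latent)
        self.logvar = nn.Linear(d_hidden, d_latent)
        self.head = nn.Linear(d_latent, 1)

    def forward(self, h):  # h: pooled LM features, shape (batch, d_hidden)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterized sample
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(-1).mean()
        return self.head(z).squeeze(-1), kl  # reward score and IB penalty
```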
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
·2045 words·10 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Tsinghua University
InfLLM: Training-free long-context extrapolation for LLMs via efficient context memory.
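InfLLM's full mechanism is more involved; a toy sketch of the core idea, retrieving the cached context blocks most relevant to the current query before attending over them (the block scoring here is a simplification; names are hypothetical):

```python
import numpy as np

def retrieve_blocks(query: np.ndarray, block_keys: np.ndarray, k: int = 2) -> np.ndarray:
    """Pick the k cached context blocks most relevant to the current query.

    query:      (d,) current-step query vector
    block_keys: (n_blocks, d), one representative key per stored block
    Returns indices of the top-k blocks under dot-product scoring.
    """
    scores = block_keys @ query
    return np.argsort(scores)[-k:][::-1]

# Cache of 4 distant-context blocks with 8-dim representative keys.
rng = np.random.default_rng(0)
block_keys = rng.normal(size=(4, 8))
q = rng.normal(size=8)
print(retrieve_blocks(q, block_keys))  # attend over these blocks plus the local window
```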
Inflationary Flows: Calibrated Bayesian Inference with Diffusion-Based Models
·3134 words·15 mins
Machine Learning
Deep Learning
🏢 Duke University
Calibrated Bayesian inference achieved via novel diffusion models uniquely mapping high-dimensional data to lower-dimensional Gaussian distributions.
Infinite-Dimensional Feature Interaction
·1877 words·9 mins
Computer Vision
Image Classification
🏢 Peking University
InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.
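The RBF kernel computes an inner product between implicit infinite-dimensional feature maps, which is what lets interactions escape a finite-dimensional space; a minimal sketch (gamma and shapes are illustrative):

```python
import numpy as np

def rbf_interaction(x: np.ndarray, y: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    """RBF-kernel similarity k(x, y) = exp(-gamma * ||x - y||^2).

    Equivalent to an inner product of infinite-dimensional feature maps,
    so the interaction is not confined to the features' ambient dimension.
    """
    return np.exp(-gamma * np.sum((x - y) ** 2, axis=-1))

# Interaction strength between two branches' feature vectors.
x = np.array([1.0, 0.5, -0.2])
y = np.array([0.8, 0.4, 0.1])
print(rbf_interaction(x, y))  # scalar in (0, 1], ~0.87 here
```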
Infinite Limits of Multi-head Transformer Dynamics
·4731 words·23 mins
AI Generated
Machine Learning
Deep Learning
🏢 Harvard University
Researchers reveal how the training dynamics of transformer models behave at infinite width, depth, and head count, providing key insights for scaling up these models.
Inferring stochastic low-rank recurrent neural networks from neural data
·3178 words·15 mins
Machine Learning
Deep Learning
🏢 University of Tübingen, Germany
Researchers developed a method using variational sequential Monte Carlo to fit stochastic low-rank recurrent neural networks to neural data, enabling efficient analysis and generation of realistic neu…
Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors
·3586 words·17 mins
Computer Vision
3D Vision
🏢 Tsinghua University
This research presents LocalN2NM, a novel method for inferring neural signed distance functions (SDF) from single, noisy point clouds by finetuning data-driven priors, achieving faster inference and b…
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
·1693 words·8 mins
AI Theory
Representation Learning
🏢 Princeton University
Contrastive learning enables efficient probabilistic inference in high-dimensional time series by creating Gaussian representations that form a Gauss-Markov chain, allowing for closed-form solutions t…
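Under a Gauss-Markov chain, the waypoint between two endpoint representations has a closed-form Gaussian conditional; a stylized Brownian-bridge sketch (the paper's exact parameterization differs):

```python
import numpy as np

def bridge_waypoint(z0: np.ndarray, zT: np.ndarray, t: float, T: float, sigma: float = 1.0):
    """Closed-form conditional of a Brownian-bridge-style Gauss-Markov chain.

    Given endpoint representations z0 and zT, the waypoint at step t is
    Gaussian with mean a convex combination of the endpoints and variance
    sigma^2 * t * (T - t) / T.
    """
    w = t / T
    mean = (1 - w) * z0 + w * zT
    var = sigma ** 2 * w * (1 - w) * T
    return mean, var

z0, zT = np.zeros(4), np.ones(4)
print(bridge_waypoint(z0, zT, t=5, T=10))  # mean is the midpoint; variance peaks there
```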
Inference of Neural Dynamics Using Switching Recurrent Neural Networks
·2472 words·12 mins
Machine Learning
Deep Learning
🏢 Yale University
Switching recurrent neural networks (SRNNs) reveal behaviorally relevant switches in neural dynamics.
Inexact Augmented Lagrangian Methods for Conic Optimization: Quadratic Growth and Linear Convergence
·1589 words·8 mins
AI Theory
Optimization
🏢 UC San Diego
This paper proves that inexact ALMs applied to SDPs achieve linear convergence for both primal and dual iterates, contingent solely on strict complementarity and a bounded solution set, thus resol…
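For reference, the textbook inexact augmented Lagrangian iteration the analysis concerns, for a conic program minimizing f(x) subject to Ax = b, x in a cone K, with subproblems solved only approximately:

```latex
\mathcal{L}_\rho(x, y) = f(x) + \langle y,\, Ax - b \rangle + \tfrac{\rho}{2}\,\|Ax - b\|^2,
\qquad
x^{k+1} \approx \arg\min_{x \in \mathcal{K}} \mathcal{L}_\rho(x, y^k),
\qquad
y^{k+1} = y^k + \rho\,(Ax^{k+1} - b).
```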
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
Simultaneously preserving watermark strength and speculative sampling efficiency in LLM generation is provably impossible; this paper establishes the trade-off and offers methods that prioritize either watermark strength or sampling efficiency.
Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
·3248 words·16 mins
AI Generated
Machine Learning
Transfer Learning
🏢 Columbia University
Multi-task learning and finetuning show surprising feature reuse biases, including a novel 'nested feature selection' regime where finetuning prioritizes a sparse subset of pretrained features, signif…
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
·3011 words·15 mins
Natural Language Processing
Large Language Models
🏢 Salesforce Research
INDICT, a novel framework, empowers LLMs with internal dialogues of critiques to enhance code generation, prioritizing both safety and helpfulness, resulting in +10% absolute improvement across variou…
Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation
·2821 words·14 mins
Machine Learning
Reinforcement Learning
🏢 Carnegie Mellon University
IsCiL: a novel adapter-based continual imitation learning framework that efficiently adapts to new tasks by incrementally learning and retrieving reusable skills.
Incorporating Test-Time Optimization into Training with Dual Networks for Human Mesh Recovery
·2718 words·13 mins
Computer Vision
3D Vision
🏢 South China University of Technology
Meta-learning enhances human mesh recovery by unifying training and test-time objectives, significantly improving accuracy and generalization.
Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques
·2087 words·10 mins
AI Theory
Optimization
🏢 Washington State University
IGNITE improves offline optimization by incorporating a surrogate gradient norm to reduce model sharpness, boosting performance by up to 9.6%.
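IGNITE's exact formulation is in the paper; as a rough sketch, a penalty on the surrogate's input-gradient norm added to the fitting loss captures the sharpness-reduction idea (PyTorch-flavored; names are hypothetical):

```python
import torch

def loss_with_grad_norm(surrogate, x, y_target, lam: float = 0.1):
    """Surrogate fitting loss plus a penalty on the input-gradient norm.

    Penalizing ||d surrogate / d x|| flattens the surrogate around the
    offline data, the sharpness-reduction idea described above (sketch only).
    """
    x = x.requires_grad_(True)
    pred = surrogate(x)
    fit = torch.nn.functional.mse_loss(pred, y_target)
    grad = torch.autograd.grad(pred.sum(), x, create_graph=True)[0]
    return fit + lam * grad.norm(dim=-1).mean()
```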
Incentivizing Quality Text Generation via Statistical Contracts
·1392 words·7 mins
Natural Language Processing
Text Generation
🏢 Technion - Israel Institute of Technology
Cost-robust contracts, inspired by statistical hypothesis tests, incentivize quality in LLM text generation, overcoming the moral hazard of pay-per-token models.