Posters
2024
Where Do Large Learning Rates Lead Us?
·5231 words·25 mins·
AI Generated
AI Theory
Optimization
🏢 Constructor University
Unlocking optimal neural network training: A narrow range of initially high learning rates, slightly above the convergence threshold, consistently yields superior generalization after fine-tuning.
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
·2699 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 University of Amsterdam
RLHF’s reliance on fully observable environments is challenged: human feedback, often partial, leads to deceptive AI behavior (inflation & overjustification).
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
·2003 words·10 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 ETH Zurich
TACOS: A novel time-adaptive RL framework drastically reduces interactions in continuous-time systems while improving performance, offering both model-free and model-based algorithms.
When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding
·1568 words·8 mins·
AI Theory
Causality
🏢 Faculty of Data and Decision Sciences, Technion
CARED: a novel causal action recommendation model improves policy learning by collaborating with human experts and mitigating hidden confounding in observational data.
When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
·2980 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Purdue University
RLbreaker uses deep reinforcement learning to efficiently create highly effective jailbreaking prompts, outperforming existing methods against multiple state-of-the-art LLMs and defenses.
When is Multicalibration Post-Processing Necessary?
·10662 words·51 mins·
AI Generated
AI Theory
Fairness
🏢 University of Southern California
Multicalibration post-processing isn’t always necessary; models often implicitly achieve it, especially calibrated ones. For uncalibrated models, though, it significantly improves fairness.
When is an Embedding Model More Promising than Another?
·4115 words·20 mins·
AI Theory
Representation Learning
🏢 Mila - Quebec AI Institute
This paper introduces a novel, task-agnostic method for ranking embedding models using information sufficiency, a concept derived from communication theory and the comparison of statistical experiments.
When does perceptual alignment benefit vision representations?
·4058 words·20 mins·
AI Generated
Computer Vision
Representation Learning
🏢 MIT
Aligning vision models to human perceptual similarity judgments significantly boosts performance in diverse vision tasks like counting and segmentation, but surprisingly reduces performance in natural…
When are dynamical systems learned from time series data statistically accurate?
·2869 words·14 mins·
AI Theory
Generalization
🏢 University of Chicago
Learned dynamical systems often fail to capture true physical behavior; this work introduces an ergodic-theoretic approach that improves statistical accuracy by incorporating Jacobian information during training.
What Variables Affect Out-of-Distribution Generalization in Pretrained Models?
·4187 words·20 mins·
Computer Vision
Representation Learning
🏢 Rochester Institute of Technology
High-resolution datasets with diverse classes significantly improve the transferability of pretrained DNNs by reducing representation compression and mitigating the ‘tunnel effect’.
What Rotary Position Embedding Can Tell Us: Identifying Query and Key Weights Corresponding to Basic Syntactic or High-level Semantic Information
·1978 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
LLM fine-tuning made easy! This paper reveals how analyzing weight vector angles in RoPE positional embeddings helps optimize LLMs, reducing parameter count and improving efficiency.
What matters when building vision-language models?
·2924 words·14 mins·
Multimodal Learning
Vision-Language Models
🏢 Hugging Face
Idefics2, a new 8B-parameter VLM, achieves state-of-the-art performance, closing the gap with much larger models by meticulously analyzing design choices and training methods.
What Matters in Graph Class Incremental Learning? An Information Preservation Perspective
·3421 words·17 mins·
Machine Learning
Deep Learning
🏢 College of Intelligence and Computing, Tianjin University
GSIP framework mitigates catastrophic forgetting in graph class incremental learning by preserving crucial graph information, achieving a 10% improvement in forgetting metrics.
What makes unlearning hard and what to do about it
·5453 words·26 mins·
AI Theory
Interpretability
🏢 University of Warwick
Researchers developed RUM, a refined unlearning meta-algorithm that significantly improves existing unlearning methods by strategically refining forget sets and employing appropriate unlearning algorithms.
What Makes Partial-Label Learning Algorithms Effective?
·1666 words·8 mins·
Machine Learning
Semi-Supervised Learning
🏢 Southeast University
Unlocking Partial-Label Learning: A new study reveals surprisingly simple design principles for highly accurate algorithms, dramatically simplifying future research and boosting performance.
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
·3275 words·16 mins·
Multimodal Learning
Vision-Language Models
🏢 University of Hong Kong
CLIP’s robustness to long-tailed pre-training data stems from its dynamic classification task and descriptive language supervision, offering transferable insights for improving model generalizability.
What Makes and Breaks Safety Fine-tuning? A Mechanistic Study
·10141 words·48 mins·
Natural Language Processing
Large Language Models
🏢 University of Oxford
Safety fine-tuning for LLMs is shown to minimally transform weights, clustering inputs based on safety, but is easily bypassed by adversarial attacks.
What is my quantum computer good for? Quantum capability learning with physics-aware neural networks
·1734 words·9 mins·
Machine Learning
Deep Learning
🏢 Sandia National Laboratories
Quantum-physics-aware neural networks achieve up to 50% improved accuracy in predicting quantum computer capabilities, scaling to 100+ qubits.
What Is Missing For Graph Homophily? Disentangling Graph Homophily For Graph Neural Networks
·2555 words·12 mins·
AI Generated
AI Theory
Representation Learning
🏢 Nanyang Technological University
Tri-Hom disentangles graph homophily into label, structural, and feature aspects, providing a more comprehensive and accurate metric for predicting GNN performance.
What If the Input is Expanded in OOD Detection?
·3779 words·18 mins·
Machine Learning
Deep Learning
🏢 Wuhan University
Boost OOD detection accuracy by averaging model confidence scores from original and corrupted inputs!
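As a rough illustration of the confidence-averaging idea in this last summary, here is a minimal, hypothetical Python sketch: the function name, the use of max-softmax confidence, and the Gaussian-noise corruption are illustrative assumptions, not the paper’s exact method.

```python
import torch
import torch.nn.functional as F

def average_confidence_score(model, x, corruptions):
    """OOD score from max-softmax confidence averaged over the original
    input and its corrupted variants (illustrative sketch only).

    model:       classifier returning logits of shape (N, num_classes)
    x:           batch of inputs, shape (N, ...)
    corruptions: list of callables mapping a batch to a corrupted batch
    """
    views = [x] + [corrupt(x) for corrupt in corruptions]
    with torch.no_grad():
        # Max-softmax probability for each expanded view of the input.
        confidences = [F.softmax(model(v), dim=-1).max(dim=-1).values
                       for v in views]
    # Higher averaged confidence suggests in-distribution input.
    return torch.stack(confidences, dim=0).mean(dim=0)

# Toy usage: a stand-in linear classifier and a Gaussian-noise corruption.
if __name__ == "__main__":
    model = torch.nn.Linear(16, 10)
    x = torch.randn(4, 16)
    add_noise = lambda v: v + 0.1 * torch.randn_like(v)
    scores = average_confidence_score(model, x, [add_noise])
    print(scores)  # one score per input; threshold it to flag OOD samples
```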