Posters
2024
CALVIN: Improved Contextual Video Captioning via Instruction Tuning
·2746 words·13 mins
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Meta AI
CALVIN: Instruction tuning boosts contextual video captioning, achieving state-of-the-art results!
Calibrating Reasoning in Language Models with Internal Consistency
·2546 words·12 mins
Natural Language Processing
Large Language Models
🏢 Shanghai Jiao Tong University
LLMs’ reasoning can be improved by using internal consistency to calibrate their outputs.
Calibrated Self-Rewarding Vision Language Models
·2260 words·11 mins
Multimodal Learning
Vision-Language Models
🏢 UNC Chapel Hill
Calibrated Self-Rewarding (CSR) significantly improves vision-language models by using a novel iterative approach that incorporates visual constraints into the self-rewarding process, reducing halluci…
CALANet: Cheap All-Layer Aggregation for Human Activity Recognition
·2545 words·12 mins
AI Generated
AI Applications
Healthcare
🏢 School of Computer Science and Engineering, Chung-Ang University
CALANet: Cheap All-Layer Aggregation boosts real-time HAR accuracy by efficiently aggregating features from all layers, achieving state-of-the-art performance on seven benchmark datasets.
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
·2104 words·10 mins
Natural Language Processing
Large Language Models
🏢 Artificial Intelligence Research Laboratory, Pennsylvania State University
Cal-DPO calibrates implicit rewards in contrastive preference learning, dramatically improving large language model alignment with human preferences.
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
·2545 words·12 mins
Speech and Audio
Speech Recognition
🏢 Johns Hopkins University
CA-SSLR: a novel self-supervised learning model dynamically adapts to various speech tasks by integrating language and speaker embeddings, improving performance and reducing reliance on audio features…
C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
·1787 words·9 mins
Machine Learning
Reinforcement Learning
🏢 Tsinghua University
C-GAIL stabilizes Generative Adversarial Imitation Learning by applying control theory, resulting in faster convergence, reduced oscillation, and better expert policy matching.
Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences
·1936 words·10 mins
AI Generated
Machine Learning
Federated Learning
🏢 King Abdullah University of Science and Technology
Byzantine-tolerant Variance-Reduced MARINA with Partial Participation (Byz-VR-MARINA-PP) is the first distributed method to simultaneously achieve Byzantine robustness and partial client participation…
Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
·2873 words·14 mins
Natural Language Processing
Large Language Models
🏢 CLAIRE, EPFL
Training large language models efficiently is key; this paper shows how using structured feedforward layers and a novel training regime significantly reduces computational costs and improves training …
Building a stable classifier with the inflated argmax
·2014 words·10 mins
AI Generated
AI Theory
Fairness
🏢 Department of Statistics, University of Chicago
Boost classifier stability with the novel inflated argmax, guaranteeing reliable multiclass classification without distributional assumptions!
Bridging the Divide: Reconsidering Softmax and Linear Attention
·2335 words·11 mins
Computer Vision
Image Classification
🏢 Tsinghua University
InLine attention, a novel method, bridges the performance gap between softmax and linear attention by incorporating injectivity and local modeling, achieving superior performance while maintaining lin…
Bridging semantics and pragmatics in information-theoretic emergent communication
·1593 words·8 mins
Natural Language Processing
Dialogue Systems
🏢 Apple
AI agents learn human-like communication, combining semantic categorization and pragmatic context-sensitive reasoning, through a novel information-theoretic framework.
Bridging OOD Detection and Generalization: A Graph-Theoretic View
·2436 words·12 mins
Machine Learning
Deep Learning
🏢 University of Illinois Urbana-Champaign
A novel graph-theoretic framework bridges OOD detection & generalization, offering theoretical error bounds and competitive empirical performance.
Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift
·1648 words·8 mins
AI Theory
Generalization
🏢 Tsinghua University
A new model-agnostic framework for out-of-distribution generalization uses multicalibration across overlapping groups, showing improved robustness and prediction under various distribution shifts.
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
·2078 words·10 mins
Machine Learning
Reinforcement Learning
🏢 Genentech
BRAID: A novel, conservative fine-tuning method surpasses offline design optimization by cleverly combining generative diffusion models with reward models, preventing over-optimization and generating …
Bridging Geometric States via Geometric Diffusion Bridge
·1526 words·8 mins
Machine Learning
Deep Learning
🏢 Peking University
Geometric Diffusion Bridge (GDB) accurately predicts geometric state evolution in complex systems by leveraging a probabilistic approach and equivariant diffusion processes, surpassing existing deep l…
Bridging Gaps: Federated Multi-View Clustering in Heterogeneous Hybrid Views
·2330 words·11 mins
AI Generated
Machine Learning
Federated Learning
🏢 School of Computer Science and Engineering, University of Electronic Science and Technology of China
FMCSC: A novel federated multi-view clustering framework bridging client and view gaps in heterogeneous hybrid views, achieving superior performance through local-synergistic contrastive learning and …
Bridge-IF: Learning Inverse Protein Folding with Markov Bridges
·1691 words·8 mins
Natural Language Processing
Large Language Models
🏢 Zhejiang University
Bridge-IF, a novel generative diffusion model, excels at inverse protein folding by learning probabilistic dependencies between protein structures and sequences, significantly outperforming existing m…
Bridge the Modality and Capability Gaps in Vision-Language Model Selection
·3390 words·16 mins
AI Generated
Natural Language Processing
Vision-Language Models
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
SWAB bridges modality and capability gaps in Vision-Language Model selection using optimal transport, enabling accurate prediction of VLM performance without images.
Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack
·4173 words·20 mins
AI Generated
Computer Vision
Image Classification
🏢 School of Data Science, The Chinese University of Hong Kong
Researchers discover that existing backdoor defenses leave vulnerabilities, allowing for easy re-activation of backdoors through subtle trigger manipulation.