Spotlight Others
2024
Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space
·4386 words·21 mins
AI Applications
Robotics
🏢 Delft University of Technology
Stable closed-loop control in latent space is achieved using a novel Coupled Oscillator Network, offering efficient model-based control for complex nonlinear systems directly from image data.
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
·1771 words·9 mins
Meta Learning
🏢 University of Texas at Austin
Softmax attention in transformers adapts its attention window to function Lipschitzness and noise, enabling efficient in-context learning.
Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity
·1943 words·10 mins
Federated Learning
🏢 KAUST
MARINA-P and M3 algorithms drastically cut downlink and overall communication costs in nonconvex distributed optimization, scaling efficiently with the number of worker nodes.
Improving robustness to corruptions with multiplicative weight perturbations
·1713 words·9 mins
Image Classification
🏢 Aalto University
Boost DNN robustness to corruptions without sacrificing clean image accuracy using Data Augmentation via Multiplicative Perturbations (DAMP)!
Humanoid Locomotion as Next Token Prediction
·1485 words·7 mins
AI Applications
Robotics
🏢 University of California, Berkeley
Humanoid robots now walk in San Francisco zero-shot, thanks to a novel 'next token prediction' approach trained on diverse sensorimotor data, enabling real-world generalization and data efficiency.
Hardness of Learning Neural Networks under the Manifold Hypothesis
·2154 words·11 mins
🏢 Harvard University
Neural network learnability under the manifold hypothesis is hard except for efficiently sampleable manifolds.
Gradients of Functions of Large Matrices
·2389 words·12 mins
🏢 Technical University of Denmark
This research presents novel adjoint methods for efficiently differentiating Lanczos and Arnoldi iterations, unlocking accurate gradients for large-matrix functions in machine learning.
Geodesic Optimization for Predictive Shift Adaptation on EEG data
·2001 words·10 mins
Transfer Learning
🏢 Inria
GOPSA: a novel geodesic optimization method significantly improves cross-site age prediction from EEG data by jointly handling shifts in data and predictive variables.
Generative Retrieval Meets Multi-Graded Relevance
·2003 words·10 mins
Information Retrieval
🏢 University of Chinese Academy of Sciences
GR2, a novel framework, extends generative retrieval to handle multi-graded relevance, addressing limitations of existing binary-relevance approaches by enhancing docid distinctness and implementing m…
Generalized Protein Pocket Generation with Prior-Informed Flow Matching
·1970 words·10 mins
🏢 Harvard University
PocketFlow: a novel generative model designs high-affinity protein pockets using prior-informed flow matching, outperforming existing methods.
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing
·2413 words·12 mins
Multimodal Learning
Vision-Language Models
🏢 Tsinghua University
GenArtist uses a multimodal large language model as an AI agent to unify image generation and editing, achieving state-of-the-art performance by decomposing complex tasks and leveraging a comprehensiv…
FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion
·2415 words·12 mins
Federated Learning
🏢 Hong Kong Baptist University
FuseFL achieves superior one-shot federated learning performance by leveraging a causal view of data heterogeneity and progressively fusing model blocks, significantly outperforming existing methods w…
FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images
·1953 words·10 mins
Image Generation
🏢 Shanghai Jiao Tong University
FuseAnyPart: Swap facial parts seamlessly using multiple reference images via diffusion, achieving high-fidelity results.
Flexible task abstractions emerge in linear networks with fast and bounded units
·3629 words·18 mins
🏢 Massachusetts Institute of Technology
Linear gated neural networks with fast, bounded units self-organize into modular weight structures and unique gating representations, enabling flexible task switching and compositional generalization.
Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts
·1871 words·9 mins
Multimodal Learning
Vision-Language Models
🏢 University of North Carolina at Chapel Hill
Flex-MoE: A novel framework flexibly handles arbitrary modality combinations in multimodal learning, even with missing data, achieving robust performance.
Fine Tuning Out-of-Vocabulary Item Recommendation with User Sequence Imagination
·2088 words·10 mins
Recommendation Systems
🏢 Central South University
User Sequence Imagination (USIM) revolutionizes out-of-vocabulary item recommendation by leveraging user sequence imagination and RL fine-tuning, achieving superior performance in real-world e-commerc…
Finding Transformer Circuits With Edge Pruning
·2284 words·11 mins
Interpretability
🏢 Princeton University
Edge Pruning efficiently discovers sparse, yet accurate, computational subgraphs (circuits) in large language models via gradient-based optimization, advancing mechanistic interpretability research.
Fearless Stochasticity in Expectation Propagation
·1730 words·9 mins
🏢 University of Cambridge
This paper introduces EP-η and EP-μ, novel EP variants that are remarkably robust to Monte Carlo noise, achieving improved speed and accuracy.
Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting
·1899 words·9 mins
Multimodal Learning
Vision-Language Models
🏢 University of Science and Technology of China
Frolic: A label-free framework boosts zero-shot vision model accuracy by learning prompt distributions and correcting label bias, achieving state-of-the-art performance across multiple datasets.
Enhancing LLM Reasoning via Vision-Augmented Prompting
·2157 words·11 mins
Multimodal Learning
Multimodal Reasoning
🏢 Zhejiang University
Vision-Augmented Prompting (VAP) boosts LLM reasoning by automatically generating images from textual problem descriptions, incorporating visual-spatial clues to significantly improve accuracy across …