Spotlight Others
2024
Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting
·1663 words·8 mins·
🏢 Hong Kong University of Science and Technology
SSCNN, a novel decomposition-based model, achieves superior long-term time series forecasting accuracy with 99% fewer parameters than existing methods, showing that bigger isn’t always better.
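To give a flavor of what "decomposition" buys here, below is a minimal, hypothetical sketch (our own illustration, not the SSCNN architecture): split the series into a moving-average trend and a seasonal residual, then forecast each component with its own tiny head, which is how decomposition keeps parameter counts small.

```python
import torch
import torch.nn as nn

class DecomposedForecaster(nn.Module):
    """Toy trend/seasonal decomposition forecaster -- illustrative, not SSCNN."""
    def __init__(self, input_len: int, horizon: int, kernel: int = 25):
        super().__init__()
        # A moving average extracts the slow trend; the residual is the seasonal part.
        self.trend_pool = nn.AvgPool1d(kernel, stride=1, padding=kernel // 2,
                                       count_include_pad=False)
        # Each component gets its own tiny linear head, keeping parameters minimal.
        self.trend_head = nn.Linear(input_len, horizon)
        self.season_head = nn.Linear(input_len, horizon)

    def forward(self, x):  # x: (batch, channels, input_len)
        trend = self.trend_pool(x)
        season = x - trend
        return self.trend_head(trend) + self.season_head(season)

y = DecomposedForecaster(input_len=96, horizon=24)(torch.randn(8, 7, 96))
print(y.shape)  # torch.Size([8, 7, 24])
```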
Parameter-Inverted Image Pyramid Networks
·2381 words·12 mins·
Object Detection
🏢 Tsinghua University
Parameter-Inverted Image Pyramid Networks (PIIP) boost image pyramid efficiency by using smaller models for higher-resolution images and larger models for lower-resolution ones, achieving superior performance.
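The core idea lends itself to a quick sketch. The following toy code (our own illustration, not the PIIP implementation; all module names are made up) pairs a small network with the high-resolution input and a large network with the low-resolution one, then fuses the features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def branch(width: int) -> nn.Module:
    # A stand-in backbone; `width` controls the parameter count.
    return nn.Sequential(nn.Conv2d(3, width, 3, stride=2, padding=1), nn.ReLU(),
                         nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

class ParameterInvertedPyramid(nn.Module):
    """Illustrative only: small model on the large image, large model on the small image."""
    def __init__(self):
        super().__init__()
        self.small_net, self.mid_net, self.large_net = branch(32), branch(64), branch(128)
        self.fuse = nn.Linear(32 + 64 + 128, 256)

    def forward(self, img):  # img: (B, 3, 448, 448)
        hi  = img                                                    # high res -> smallest net
        mid = F.interpolate(img, scale_factor=0.5, mode="bilinear")  # mid res -> mid net
        lo  = F.interpolate(img, scale_factor=0.25, mode="bilinear") # low res -> largest net
        feats = torch.cat([self.small_net(hi), self.mid_net(mid), self.large_net(lo)], dim=1)
        return self.fuse(feats)

print(ParameterInvertedPyramid()(torch.randn(2, 3, 448, 448)).shape)  # (2, 256)
```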
Parallel Backpropagation for Shared-Feature Visualization
·1538 words·8 mins·
Visual Question Answering
🏢 Hertie Institute, University Clinics Tübingen
Researchers visualized shared visual features driving responses of body-selective neurons to non-body objects, revealing object parts resembling macaque body parts, thus explaining neural preferences.
PACE: marrying the generalization of PArameter-efficient fine-tuning with Consistency rEgularization
·2876 words·14 mins·
🏢 Australian National University
PACE marries parameter-efficient fine-tuning with consistency regularization to significantly boost model generalization.
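The recipe boils down to "fine-tune efficiently, but penalize inconsistency between perturbed predictions." Here is a hypothetical distillation of that idea (dropout stands in for PACE's perturbation of the fine-tuned weights; the actual method differs in detail):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy backbone with dropout so two forward passes differ stochastically.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Dropout(0.1), nn.Linear(256, 10))

def pace_style_loss(x, y, lam=0.1):
    """Task loss plus a consistency penalty between two perturbed passes (illustrative)."""
    logits_a, logits_b = model(x), model(x)        # two stochastic views of the same batch
    task = F.cross_entropy(logits_a, y)
    consistency = F.mse_loss(logits_a, logits_b)   # keep predictions stable under perturbation
    return task + lam * consistency

x, y = torch.randn(16, 128), torch.randint(0, 10, (16,))
print(pace_style_loss(x, y).item())
```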
Overcoming Common Flaws in the Evaluation of Selective Classification Systems
·1630 words·8 mins·
🏢 German Cancer Research Center
Researchers developed a new evaluation metric, AUGRC, for selective classification systems that overcomes the limitations of existing metrics by providing a more holistic and interpretable assessment.
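Roughly, the metric integrates a *generalized* risk, the joint probability of a sample being both accepted and misclassified, over all coverage levels. A back-of-the-envelope NumPy sketch of that idea (consult the paper for the exact definition):

```python
import numpy as np

def augrc(confidence, correct):
    """Area under a generalized risk-coverage curve (sketch of the idea, not the
    paper's reference implementation)."""
    order = np.argsort(-confidence)                 # accept most-confident samples first
    errors = (~correct[order].astype(bool)).astype(float)
    n = len(errors)
    # After accepting k samples, generalized risk = (# accepted errors) / n.
    gen_risk = np.cumsum(errors) / n
    coverage = np.arange(1, n + 1) / n
    return np.trapz(gen_risk, coverage)             # integrate risk over coverage

conf = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
corr = np.array([1, 1, 0, 1, 0])
print(augrc(conf, corr))
```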
Optimal deep learning of holomorphic operators between Banach spaces
·1700 words·8 mins·
🏢 Simon Fraser University
Deep learning optimally learns holomorphic operators between Banach spaces, achieving near-optimal generalization bounds with problem-agnostic DNN architectures.
On the Use of Anchoring for Training Vision Models
·1917 words·9 mins·
Image Classification
🏢 Lawrence Livermore National Laboratory
Boosting vision model training: A new anchored training protocol with a simple regularizer significantly enhances generalization and safety, surpassing standard methods.
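Anchoring reparameterizes each input as an (anchor, residual) pair so the model learns a distribution over reference points. A minimal sketch of that transformation, assuming batch-shuffled anchors (the paper's full training protocol and regularizer are more involved):

```python
import torch
import torch.nn as nn

class AnchoredWrapper(nn.Module):
    """Illustrative anchoring: the model sees [anchor, input - anchor] instead of the input."""
    def __init__(self, net: nn.Module):
        super().__init__()
        self.net = net  # expects 2x the original input channels

    def forward(self, x):
        anchors = x[torch.randperm(x.size(0))]           # random anchors drawn from the batch
        return self.net(torch.cat([anchors, x - anchors], dim=1))

net = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
print(AnchoredWrapper(net)(torch.randn(4, 3, 32, 32)).shape)  # (4, 10)
```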
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
·3306 words·16 mins·
Image Generation
🏢 Institute of Information Engineering, Chinese Academy of Sciences
Unlocking superior discriminative features from diffusion models, this research reveals key activation properties for effective feature selection, surpassing state-of-the-art methods.
Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery
·1663 words·8 mins·
🏢 Lehigh University
New neural operator, Nonlocal Attention Operator (NAO), simultaneously learns forward and inverse physical models, improving interpretability and generalizability for physics discovery.
Nonlinear dynamics of localization in neural receptive fields
·1762 words·9 mins·
Unsupervised Learning
🏢 Yale University
Neural receptive fields’ localization emerges from nonlinear learning dynamics driven by naturalistic data’s higher-order statistics, not just sparsity.
Non-convolutional graph neural networks.
·2234 words·11 mins·
Graph Neural Networks
🏢 New York University
RUM neural network, a novel non-convolutional GNN, overcomes limitations of conventional convolution-based models by using RNNs to merge topological and semantic features along random walks.
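A toy version of the walk-then-recur idea (our own illustration; RUM's actual treatment of topological features and its aggregation scheme differ): sample random walks from each node, run a GRU over the node features along each walk, and average the walk encodings.

```python
import random
import torch
import torch.nn as nn

def random_walk(adj, start, length):
    """Uniform random walk over an adjacency list."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

class WalkRNN(nn.Module):
    """Illustrative non-convolutional GNN layer: encode features along random walks."""
    def __init__(self, in_dim, hidden, walks=4, length=5):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.walks, self.length = walks, length

    def forward(self, feats, adj):               # feats: (num_nodes, in_dim)
        out = []
        for v in range(feats.size(0)):
            seqs = [feats[random_walk(adj, v, self.length)] for _ in range(self.walks)]
            _, h = self.rnn(torch.stack(seqs))   # h: (1, walks, hidden)
            out.append(h.squeeze(0).mean(0))     # average over walks
        return torch.stack(out)                  # (num_nodes, hidden)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(WalkRNN(8, 16)(torch.randn(4, 8), adj).shape)  # (4, 16)
```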
Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning
·1985 words·10 mins·
🏢 RWTH Aachen University
Data-driven approach corrects confidence intervals in high-dimensional learning, improving accuracy for various models and bridging theory and practice.
Non-asymptotic Approximation Error Bounds of Parameterized Quantum Circuits
·1430 words·7 mins·
🏢 Wuhan University
New non-asymptotic approximation error bounds show that parameterized quantum circuits can efficiently approximate complex functions, potentially surpassing classical neural networks.
Noisy Label Learning with Instance-Dependent Outliers: Identifiability via Crowd Wisdom
·2347 words·12 mins·
🏢 Oregon State University
Leveraging the wisdom of multiple annotators makes noisy label learning identifiable even in the presence of instance-dependent outliers.
Neural Krylov Iteration for Accelerating Linear System Solving
·2149 words·11 mins·
🏢 University of Science and Technology of China
Neural Krylov Iteration (NeurKItt) accelerates linear system solving by using a neural operator to predict invariant subspaces, drastically reducing iteration counts and computation time.
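The gist: a learned model predicts a subspace that nearly contains the solution, so the iterative solver starts much closer to the answer. A hypothetical NumPy/SciPy sketch in which a random orthonormal basis stands in for the neural operator's prediction and merely warm-starts GMRES (NeurKItt's use of the subspace inside the Krylov loop is more sophisticated):

```python
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 200
A = np.eye(n) + 0.01 * rng.standard_normal((n, n))   # well-conditioned toy system
b = rng.standard_normal(n)

# Stand-in for the neural operator's output: an orthonormal basis Q of a
# low-dimensional subspace that (approximately) captures the solution.
Q, _ = np.linalg.qr(rng.standard_normal((n, 20)))

# Project the problem onto the predicted subspace for a cheap initial guess,
# then let GMRES correct the remainder -- fewer iterations than starting from zero.
y, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
x0 = Q @ y
x, info = gmres(A, b, x0=x0)
print(info, np.linalg.norm(A @ x - b))
```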
Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
·2318 words·11 mins·
Image Generation
🏢 Google DeepMind
Neural Assets enables intuitive 3D multi-object scene editing via image diffusion models by using per-object representations to control individual object poses, achieving state-of-the-art results.
Neglected Hessian component explains mysteries in sharpness regularization
·1889 words·9 mins·
🏢 Google DeepMind
Deep learning’s mysteries surrounding sharpness regularization are solved by uncovering the crucial role of the neglected Hessian component, the Nonlinear Modeling Error (NME).
Multistable Shape from Shading Emerges from Patch Diffusion
·2364 words·12 mins·
3D Vision
🏢 Harvard University
A novel diffusion model reconstructs multimodal shape distributions from shading, mirroring human multistable perception.
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
·3642 words·18 mins·
Multimodal Learning
Multimodal Understanding
🏢 ETH Zurich
MultiOOD benchmark and novel A2D & NP-Mix algorithms drastically improve multimodal out-of-distribution detection.
Multilingual Diversity Improves Vision-Language Representations
·2777 words·14 mins·
Multimodal Learning
Vision-Language Models
🏢 University of Washington
Boosting vision-language models: Multilingual data improves performance on English-centric benchmarks.