Computer Vision
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
·4960 words·24 mins·
AI Generated
Computer Vision
Few-Shot Learning
🏢 Australian Institute for Machine Learning
aTLAS: a novel parameter-efficient fine-tuning method using learned anisotropic scaling of task vectors for enhanced knowledge composition and transfer.
Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features
·2426 words·12 mins·
Computer Vision
3D Vision
🏢 Peking University
Key-Grid: An unsupervised 3D keypoint detector achieving state-of-the-art semantic consistency and accuracy for both rigid and deformable objects using novel grid heatmap features.
Just Add $100 More: Augmenting Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem
·4026 words·19 mins·
AI Generated
Computer Vision
Object Detection
🏢 Korea University
Boost 3D object detection accuracy by augmenting pseudo-LiDAR point clouds!
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models.
·3743 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 University of Glasgow
Pixelsmith: Generate gigapixel images with a single GPU, surpassing limitations of existing methods through a cascading approach and innovative guidance mechanism.
Is Multiple Object Tracking a Matter of Specialization?
·2391 words·12 mins·
Computer Vision
Object Detection
🏢 University of Modena and Reggio Emilia
PASTA: A novel modular framework boosts MOT tracker generalization by using parameter-efficient fine-tuning and avoiding negative interference through specialized modules for various scenario attribut…
IR-CM: The Fast and Universal Image Restoration Method Based on Consistency Model
·2449 words·12 mins·
Computer Vision
Image Generation
🏢 Huazhong University of Science and Technology
IR-CM: One-step image restoration using a novel consistency model for fast and universal performance.
IODA: Instance-Guided One-shot Domain Adaptation for Super-Resolution
·2808 words·14 mins·
AI Generated
Computer Vision
Image Generation
🏢 Nanjing University
IODA achieves efficient one-shot domain adaptation for super-resolution using a novel instance-guided strategy and image-level domain alignment, significantly improving performance with limited target…
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps
·4066 words·20 mins·
Computer Vision
Image Generation
🏢 HSE University
Invertible Consistency Distillation (iCD) achieves high-quality image editing in ~7 steps by enabling both fast editing and strong generation using a generalized distillation framework and dynamic cla…
Interpreting the Weight Space of Customized Diffusion Models
·3822 words·18 mins·
Computer Vision
Image Generation
🏢 UC Berkeley
Researchers model a manifold of customized diffusion models as a subspace of weights, enabling controllable creation of new models via sampling, editing, and inversion from a single image.
Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors
·1664 words·8 mins·
AI Generated
Computer Vision
Image Generation
🏢 York University
Interpretable lightweight transformers are built by unrolling graph smoothness priors, achieving high performance with significantly fewer parameters than conventional transformers.
Interpretable Image Classification with Adaptive Prototype-based Vision Transformers
·5008 words·24 mins·
Computer Vision
Image Classification
🏢 Dartmouth College
ProtoViT: a novel interpretable image classification method using Vision Transformers and adaptive prototypes, achieving higher accuracy and providing clear explanations.
Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation
·2210 words·11 mins·
Computer Vision
Image Segmentation
🏢 UC Los Angeles
Integrating deep metric learning with Coreset enables efficient slice-based active learning for 3D medical segmentation, outperforming existing methods under low annotation budgets.
Initializing Variable-sized Vision Transformers from Learngene with Learnable Transformation
·2536 words·12 mins·
AI Generated
Computer Vision
Image Classification
🏢 School of Computer Science and Engineering, Southeast University
LeTs: Learnable Transformation efficiently initializes variable-sized Vision Transformers by learning adaptable transformations from a compact learngene module, outperforming from-scratch training.
Infinite-Dimensional Feature Interaction
·1877 words·9 mins·
Computer Vision
Image Classification
🏢 Peking University
InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.
Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors
·3586 words·17 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
This research presents LocalN2NM, a novel method for inferring neural signed distance functions (SDF) from single, noisy point clouds by finetuning data-driven priors, achieving faster inference and b…
Incorporating Test-Time Optimization into Training with Dual Networks for Human Mesh Recovery
·2718 words·13 mins·
Computer Vision
3D Vision
🏢 South China University of Technology
Meta-learning enhances human mesh recovery by unifying training and test-time objectives, significantly improving accuracy and generalization.
In-N-Out: Lifting 2D Diffusion Prior for 3D Object Removal via Tuning-Free Latents Alignment
·2437 words·12 mins·
Computer Vision
3D Vision
🏢 University of Melbourne
In-N-Out enhances 3D scene reconstruction by aligning 2D diffusion model latents for consistent multi-view inpainti…
In-Context Symmetries: Self-Supervised Learning through Contextual World Models
·3570 words·17 mins·
Computer Vision
Self-Supervised Learning
🏢 MIT CSAIL
CONTEXTSSL: A novel self-supervised learning algorithm that adapts to task-specific symmetries by using context, achieving significant performance gains over existing methods.
In Pursuit of Causal Label Correlations for Multi-label Image Recognition
·2377 words·12 mins·
Computer Vision
Image Classification
🏢 Wenzhou University
This research leverages causal intervention to identify and utilize genuine label correlations in multi-label image recognition, mitigating contextual bias for improved accuracy.
Improving Viewpoint-Independent Object-Centric Representations through Active Viewpoint Selection
·2484 words·12 mins·
Computer Vision
Image Segmentation
🏢 School of Computer Science, Fudan University
Active Viewpoint Selection (AVS) significantly improves viewpoint-independent object-centric representations by actively selecting the most informative viewpoints for each scene, leading to better seg…