Computer Vision

Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis
·2012 words·10 mins
Computer Vision 3D Vision 🏢 Zhejiang University
eFreeSplat: a novel, epipolar-free 3D Gaussian splatting model for generalizable novel view synthesis, surpassing state-of-the-art methods by achieving superior geometry reconstruction and novel view …
EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models
·2906 words·14 mins
Computer Vision Image Restoration 🏢 Samsung Research
EnsIR: Training-free image restoration ensemble via Gaussian mixture models, boosting accuracy efficiently.
Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
·3757 words·18 mins
AI Generated Computer Vision Image Classification 🏢 Boston University
DiChaViT boosts channel-adaptive vision transformers by enhancing feature diversity, yielding a 1.5-5% accuracy gain over state-of-the-art MCI models on diverse datasets.
Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination
·2128 words·10 mins
Computer Vision Image Generation 🏢 Technion
This paper introduces a novel post-processing technique that significantly boosts the perceptual quality of images generated by consistency models using a joint classifier-discriminator adversarially …
End-to-End Video Semantic Segmentation in Adverse Weather using Fusion Blocks and Temporal-Spatial Teacher-Student Learning
·2581 words·13 mins
AI Generated Computer Vision Video Understanding 🏢 National University of Singapore
Optical-flow-free video semantic segmentation excels in adverse weather by merging adjacent frame information via a fusion block and a novel temporal-spatial teacher-student learning strategy.
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
·2466 words·12 mins
Computer Vision Image Generation 🏢 Tsinghua University
EfficientNAT: a novel approach to token-based image synthesis boosts performance and slashes computational costs by cleverly disentangling and optimizing spatial-temporal interactions between image to…
EMVP: Embracing Visual Foundation Model for Visual Place Recognition with Centroid-Free Probing
·3286 words·16 mins
AI Generated Computer Vision Visual Question Answering 🏢 State Key Lab of CAD&CG, Zhejiang University
EMVP: A novel PEFT pipeline boosts Visual Place Recognition accuracy to 97.6% using Centroid-Free Probing & Dynamic Power Normalization, saving 64.3% of parameters.
EM Distillation for One-step Diffusion Models
·3404 words·16 mins
Computer Vision Image Generation 🏢 Google DeepMind
EM Distillation (EMD) efficiently trains one-step diffusion models by using an Expectation-Maximization approach, achieving state-of-the-art image generation quality and outperforming existing methods…
Elucidating the Design Space of Dataset Condensation
·4063 words·20 mins
Computer Vision Image Classification 🏢 Tsinghua University
Elucidating Dataset Condensation (EDC) achieves state-of-the-art accuracy in dataset condensation by implementing soft category-aware matching and a smoothing learning rate schedule, improving model t…
EGSST: Event-based Graph Spatiotemporal Sensitive Transformer for Object Detection
·2236 words·11 mins
AI Generated Computer Vision Object Detection 🏢 School of Information Science and Technology, Fudan University
EGSST: a novel framework for event-based object detection, uses graph structures and transformers to efficiently process event data, achieving high accuracy and speed in dynamic scenes.
EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views
·2364 words·12 mins
Computer Vision 3D Vision 🏢 University of Science and Technology of China
EgoChoir: a novel framework harmonizes visual appearance, head motion, and 3D objects to accurately estimate 3D human contact and object affordance from egocentric videos, surpassing existing methods.
EfficientCAPER: An End-to-End Framework for Fast and Robust Category-Level Articulated Object Pose Estimation
·2239 words·11 mins
Computer Vision 3D Vision 🏢 Zhejiang University of Technology
EfficientCAPER: A novel end-to-end framework achieves fast & robust category-level articulated object pose estimation by using a joint-centric approach, eliminating post-processing optimization and en…
Efficient Temporal Action Segmentation via Boundary-aware Query Voting
·3348 words·16 mins
AI Generated Computer Vision Video Understanding 🏢 Stony Brook University
BaFormer: a novel boundary-aware Transformer network achieves efficient and accurate temporal action segmentation by using instance and global queries for segment classification and boundary predictio…
Efficient Lifelong Model Evaluation in an Era of Rapid Progress
·2830 words·14 mins
AI Generated Computer Vision Image Classification 🏢 University of Cambridge
Sort & Search: 1000x faster lifelong model evaluation!
Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
·1907 words·9 mins
Computer Vision Image Classification 🏢 College of Information and Control Engineering, Xi'an University of Architecture and Technology
Boosting Vision Transformer adaptation! Householder Transformation-based Adaptor (HTA) outperforms existing methods by dynamically adjusting adaptation matrix ranks across layers, improving efficiency…
Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting
·2803 words·14 mins
AI Generated Computer Vision 3D Vision 🏢 KAIST
Effective rank regularization enhances 3D Gaussian splatting, resolving needle-like artifacts and improving 3D model quality.
EEG2Video: Towards Decoding Dynamic Visual Perception from EEG Signals
·2189 words·11 mins
Computer Vision Video Understanding 🏢 Microsoft Research
EEG2Video reconstructs dynamic videos from EEG signals, achieving 79.8% accuracy in semantic classification and 0.256 SSIM in video reconstruction.
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
·3842 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Midea Group
The Efficient Diffusion Transformer (EDT) framework significantly speeds up and improves image generation by leveraging a lightweight architecture, human-like sketching-inspired Attention Modulation M…
ECMamba: Consolidating Selective State Space Model with Retinex Guidance for Efficient Multiple Exposure Correction
·1754 words·9 mins
Computer Vision Image Generation 🏢 McMaster University
ECMamba: A novel dual-branch framework efficiently corrects multiple exposure images by integrating Retinex theory and an innovative 2D selective state-space layer, achieving state-of-the-art performa…
EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding
·2574 words·13 mins
AI Generated Computer Vision Image Segmentation 🏢 University of Arkansas
EAGLE: A novel unsupervised cross-view adaptation method for semantic segmentation achieves state-of-the-art performance by efficiently modeling geometric structural changes across different camera vi…