🏒 Fudan University

Unified Lexical Representation for Interpretable Visual-Language Alignment
·1730 words·9 mins
Multimodal Learning Vision-Language Models 🏒 Fudan University
LexVLA: A novel visual-language alignment framework learns unified lexical representations for improved interpretability and efficient cross-modal retrieval.
Towards Global Optimal Visual In-Context Learning Prompt Selection
·2618 words·13 mins
AI Generated Computer Vision Image Segmentation 🏒 Fudan University
Partial2Global: A novel VICL framework achieving globally optimal prompt selection, significantly improving visual in-context learning across various tasks.
Tetrahedron Splatting for 3D Generation
·2346 words·12 mins
3D Vision 🏒 Fudan University
TeT-Splatting: a novel 3D representation enabling fast convergence, real-time rendering, and precise mesh extraction for high-fidelity 3D generation.
Taming Generative Diffusion Prior for Universal Blind Image Restoration
·4450 words·21 mins
AI Generated Computer Vision Image Generation 🏒 Fudan University
BIR-D tames generative diffusion models for universal blind image restoration, dynamically updating parameters to handle various complex degradations without assuming degradation model types.
TAIA: Large Language Models are Out-of-Distribution Data Learners
·2712 words·13 mins
Natural Language Processing Large Language Models 🏒 Fudan University
LLMs struggle with downstream tasks using mismatched data. TAIA, a novel inference-time method, solves this by selectively using only attention parameters during inference after training all parameter…
SpeechAlign: Aligning Speech Generation to Human Preferences
·1822 words·9 mins
Natural Language Processing Text Generation 🏒 Fudan University
SpeechAlign: Iteratively aligning speech generation models to human preferences via preference optimization, bridging distribution gaps for improved speech quality.
S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning
·2415 words·12 mins
Machine Learning Deep Learning 🏒 Fudan University
S2HPruner bridges the discretization gap in neural network pruning via a novel soft-to-hard distillation framework, achieving superior performance across various benchmarks without fine-tuning.
Penalty-based Methods for Simple Bilevel Optimization under Hölderian Error Bounds
·1969 words·10 mins
Machine Learning Optimization 🏒 Fudan University
This paper proposes penalty-based methods for simple bilevel optimization, achieving (ε, ε^β)-optimal solutions with improved complexity under Hölderian error bounds.
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
·3629 words·18 mins
Computer Vision 3D Vision 🏒 Fudan University
MVInpainter: Pose-free multi-view consistent inpainting bridges 2D and 3D editing by simplifying 3D editing to a multi-view 2D inpainting task.
MSA Generation with Seqs2Seqs Pretraining: Advancing Protein Structure Predictions
·2100 words·10 mins
Machine Learning Self-Supervised Learning 🏒 Fudan University
Self-supervised generative model MSA-Generator boosts protein structure prediction accuracy by producing high-quality MSAs, especially for challenging sequences lacking homologs.
Motion Forecasting in Continuous Driving
·1965 words·10 mins
AI Applications Autonomous Vehicles 🏒 Fudan University
RealMotion: a novel motion forecasting framework for continuous driving that outperforms existing methods by accumulating historical scene information and sequentially refining predictions, achieving …
Mixtures of Experts for Audio-Visual Learning
·2112 words·10 mins
Multimodal Learning Audio-Visual Learning 🏒 Fudan University
AVMoE: a novel parameter-efficient transfer learning approach for audio-visual learning that dynamically allocates expert models (unimodal and cross-modal adapters) based on task demands, achieving superi…
MeLLoC: Lossless Compression with High-order Mechanism Learning
·1838 words·9 mins
AI Generated AI Applications Healthcare 🏒 Fudan University
MeLLoC: Mechanism Learning for Lossless Compression, a novel approach that combines high-order mechanism learning with classical encoding, significantly improves lossless compression for scientific da…
Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models
·2500 words·12 mins
Multimodal Learning Vision-Language Models 🏒 Fudan University
Lumen: A novel LMM architecture decouples perception learning into task-agnostic and task-specific stages, enabling versatile vision-centric capabilities and surpassing existing LMM-based approaches.
Low Precision Local Training is Enough for Federated Learning
·2011 words·10 mins
Machine Learning Federated Learning 🏒 Fudan University
Low-precision local training, surprisingly, is sufficient for accurate federated learning, significantly reducing communication and computation costs.
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Control and Rendering
·2138 words·11 mins
Multimodal Learning Vision-Language Models 🏒 Fudan University
LiveScene: Language-embedded interactive radiance fields efficiently reconstruct and control complex scenes with multiple interactive objects, achieving state-of-the-art results.
Knowledge Graph Completion by Intermediate Variables Regularization
·2107 words·10 mins
AI Generated Machine Learning Deep Learning 🏒 Fudan University
Novel intermediate variables regularization boosts knowledge graph completion!
Iterative Methods via Locally Evolving Set Process
·3065 words·15 mins
AI Theory Optimization 🏒 Fudan University
This paper proposes a novel framework, the locally evolving set process, to develop faster localized iterative methods for solving large-scale graph problems, achieving significant speedup over existi…
GenRec: Unifying Video Generation and Recognition with Diffusion Models
·2342 words·11 mins
Computer Vision Video Understanding 🏒 Fudan University
GenRec: One diffusion model to rule both video generation and recognition!
FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation
·2089 words·10 mins
AI Applications Autonomous Vehicles 🏒 Fudan University
Fourier Neural Processes (FNP) revolutionizes data assimilation by enabling accurate analysis of observations with varying resolutions, improving weather forecasting and Earth system modeling.