🏢 Tsinghua University
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
·2946 words·14 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
GaussianCube is a structured, explicit radiance representation for 3D generative modeling that achieves state-of-the-art results using significantly fewer parameters.
Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images
·2277 words·11 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
Gaussian Graph Network (GGN) revolutionizes novel view synthesis by efficiently generating generalizable Gaussian representations from multi-view images, achieving superior rendering quality with fewe…
Functionally Constrained Algorithm Solves Convex Simple Bilevel Problem
·310 words·2 mins·
AI Theory
Optimization
🏢 Tsinghua University
Convex simple bilevel problems are solved by reformulating them as functionally constrained problems, achieving near-optimal convergence rates.
Full-Distance Evasion of Pedestrian Detectors in the Physical World
·2691 words·13 mins·
Computer Vision
Object Detection
🏢 Tsinghua University
Researchers developed Full Distance Attack (FDA) to generate adversarial patterns effective against pedestrian detectors across all distances, resolving the appearance gap issue between simulated and …
Full-Atom Peptide Design with Geometric Latent Diffusion
·2511 words·12 mins·
Machine Learning
Deep Learning
🏢 Tsinghua University
PepGLAD, a novel generative model, revolutionizes full-atom peptide design by leveraging geometric latent diffusion to significantly enhance peptide diversity and binding affinity.
From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models
·3334 words·16 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
Diffusion models, while excelling in image generation, are vulnerable to data poisoning. This paper demonstrates a BadNets-like attack’s effectiveness against diffusion models, causing image misalign…
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
·1980 words·10 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
FlowTurbo: Blazing-fast, high-quality flow-based image generation via a velocity refiner!
FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation
·3141 words·15 mins·
Machine Learning
Federated Learning
🏢 Tsinghua University
FedLPA: One-shot federated learning with layer-wise posterior aggregation improves model accuracy in non-IID data by efficiently aggregating layer-wise posteriors of local models using a novel approac…
Exploring Adversarial Robustness of Deep State Space Models
·1844 words·9 mins·
AI Theory
Robustness
🏢 Tsinghua University
Deep state space models (SSMs) gain adversarial robustness through an adaptive scaling mechanism, improving performance without overfitting issues.
Expanding Sparse Tuning for Low Memory Usage
·2517 words·12 mins·
Computer Vision
Transfer Learning
🏢 Tsinghua University
SNELL: Sparse tuning with kerNElized LoRA achieves state-of-the-art parameter-efficient fine-tuning performance with drastically reduced memory usage.
Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor
·2050 words·10 mins·
Multimodal Learning
Vision-Language Models
🏢 Tsinghua University
Researchers introduce object-aware backdoors in Vision-and-Language Navigation, enabling malicious behavior upon encountering specific objects, demonstrating the vulnerability of real-world AI agents.
Event-3DGS: Event-based 3D Reconstruction Using 3D Gaussian Splatting
·2242 words·11 mins·
AI Generated
Computer Vision
3D Vision
🏢 Tsinghua University
Event-3DGS: First event-based 3D reconstruction using 3D Gaussian splatting, enabling high-quality, efficient, and robust 3D scene reconstruction in challenging real-world conditions.
Enhancing Protein Mutation Effect Prediction through a Retrieval-Augmented Framework
·1980 words·10 mins·
Machine Learning
Deep Learning
🏢 Tsinghua University
Revolutionizing protein mutation effect prediction, this work introduces a retrieval-augmented framework achieving state-of-the-art accuracy by efficiently incorporating similar local structure inform…
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
·2466 words·12 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
EfficientNAT: a novel approach to token-based image synthesis boosts performance and slashes computational costs by cleverly disentangling and optimizing spatial-temporal interactions between image to…
Elucidating the Design Space of Dataset Condensation
·4063 words·20 mins·
Computer Vision
Image Classification
🏢 Tsinghua University
Elucidating Dataset Condensation (EDC) achieves state-of-the-art accuracy in dataset condensation by implementing soft category-aware matching and a smoothing learning rate schedule, improving model t…
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
·4529 words·22 mins·
Large Language Models
🏢 Tsinghua University
DuQuant: Dual transformations distribute outliers for stronger quantized LLMs.
Dual Critic Reinforcement Learning under Partial Observability
·2549 words·12 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Tsinghua University
DCRL, a Dual Critic Reinforcement Learning framework, effectively mitigates high variance in reinforcement learning under partial observability by synergistically combining an oracle critic (with full…
Doubly Mild Generalization for Offline Reinforcement Learning
·2279 words·11 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Tsinghua University
Doubly Mild Generalization (DMG) improves offline reinforcement learning by selectively leveraging generalization beyond training data, achieving state-of-the-art results.
DiTFastAttn: Attention Compression for Diffusion Transformer Models
·2788 words·14 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
DiTFastAttn: A post-training compression method drastically speeds up diffusion transformer models by cleverly reducing redundancy in attention calculations, leading to up to a 1.8x speedup at high re…
Distribution-Aware Data Expansion with Diffusion Models
·3351 words·16 mins·
AI Generated
Computer Vision
Image Classification
🏢 Tsinghua University
DistDiff, a training-free data expansion framework, leverages distribution-aware diffusion models to generate high-fidelity, diverse samples that enhance downstream model performance.