🏢 Tsinghua University

Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection
·1770 words·9 mins
Machine Learning Unsupervised Learning 🏢 Tsinghua University
SCD-Tree & GBD: Unlocking interpretable rules for unsupervised anomaly detection!
Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting
·2524 words·12 mins
Machine Learning Transfer Learning 🏢 Tsinghua University
Diff-Tuning: a simple yet effective approach transfers pre-trained diffusion models to various downstream tasks by leveraging the ‘chain of forgetting’ phenomenon, improving transferability and conver…
Diffusion Models are Certifiably Robust Classifiers
·1735 words·9 mins
AI Theory Robustness 🏢 Tsinghua University
Diffusion models are certifiably robust classifiers due to their inherent O(1) Lipschitzness, a property further enhanced by generalizing to noisy data, achieving over 80% certified robustness on CIFA…
Diffusion Actor-Critic with Entropy Regulator
·2005 words·10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Tsinghua University
DACER, a novel online RL algorithm, uses diffusion models to learn complex policies and adaptively balances exploration-exploitation via entropy estimation, achieving state-of-the-art performance on M…
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
·1985 words·10 mins
Computer Vision Object Detection 🏢 Tsinghua University
DI-MaskDINO: Novel model significantly boosts object detection & instance segmentation accuracy by addressing performance imbalance using a De-Imbalance module and Balance-Aware Tokens Optimization.
Dense Connector for MLLMs
·3198 words·16 mins
Multimodal Learning Vision-Language Models 🏢 Tsinghua University
Boosting multimodal LLMs, the Dense Connector efficiently integrates multi-layer visual features for significantly enhanced performance.
Demystify Mamba in Vision: A Linear Attention Perspective
·2184 words·11 mins
Computer Vision Image Classification 🏢 Tsinghua University
Mamba in vision demystified: researchers reveal its surprising link to linear attention and improve efficiency and accuracy through targeted design enhancements.
DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching
·3798 words·18 mins
AI Applications Forecasting 🏢 Tsinghua University
DeformableTST: a new Transformer model for time series forecasting that surpasses existing methods by reducing over-reliance on patching, enhancing performance and adaptability.
DeepLag: Discovering Deep Lagrangian Dynamics for Intuitive Fluid Prediction
·3777 words·18 mins
Machine Learning Deep Learning 🏢 Tsinghua University
DeepLag improves fluid prediction by uniquely combining Lagrangian and Eulerian perspectives, tracking key particles to reveal hidden dynamics and improve prediction accuracy.
Decoding-Time Language Model Alignment with Multiple Objectives
·3392 words·16 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
Multi-objective decoding (MOD) efficiently aligns language models to diverse user needs by decoding the next token from a weighted combination of predictions from multiple base models trained on indiv…
DDN: Dual-domain Dynamic Normalization for Non-stationary Time Series Forecasting
·2680 words·13 mins
Machine Learning Deep Learning 🏢 Tsinghua University
DDN: Dual-domain Dynamic Normalization dynamically improves time series forecasting accuracy by addressing data distribution changes in both time and frequency domains via a plug-in module.
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
·1939 words·10 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
DART-Math tackles LLM weaknesses in mathematical problem-solving with Difficulty-Aware Rejection Tuning, a novel method that generates high-quality, bias-reduced datasets, resulting in supe…
DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection
·1673 words·8 mins
Multimodal Learning Audio-Visual Learning 🏢 Tsinghua University
DARNet: a dual attention network for auditory attention detection that surpasses current state-of-the-art models, especially in short decision windows, while using 91% fewer parameters.
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
·2236 words·11 mins
Computer Vision Video Understanding 🏢 Tsinghua University
COVE: Consistent high-quality video editing achieved by leveraging diffusion feature correspondence for temporal consistency.
COSMIC: Compress Satellite Image Efficiently via Diffusion Compensation
·3381 words·16 mins
Computer Vision Image Compression 🏢 Tsinghua University
COSMIC efficiently compresses satellite images via a lightweight encoder and diffusion compensation, enabling practical onboard processing and high compression ratios.
CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics
·1838 words·9 mins
AI Applications Robotics 🏢 Tsinghua University
CooHOI: A two-phase learning framework enables physically simulated characters to perform cooperative object transportation tasks naturally and efficiently, overcoming the limitations of existing meth…
Contextual Linear Optimization with Bandit Feedback
·1748 words·9 mins
AI Theory Optimization 🏢 Tsinghua University
This paper introduces induced empirical risk minimization for contextual linear optimization with bandit feedback, providing theoretical guarantees and computationally tractable solutions for improved…
Consistency Diffusion Bridge Models
·431 words·3 mins
AI Generated Computer Vision Image Generation 🏢 Tsinghua University
Consistency Diffusion Bridge Models (CDBMs) dramatically speed up diffusion bridge model sampling by learning a consistency function, achieving up to a 50x speedup with improved sample quality.
Co-occurrence is not Factual Association in Language Models
·1941 words·10 mins
Large Language Models 🏢 Tsinghua University
Language models struggle to learn facts: this study reveals that they prioritize word co-occurrence over true factual associations, and proposes new training strategies for improved factual knowledge generali…
Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
·2760 words·13 mins
AI Applications Robotics 🏢 Tsinghua University
CLOVER: A closed-loop visuomotor framework using generative visual plans & feedback mechanisms achieves state-of-the-art results in long-horizon robotic manipulation tasks.