🏢 Tsinghua University

Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts
·3390 words·16 mins
Computer Vision Image Restoration 🏢 Tsinghua University
AdaptIR: A novel parameter-efficient method for generalized image restoration using a heterogeneous Mixture-of-Experts (MoE) architecture that achieves superior performance and generalization.
Online Control with Adversarial Disturbance for Continuous-time Linear Systems
·1592 words·8 mins
Machine Learning Reinforcement Learning 🏢 Tsinghua University
This paper presents a novel two-level online control algorithm that learns to control continuous-time linear systems under adversarial disturbances while achieving sublinear regret.
One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection
·2196 words·11 mins
Computer Vision 3D Vision 🏢 Tsinghua University
OneDet3D: A universal 3D object detector trained jointly on diverse indoor/outdoor datasets, achieving one-for-all performance across domains and categories.
On the Saturation Effects of Spectral Algorithms in Large Dimensions
·1464 words·7 mins
AI Theory Generalization 🏢 Tsinghua University
High-dimensional spectral algorithms exhibit saturation effects: Kernel Ridge Regression underperforms optimal algorithms such as gradient flow when the regression function is very smooth.
On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory
·1555 words·8 mins
AI Theory Generalization 🏢 Tsinghua University
Standard initialization in neural networks negatively impacts generalization ability under Neural Tangent Kernel theory, contradicting real-world performance, urging the development of improved theore…
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
·2487 words·12 mins
Machine Learning Reinforcement Learning 🏢 Tsinghua University
Offline RL agents often fail in real-world scenarios due to unseen test states. SCAS, a novel method, simultaneously corrects OOD states to high-value, in-distribution states and suppresses risky OOD …
ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models
·4424 words·21 mins
Computer Vision Object Detection 🏢 Tsinghua University
ODGEN: Boosting object detection accuracy by generating high-quality synthetic images using diffusion models conditioned on bounding boxes and text descriptions.
Not All Tokens Are What You Need for Pretraining
·2178 words·11 mins
Large Language Models 🏢 Tsinghua University
RHO-1, a novel language model, uses selective pretraining focusing on high-value tokens, achieving state-of-the-art results with significantly less data than existing models.
Noise Contrastive Alignment of Language Models with Explicit Rewards
·2166 words·11 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
This paper introduces InfoNCA and NCA, novel frameworks for language model alignment using noise contrastive estimation, enabling direct optimization from both explicit rewards and pairwise preference…
Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set
·2791 words·14 mins
Computer Vision 3D Vision 🏢 Tsinghua University
Neural SDF inference is improved by dynamically aligning 3D Gaussians to a neural SDF's zero-level set, enabling accurate, smooth 3D surface reconstruction.
Neural Residual Diffusion Models for Deep Scalable Vision Generation
·1912 words·9 mins
AI Generated Computer Vision Image Generation 🏢 Tsinghua University
Neural-RDM: A novel framework for deep, scalable vision generation using residual diffusion models, achieving state-of-the-art results on image and video benchmarks.
Neural Collapse Inspired Feature Alignment for Out-of-Distribution Generalization
·1839 words·9 mins
Machine Learning Deep Learning 🏢 Tsinghua University
Neural Collapse-inspired Feature Alignment (NCFAL) significantly boosts out-of-distribution generalization by aligning semantic features to a simplex ETF, even without environment labels.
MutaPLM: Protein Language Modeling for Mutation Explanation and Engineering
·2665 words·13 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
MutaPLM: a novel protein language model, provides human-understandable mutation explanations and designs novel mutations with desirable properties using a unique protein delta network and chain-of-tho…
MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step
·3626 words·18 mins
Computer Vision 3D Vision 🏢 Tsinghua University
MultiPull: a novel method that reconstructs detailed 3D surfaces from raw point clouds via multi-step optimization of multi-level features, significantly improving accuracy and detail.
Multi-scale Consistency for Robust 3D Registration via Hierarchical Sinkhorn Tree
·2306 words·11 mins
Computer Vision 3D Vision 🏢 Tsinghua University
Hierarchical Sinkhorn Tree (HST) robustly retrieves accurate 3D point cloud correspondences using multi-scale consistency, outperforming state-of-the-art methods.
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
·2539 words·12 mins
AI Generated Computer Vision Image Classification 🏢 Tsinghua University
MSPE empowers Vision Transformers to handle any image resolution by cleverly optimizing patch embedding, achieving superior performance on low-resolution images and comparable results on high-resoluti…
MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training
·3685 words·18 mins
AI Generated Machine Learning Deep Learning 🏢 Tsinghua University
MSAGPT: Revolutionizing protein structure prediction by generating accurate virtual MSAs from limited data, boosting prediction accuracy by up to +8.5% TM-Score!
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
·3226 words·16 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
Mesa-Extrapolation enhances LLM extrapolation using a novel weave position encoding method, boosting performance while significantly reducing memory and inference time.
Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
·2645 words·13 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
SPV-MIA, a novel membership inference attack, significantly improves the accuracy of identifying training data in fine-tuned LLMs by using self-prompt calibration and probabilistic variation.
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
·2671 words·13 mins
AI Generated Multimodal Learning Human-AI Interaction 🏢 Tsinghua University
MambaTalk: Efficient holistic gesture synthesis using selective state space models to overcome computational complexity and improve gesture quality.