🏢 Tsinghua University

Skinned Motion Retargeting with Dense Geometric Interaction Perception
·2892 words·14 mins
AI Applications Gaming 🏢 Tsinghua University
MeshRet: A novel retargeting framework that uses dense geometric interaction modeling for realistic, artifact-free skinned character animation.
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling
·2221 words·11 mins
Computer Vision Image Generation 🏢 Tsinghua University
ShowMaker: Generating high-fidelity 2D human conversational videos using fine-grained diffusion modeling and 2D key points.
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
·2121 words·10 mins
Multimodal Learning Embodied AI 🏢 Tsinghua University
SG-Nav achieves state-of-the-art zero-shot object navigation by leveraging a novel 3D scene graph to provide rich context for LLM-based reasoning.
Semi-Open 3D Object Retrieval via Hierarchical Equilibrium on Hypergraph
·2346 words·12 mins
AI Generated Computer Vision 3D Vision 🏢 Tsinghua University
HERT: a novel framework for semi-open 3D object retrieval using hierarchical hypergraph equilibrium, achieving state-of-the-art performance on four new benchmark datasets.
Scaling Law for Time Series Forecasting
·2211 words·11 mins
Machine Learning Deep Learning 🏢 Tsinghua University
Unlocking the potential of deep learning for time series forecasting: this study reveals a scaling law influenced by dataset size, model complexity, and the crucial look-back horizon, leading to impro…
Scalable Bayesian Optimization via Focalized Sparse Gaussian Processes
·2563 words·13 mins
AI Generated Machine Learning Optimization 🏢 Tsinghua University
FocalBO, a hierarchical Bayesian optimization algorithm using focalized sparse Gaussian processes, efficiently tackles high-dimensional problems with massive datasets, achieving state-of-the-art perfo…
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
·2718 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
S-STE achieves efficient 2:4 sparse pre-training by introducing a novel continuous pruning function, overcoming the limitations of previous methods and leading to improved accuracy and speed.
RoPINN: Region Optimized Physics-Informed Neural Networks
·2557 words·13 mins
AI Theory Optimization 🏢 Tsinghua University
RoPINN: Revolutionizing Physics-Informed Neural Networks with Region Optimization
Revisiting motion information for RGB-Event tracking with MOT philosophy
·2713 words·13 mins
AI Generated Computer Vision Object Detection 🏢 Tsinghua University
RGB-Event tracker CSAM leverages MOT philosophy for enhanced robustness by integrating appearance and motion information from RGB and event streams, achieving state-of-the-art performance.
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
·2799 words·14 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
ReST-MCTS*: A novel LLM self-training method using process reward guided tree search, outperforming existing methods by generating higher-quality reasoning traces for improved model accuracy.
Reprogramming Pretrained Target-Specific Diffusion Models for Dual-Target Drug Design
·1732 words·9 mins
AI Applications Healthcare 🏢 Tsinghua University
Reprogramming pretrained target-specific diffusion models enables dual-target drug design with zero-shot transfer, outperforming existing methods.
ReFIR: Grounding Large Restoration Models with Retrieval Augmentation
·3091 words·15 mins
Computer Vision Image Generation 🏢 Tsinghua University
ReFIR enhances Large Restoration Models’ accuracy by incorporating retrieved images as external knowledge, mitigating hallucination without retraining.
Recovering Complete Actions for Cross-dataset Skeleton Action Recognition
·2959 words·14 mins
Computer Vision Action Recognition 🏢 Tsinghua University
Boost skeleton action recognition accuracy across datasets by recovering complete actions and resampling; outperforms existing methods.
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
·2585 words·13 mins
Computer Vision Image Generation 🏢 Tsinghua University
RealCompo: A novel training-free framework dynamically balances realism and compositionality in text-to-image generation, achieving state-of-the-art results.
Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network
·2135 words·11 mins
Image Generation 🏢 Tsinghua University
CORUN-Colabator: a novel cooperative unfolding network and coherence-based label generator achieve state-of-the-art real-world image dehazing by effectively integrating physical knowledge and generat…
Q-VLM: Post-training Quantization for Large Vision-Language Models
·2070 words·10 mins
Multimodal Learning Vision-Language Models 🏢 Tsinghua University
Q-VLM: A novel post-training quantization framework significantly compresses large vision-language models, boosting inference speed without sacrificing accuracy.
Prediction with Action: Visual Policy Learning via Joint Denoising Process
·2466 words·12 mins
AI Applications Robotics 🏢 Tsinghua University
PAD, a novel visual policy learning framework, unifies image prediction and robot action in a joint denoising process, achieving significant performance improvements in robotic manipulation tasks.
PhyRecon: Physically Plausible Neural Scene Reconstruction
·2451 words·12 mins
Computer Vision 3D Vision 🏢 Tsinghua University
PhyRecon: A novel neural scene reconstruction method that uses differentiable rendering and physics simulation to produce physically plausible 3D models.
PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning
·3037 words·15 mins
Machine Learning Reinforcement Learning 🏢 Tsinghua University
PEAC: a novel unsupervised pre-training method significantly improves cross-embodiment generalization in reinforcement learning, enabling faster adaptation to diverse robots and tasks.
Parameter-Inverted Image Pyramid Networks
·2381 words·12 mins
Object Detection 🏢 Tsinghua University
Parameter-Inverted Image Pyramid Networks (PIIP) boost image pyramid efficiency by using smaller models for higher-resolution images and larger models for lower-resolution ones, achieving superior per…