🏢 University of Science and Technology of China

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

26 September 2024·2364 words·12 mins

Computer Vision 3D Vision 🏢 University of Science and Technology of China

EgoChoir: a novel framework that harmonizes visual appearance, head motion, and 3D objects to accurately estimate 3D human contact and object affordance from egocentric videos, surpassing existing methods.

DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering

26 September 2024·2765 words·13 mins

Computer Vision 3D Vision 🏢 University of Science and Technology of China

DN-4DGS: a denoised deformable network with temporal-spatial aggregation for real-time dynamic scene rendering, achieving state-of-the-art quality.

Differentiable Structure Learning with Partial Orders

26 September 2024·2110 words·10 mins

AI Theory Causality 🏢 University of Science and Technology of China

This research introduces a novel plug-and-play module that efficiently integrates prior partial order constraints into differentiable structure learning, significantly improving structure recovery qua…

Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle

26 September 2024·2295 words·11 mins

Question Answering 🏢 University of Science and Technology of China

DeAR: A novel framework lets LLMs solve complex problems with human-like iterative reasoning.

Customizing Language Models with Instance-wise LoRA for Sequential Recommendation

26 September 2024·1854 words·9 mins

AI Generated Natural Language Processing Large Language Models 🏢 University of Science and Technology of China

Instance-wise LoRA (iLoRA) boosts LLM sequential recommendation accuracy by customizing model parameters for each user, which mitigates negative transfer.

Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model

26 September 2024·2290 words·11 mins

AI Applications Finance 🏢 University of Science and Technology of China

CaPaint: a novel causal spatio-temporal prediction framework that uses causal reasoning and diffusion inpainting to boost model accuracy and generalizability, especially in data-scarce settings.

Breaking Long-Tailed Learning Bottlenecks: A Controllable Paradigm with Hypernetwork-Generated Diverse Experts

26 September 2024·2226 words·11 mins

Few-Shot Learning 🏢 University of Science and Technology of China

Controllable long-tailed learning is achieved via hypernetwork-generated diverse experts that adapt to user preferences and distribution shifts.

Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model

26 September 2024·1801 words·9 mins

AI Generated Natural Language Processing Recommendation Systems 🏢 University of Science and Technology of China

DDSR: a novel sequential recommendation model that uses fuzzy sets and discrete diffusion to capture the randomness of user behavior, outperforming existing methods.

Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing

26 September 2024·3128 words·15 mins

AI Generated Natural Language Processing Semi-Supervised Learning 🏢 University of Science and Technology of China

ViSu boosts semi-supervised scene text recognition by using an online generation strategy for diverse synthetic data and a novel character alignment loss to improve model generalization and robustness…

Are We on the Right Way for Evaluating Large Vision-Language Models?

26 September 2024·2514 words·12 mins

Multimodal Learning Vision-Language Models 🏢 University of Science and Technology of China

The MMStar benchmark tackles flawed LVLM evaluation by focusing on vision-critical samples, minimizing data leakage, and introducing new metrics for fair assessment of multi-modal gain.