🏢 University of Science and Technology of China
EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views
2364 words · 12 mins
Computer Vision
3D Vision
🏢 University of Science and Technology of China
EgoChoir: a novel framework that harmonizes visual appearance, head motion, and 3D objects to estimate 3D human contact and object affordance from egocentric videos, surpassing existing methods.
DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering
2765 words · 13 mins
Computer Vision
3D Vision
🏢 University of Science and Technology of China
DN-4DGS: a denoised deformable network with temporal-spatial aggregation for real-time dynamic scene rendering, achieving state-of-the-art quality.
Differentiable Structure Learning with Partial Orders
2110 words · 10 mins
AI Theory
Causality
🏢 University of Science and Technology of China
This research introduces a novel plug-and-play module that efficiently integrates prior partial order constraints into differentiable structure learning, significantly improving structure recovery qua…
Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle
2295 words · 11 mins
Question Answering
🏢 University of Science and Technology of China
DeAR: A novel framework lets LLMs solve complex problems with human-like iterative reasoning.
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation
1854 words · 9 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Science and Technology of China
Instance-wise LoRA (iLoRA) improves LLM sequential recommendation accuracy by customizing model parameters for each user, which mitigates negative transfer.
Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model
2290 words · 11 mins
AI Applications
Finance
🏢 University of Science and Technology of China
CaPaint: a novel causal spatio-temporal prediction framework that uses causal reasoning and diffusion inpainting to boost model accuracy and generalizability, especially in data-scarce settings.
Breaking Long-Tailed Learning Bottlenecks: A Controllable Paradigm with Hypernetwork-Generated Diverse Experts
2226 words · 11 mins
Few-Shot Learning
🏢 University of Science and Technology of China
Controllable long-tailed learning is achieved via hypernetwork-generated diverse experts that adapt to user preferences and distribution shifts.
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
1801 words · 9 mins
AI Generated
Natural Language Processing
Recommendation Systems
🏢 University of Science and Technology of China
DDSR: a novel sequential recommendation model that uses fuzzy sets and discrete diffusion to capture the randomness of user behavior, outperforming existing methods.
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
3128 words · 15 mins
AI Generated
Natural Language Processing
Semi-Supervised Learning
🏢 University of Science and Technology of China
ViSu boosts semi-supervised scene text recognition by using an online generation strategy for diverse synthetic data and a novel character alignment loss to improve model generalization and robustness…
Are We on the Right Way for Evaluating Large Vision-Language Models?
2514 words · 12 mins
Multimodal Learning
Vision-Language Models
🏢 University of Science and Technology of China
The MMStar benchmark tackles flawed LVLM evaluation by focusing on vision-critical samples, minimizing data leakage, and introducing new metrics for fair assessment of multi-modal gain.