🏒 University of Hong Kong

Zero-shot Image Editing with Reference Imitation
·2284 words·11 mins
Computer Vision Image Generation 🏒 University of Hong Kong
MimicBrush: a novel image editing approach using reference imitation for intuitive zero-shot edits.
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
·3275 words·16 mins
Multimodal Learning Vision-Language Models 🏒 University of Hong Kong
CLIP’s robustness to long-tailed pre-training data stems from its dynamic classification task and descriptive language supervision, offering transferable insights for improving model generalizability.
Task-oriented Time Series Imputation Evaluation via Generalized Representers
·3611 words·17 mins
AI Generated AI Applications Healthcare 🏒 University of Hong Kong
A novel approach to task-oriented time series imputation efficiently assesses imputation strategies based on downstream task performance, without costly model retraining.
SyncVIS: Synchronized Video Instance Segmentation
·2160 words·11 mins
Computer Vision Video Understanding 🏒 University of Hong Kong
SyncVIS: A new framework for video instance segmentation achieves state-of-the-art results by synchronously modeling video-level and frame-level information, overcoming the limitations of asynchronous approaches.
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
·5133 words·25 mins
Large Language Models 🏒 University of Hong Kong
Stacking Your Transformers accelerates LLM pre-training by leveraging smaller, pre-trained models to efficiently train larger ones, achieving significant speedups and improved performance.
Splatter a Video: Video Gaussian Representation for Versatile Processing
·2610 words·13 mins
Computer Vision Video Understanding 🏒 University of Hong Kong
Researchers introduce Video Gaussian Representation (VGR) for versatile video processing, embedding videos into explicit 3D Gaussians for intuitive motion and appearance modeling.
Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs
·2646 words·13 mins
Reinforcement Learning 🏒 University of Hong Kong
ArithTreeRL, a novel reinforcement learning approach, generates optimized arithmetic tree structures for adders and multipliers, significantly improving computational efficiency and reducing hardware cost.
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
·3263 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏒 University of Hong Kong
MAGIS: A novel LLM-based multi-agent framework significantly boosts GitHub issue resolution by leveraging agent collaboration for planning and coding, achieving an eight-fold performance increase.
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
·1673 words·8 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
KptLLM: A novel multimodal model leverages LLMs for superior keypoint comprehension, outperforming existing methods across multiple benchmarks.
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
·2236 words·11 mins
AI Generated Machine Learning Few-Shot Learning 🏒 University of Hong Kong
Multi-head transformers utilize distinct attention patterns across layers: multiple heads are essential for initial data preprocessing, while a single head suffices for subsequent optimization steps.
HHD-GP: Incorporating Helmholtz-Hodge Decomposition into Gaussian Processes for Learning Dynamical Systems
·1903 words·9 mins
Machine Learning Deep Learning 🏒 University of Hong Kong
HHD-GP leverages Helmholtz-Hodge decomposition within Gaussian Processes to learn physically meaningful components of dynamical systems, enhancing prediction accuracy and interpretability.
EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization
·2876 words·14 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
EffiLearner: A novel self-optimization framework dramatically improves the efficiency of LLM-generated code by iteratively refining it based on execution profiles.
AutoPSV: Automated Process-Supervised Verifier
·2548 words·12 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
AutoPSV automates process annotation for LLMs, improving reasoning by detecting confidence shifts across reasoning steps and thereby efficiently enhancing model performance.