🏒 University of Hong Kong

Zero-shot Image Editing with Reference Imitation
·2284 words·11 mins
Computer Vision Image Generation 🏒 University of Hong Kong
MimicBrush: a novel image editing approach using reference imitation for intuitive zero-shot edits.
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
·3275 words·16 mins
Multimodal Learning Vision-Language Models 🏒 University of Hong Kong
CLIP’s robustness to long-tailed pre-training data stems from its dynamic classification task and descriptive language supervision, offering transferable insights for improving model generalizability.
Task-oriented Time Series Imputation Evaluation via Generalized Representers
·3611 words·17 mins
AI Generated AI Applications Healthcare 🏒 University of Hong Kong
A novel approach to task-oriented time series imputation efficiently assesses imputation strategies based on downstream task performance, without costly model retraining.
SyncVIS: Synchronized Video Instance Segmentation
·2160 words·11 mins
Computer Vision Video Understanding 🏒 University of Hong Kong
SyncVIS: A new framework for video instance segmentation achieves state-of-the-art results by synchronously modeling video-level and frame-level information, overcoming the limitations of asynchronous approaches.
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
·5133 words·25 mins
Large Language Models 🏒 University of Hong Kong
Stacking Your Transformers accelerates LLM pre-training by leveraging smaller, pre-trained models to efficiently train larger ones, achieving significant speedups and improved performance.
Splatter a Video: Video Gaussian Representation for Versatile Processing
·2610 words·13 mins
Computer Vision Video Understanding 🏒 University of Hong Kong
Researchers introduce Video Gaussian Representation (VGR) for versatile video processing, embedding videos into explicit 3D Gaussians for intuitive motion and appearance modeling.
Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs
·2646 words·13 mins
Reinforcement Learning 🏒 University of Hong Kong
ArithTreeRL, a novel reinforcement learning approach, generates optimized arithmetic tree structures for adders and multipliers, significantly improving computational efficiency and reducing hardware cost.
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
·3263 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏒 University of Hong Kong
MAGIS: A novel LLM-based multi-agent framework significantly boosts GitHub issue resolution by leveraging agent collaboration for planning and coding, achieving an eight-fold performance increase.
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
·1673 words·8 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
KptLLM: A novel multimodal model leverages LLMs for superior keypoint comprehension, outperforming existing methods across multiple benchmarks.
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
·2236 words·11 mins
AI Generated Machine Learning Few-Shot Learning 🏒 University of Hong Kong
Multi-head transformers utilize distinct attention patterns across layers: multiple heads are essential for initial data preprocessing, while a single head suffices for subsequent optimization steps.
HHD-GP: Incorporating Helmholtz-Hodge Decomposition into Gaussian Processes for Learning Dynamical Systems
·1903 words·9 mins
Machine Learning Deep Learning 🏒 University of Hong Kong
HHD-GP leverages Helmholtz-Hodge decomposition within Gaussian Processes to learn physically meaningful components of dynamical systems, enhancing prediction accuracy and interpretability.
EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization
·2876 words·14 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
EffiLearner: A novel self-optimization framework dramatically improves the efficiency of LLM-generated code by iteratively refining it based on execution profiles.
AutoPSV: Automated Process-Supervised Verifier
·2548 words·12 mins
Natural Language Processing Large Language Models 🏒 University of Hong Kong
AutoPSV automates process annotation for LLMs, improving reasoning by detecting confidence shifts across reasoning steps and thereby efficiently enhancing model performance.