🏢 Hong Kong University of Science and Technology

Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting

26 September 2024·3838 words·19 mins

AI Generated Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology

This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.

Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification

26 September 2024·3123 words·15 mins

AI Generated Machine Learning Few-Shot Learning 🏢 Hong Kong University of Science and Technology

Fast Graph Sharpness-Aware Minimization (FGSAM) accelerates few-shot node classification by cleverly combining GNNs and MLPs for efficient, high-performing training.

Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

26 September 2024·2478 words·12 mins

Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology

Era3D uses efficient row-wise attention for high-resolution multiview diffusion, generating high-quality multiview images from single views and overcoming prior limitations.

Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning

26 September 2024·1977 words·10 mins

AI Theory Robustness 🏢 Hong Kong University of Science and Technology

MoE-BiEntIRL: A novel explainable inverse reinforcement learning method enhances GNN robustness against diverse social media attacks by reconstructing attacker policies and generating more robust trai…

Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models

26 September 2024·3018 words·15 mins

AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology

Dual Risk Minimization (DRM) improves fine-tuned zero-shot models’ robustness by combining empirical and worst-case risk minimization, using LLMs to identify core features, achieving state-of-the-art …

Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

26 September 2024·1939 words·10 mins

Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology

DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.

DiffHammer: Rethinking the Robustness of Diffusion-Based Adversarial Purification

26 September 2024·3686 words·18 mins

AI Theory Robustness 🏢 Hong Kong University of Science and Technology

DiffHammer unveils weaknesses in diffusion-based adversarial defenses by introducing a novel attack bypassing existing evaluation limitations, leading to more robust security solutions.

DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering

26 September 2024·3655 words·18 mins

Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology

DEL learns 3D particle dynamics from 2D images via physics-informed neural rendering, exceeding existing methods’ accuracy and robustness.

Decentralized Noncooperative Games with Coupled Decision-Dependent Distributions

26 September 2024·1853 words·9 mins

Machine Learning Reinforcement Learning 🏢 Hong Kong University of Science and Technology

Decentralized noncooperative games with coupled decision-dependent distributions are analyzed, providing novel equilibrium concepts, uniqueness conditions, and a decentralized algorithm with sublinear…

D2R2: Diffusion-based Representation with Random Distance Matching for Tabular Few-shot Learning

26 September 2024·1776 words·9 mins

Machine Learning Few-Shot Learning 🏢 Hong Kong University of Science and Technology

D2R2, a novel diffusion-based model for tabular few-shot learning, achieves state-of-the-art results by leveraging semantic knowledge and distance matching.

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

26 September 2024·4007 words·19 mins

AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology

CoMat: Aligning text-to-image diffusion models using image-to-text concept matching for superior text-image alignment.

CigTime: Corrective Instruction Generation Through Inverse Motion Editing

26 September 2024·2228 words·11 mins

Natural Language Processing Vision-Language Models 🏢 Hong Kong University of Science and Technology

CigTime generates corrective motion instructions from motion pairs using motion editing and large language models. This innovative approach improves upon baselines by leveraging motion triplets for f…

ChatCam: Empowering Camera Control through Conversational AI

26 September 2024·1805 words·9 mins

Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology

ChatCam empowers users to control cameras via natural language, using CineGPT for text-conditioned trajectory generation and an Anchor Determinator for precise placement, enabling high-quality video r…

Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding

26 September 2024·2178 words·11 mins

Computer Vision Image Segmentation 🏢 Hong Kong University of Science and Technology

GPTrack: A novel unsupervised framework enhances cardiac motion tracking by using sequential Gaussian processes and bidirectional recurrence, improving accuracy and efficiency.

Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective

26 September 2024·3387 words·16 mins

AI Generated Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology

Attraos, a novel long-term time series forecasting model leveraging chaos theory, significantly outperforms existing methods by utilizing attractor dynamics for efficient and accurate prediction.

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

26 September 2024·2583 words·13 mins

AI Generated Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology

ANAH-v2 tackles LLM hallucination by introducing a self-training framework that iteratively scales annotation datasets and improves annotator accuracy, achieving state-of-the-art results.

Adaptive Passive-Aggressive Framework for Online Regression with Side Information

26 September 2024·2153 words·11 mins

Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology

Adaptive Passive-Aggressive framework with Side information (APAS) significantly boosts online regression accuracy by dynamically adjusting thresholds and integrating side information, leading to supe…

Adaptive Domain Learning for Cross-domain Image Denoising

26 September 2024·2302 words·11 mins

Computer Vision Image Denoising 🏢 Hong Kong University of Science and Technology

Adaptive Domain Learning (ADL) efficiently trains a cross-domain RAW image denoising model using limited target data and existing source data by intelligently discarding harmful source data and levera…

Achieving $\tilde{O}(1/\epsilon)$ Sample Complexity for Constrained Markov Decision Process

26 September 2024·390 words·2 mins

AI Generated Machine Learning Reinforcement Learning 🏢 Hong Kong University of Science and Technology

Constrained Markov Decision Processes (CMDPs) get an improved sample complexity bound of Õ(1/ε) via a new algorithm, surpassing the existing O(1/ε²) bound.

Bifröst: 3D-Aware Image Compositing with Language Instructions

26 September 2024·3407 words·16 mins

Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology

Bifröst: A novel 3D-aware framework for instruction-based image compositing, leveraging depth maps and an MLLM for high-fidelity results.