
๐Ÿข Hong Kong University of Science and Technology

Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting
·3838 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology
This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.
Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification
·3123 words·15 mins
AI Generated Machine Learning Few-Shot Learning 🏢 Hong Kong University of Science and Technology
Fast Graph Sharpness-Aware Minimization (FGSAM) accelerates few-shot node classification by cleverly combining GNNs and MLPs for efficient, high-performing training.
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
·2478 words·12 mins
Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology
Era3D uses efficient row-wise attention in a high-resolution multiview diffusion model to generate high-quality multiview images from single views, overcoming prior limitations.
Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning
·1977 words·10 mins
AI Theory Robustness 🏢 Hong Kong University of Science and Technology
MoE-BiEntIRL: A novel explainable inverse reinforcement learning method enhances GNN robustness against diverse social media attacks by reconstructing attacker policies and generating more robust trai…
Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
·3018 words·15 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology
Dual Risk Minimization (DRM) improves fine-tuned zero-shot models' robustness by combining empirical and worst-case risk minimization, using LLMs to identify core features, achieving state-of-the-art …
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
·1939 words·10 mins
Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.
DiffHammer: Rethinking the Robustness of Diffusion-Based Adversarial Purification
·3686 words·18 mins
AI Theory Robustness 🏢 Hong Kong University of Science and Technology
DiffHammer unveils weaknesses in diffusion-based adversarial defenses by introducing a novel attack bypassing existing evaluation limitations, leading to more robust security solutions.
DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering
·3655 words·18 mins
Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology
DEL: Learns 3D particle dynamics from 2D images via physics-informed neural rendering, exceeding existing methods' accuracy and robustness.
Decentralized Noncooperative Games with Coupled Decision-Dependent Distributions
·1853 words·9 mins
Machine Learning Reinforcement Learning 🏢 Hong Kong University of Science and Technology
Decentralized noncooperative games with coupled decision-dependent distributions are analyzed, providing novel equilibrium concepts, uniqueness conditions, and a decentralized algorithm with sublinear…
D2R2: Diffusion-based Representation with Random Distance Matching for Tabular Few-shot Learning
·1776 words·9 mins
Machine Learning Few-Shot Learning 🏢 Hong Kong University of Science and Technology
D2R2: A novel diffusion-based model for tabular few-shot learning, achieves state-of-the-art results by leveraging semantic knowledge and distance matching.
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
·4007 words·19 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology
CoMat: Aligning text-to-image diffusion models using image-to-text concept matching for superior text-image alignment.
CigTime: Corrective Instruction Generation Through Inverse Motion Editing
·2228 words·11 mins
Natural Language Processing Vision-Language Models 🏢 Hong Kong University of Science and Technology
CigTime generates corrective motion instructions from motion pairs using motion editing and large language models. This innovative approach improves upon baselines by leveraging motion triplets for f…
ChatCam: Empowering Camera Control through Conversational AI
·1805 words·9 mins
Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology
ChatCam empowers users to control cameras via natural language, using CineGPT for text-conditioned trajectory generation and an Anchor Determinator for precise placement, enabling high-quality video r…
Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding
·2178 words·11 mins
Computer Vision Image Segmentation 🏢 Hong Kong University of Science and Technology
GPTrack: A novel unsupervised framework enhances cardiac motion tracking by using sequential Gaussian processes and bidirectional recurrence, improving accuracy and efficiency.
Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective
·3387 words·16 mins
AI Generated Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology
Attraos: a novel long-term time series forecasting model leveraging chaos theory, significantly outperforms existing methods by utilizing attractor dynamics for efficient and accurate prediction.
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
·2583 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
ANAH-v2 tackles LLM hallucination by introducing a self-training framework that iteratively scales annotation datasets and improves annotator accuracy, achieving state-of-the-art results.
Adaptive Passive-Aggressive Framework for Online Regression with Side Information
·2153 words·11 mins
Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology
Adaptive Passive-Aggressive framework with Side information (APAS) significantly boosts online regression accuracy by dynamically adjusting thresholds and integrating side information, leading to supe…
Adaptive Domain Learning for Cross-domain Image Denoising
·2302 words·11 mins
Computer Vision Image Denoising 🏢 Hong Kong University of Science and Technology
Adaptive Domain Learning (ADL) efficiently trains a cross-domain RAW image denoising model using limited target data and existing source data by intelligently discarding harmful source data and levera…
Achieving $\tilde{O}(1/\epsilon)$ Sample Complexity for Constrained Markov Decision Process
·390 words·2 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Hong Kong University of Science and Technology
Constrained Markov Decision Processes (CMDPs) get an improved sample complexity bound of Õ(1/ε) via a new algorithm, surpassing the existing O(1/ε²) bound.
$\textit{Bifr\"ost}$: 3D-Aware Image Compositing with Language Instructions
·3407 words·16 mins
Multimodal Learning Vision-Language Models 🏢 Hong Kong University of Science and Technology
Bifröst: A novel 3D-aware framework for instruction-based image compositing, leveraging depth maps and an MLLM for high-fidelity results.