Hong Kong University of Science and Technology
Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting
3838 words · 19 mins
AI Generated
Computer Vision
Image Generation
Hong Kong University of Science and Technology
This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.
Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification
3123 words · 15 mins
AI Generated
Machine Learning
Few-Shot Learning
Hong Kong University of Science and Technology
Fast Graph Sharpness-Aware Minimization (FGSAM) accelerates few-shot node classification by combining GNNs and MLPs for efficient, high-performing training.
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
2478 words · 12 mins
Computer Vision
3D Vision
Hong Kong University of Science and Technology
Era3D uses efficient row-wise attention for high-resolution multiview diffusion, generating high-quality multiview images from single views and overcoming prior limitations.
Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning
1977 words · 10 mins
AI Theory
Robustness
Hong Kong University of Science and Technology
MoE-BiEntIRL, a novel explainable inverse reinforcement learning method, enhances GNN robustness against diverse social media attacks by reconstructing attacker policies and generating more robust trai…
Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
3018 words · 15 mins
AI Generated
Multimodal Learning
Vision-Language Models
Hong Kong University of Science and Technology
Dual Risk Minimization (DRM) improves fine-tuned zero-shot models’ robustness by combining empirical and worst-case risk minimization, using LLMs to identify core features, achieving state-of-the-art …
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
1939 words · 10 mins
Natural Language Processing
Large Language Models
Hong Kong University of Science and Technology
DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.
DiffHammer: Rethinking the Robustness of Diffusion-Based Adversarial Purification
3686 words · 18 mins
AI Theory
Robustness
Hong Kong University of Science and Technology
DiffHammer unveils weaknesses in diffusion-based adversarial defenses by introducing a novel attack bypassing existing evaluation limitations, leading to more robust security solutions.
DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering
3655 words · 18 mins
Computer Vision
3D Vision
Hong Kong University of Science and Technology
DEL learns 3D particle dynamics from 2D images via physics-informed neural rendering, exceeding existing methods’ accuracy and robustness.
Decentralized Noncooperative Games with Coupled Decision-Dependent Distributions
1853 words · 9 mins
Machine Learning
Reinforcement Learning
Hong Kong University of Science and Technology
Decentralized noncooperative games with coupled decision-dependent distributions are analyzed, providing novel equilibrium concepts, uniqueness conditions, and a decentralized algorithm with sublinear…
D2R2: Diffusion-based Representation with Random Distance Matching for Tabular Few-shot Learning
1776 words · 9 mins
Machine Learning
Few-Shot Learning
Hong Kong University of Science and Technology
D2R2, a novel diffusion-based model for tabular few-shot learning, achieves state-of-the-art results by leveraging semantic knowledge and distance matching.
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
4007 words · 19 mins
AI Generated
Multimodal Learning
Vision-Language Models
Hong Kong University of Science and Technology
CoMat: Aligning text-to-image diffusion models using image-to-text concept matching for superior text-image alignment.
CigTime: Corrective Instruction Generation Through Inverse Motion Editing
2228 words · 11 mins
Natural Language Processing
Vision-Language Models
Hong Kong University of Science and Technology
CigTime generates corrective motion instructions from motion pairs using motion editing and large language models. This innovative approach improves upon baselines by leveraging motion triplets for f…
ChatCam: Empowering Camera Control through Conversational AI
1805 words · 9 mins
Multimodal Learning
Vision-Language Models
Hong Kong University of Science and Technology
ChatCam empowers users to control cameras via natural language, using CineGPT for text-conditioned trajectory generation and an Anchor Determinator for precise placement, enabling high-quality video r…
Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding
2178 words · 11 mins
Computer Vision
Image Segmentation
Hong Kong University of Science and Technology
GPTrack, a novel unsupervised framework, enhances cardiac motion tracking by using sequential Gaussian processes and bidirectional recurrence, improving accuracy and efficiency.
Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective
3387 words · 16 mins
AI Generated
Machine Learning
Deep Learning
Hong Kong University of Science and Technology
Attraos, a novel long-term time series forecasting model grounded in chaos theory, significantly outperforms existing methods by utilizing attractor dynamics for efficient and accurate prediction.
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
2583 words · 13 mins
AI Generated
Natural Language Processing
Large Language Models
Hong Kong University of Science and Technology
ANAH-v2 tackles LLM hallucination by introducing a self-training framework that iteratively scales annotation datasets and improves annotator accuracy, achieving state-of-the-art results.
Adaptive Passive-Aggressive Framework for Online Regression with Side Information
2153 words · 11 mins
Machine Learning
Deep Learning
Hong Kong University of Science and Technology
Adaptive Passive-Aggressive framework with Side information (APAS) significantly boosts online regression accuracy by dynamically adjusting thresholds and integrating side information, leading to supe…
Adaptive Domain Learning for Cross-domain Image Denoising
2302 words · 11 mins
Computer Vision
Image Denoising
Hong Kong University of Science and Technology
Adaptive Domain Learning (ADL) efficiently trains a cross-domain RAW image denoising model using limited target data and existing source data by intelligently discarding harmful source data and levera…
Achieving $\tilde{O}(1/\epsilon)$ Sample Complexity for Constrained Markov Decision Process
390 words · 2 mins
AI Generated
Machine Learning
Reinforcement Learning
Hong Kong University of Science and Technology
Constrained Markov Decision Processes (CMDPs) get an improved sample complexity bound of Õ(1/ε) via a new algorithm, surpassing the existing O(1/ε²) bound.
Bifröst: 3D-Aware Image Compositing with Language Instructions
3407 words · 16 mins
Multimodal Learning
Vision-Language Models
Hong Kong University of Science and Technology
Bifröst: A novel 3D-aware framework for instruction-based image compositing, leveraging depth maps and an MLLM for high-fidelity results.