🏢 Tencent AI Lab
Weight Diffusion for Future: Learn to Generalize in Non-Stationary Environments
·2419 words·12 mins·
Machine Learning
Deep Learning
🏢 Tencent AI Lab
Weight Diffusion (W-Diff) masters evolving domain generalization by using conditional diffusion models to learn classifier weight evolution patterns, enabling superior generalization to unseen future …
Visual Perception by Large Language Model's Weights
·2070 words·10 mins·
Multimodal Learning
Vision-Language Models
🏢 Tencent AI Lab
VLoRA boosts multimodal LLM efficiency by merging visual features into model weights instead of extending input sequences.
VFIMamba: Video Frame Interpolation with State Space Models
·2179 words·11 mins·
Computer Vision
Video Understanding
🏢 Tencent AI Lab
VFIMamba uses state-space models for efficient and dynamic video frame interpolation, achieving state-of-the-art results by introducing a novel Mixed-SSM Block and curriculum learning.
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
·2046 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
AlphaLLM boosts LLM performance on complex reasoning tasks by combining imagination, search, and criticism into a self-improving loop, eliminating the need for extra training data.
The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection
·2465 words·12 mins·
Machine Learning
Deep Learning
🏢 Tencent AI Lab
Researchers found that superior OOD detection performance comes at the cost of reduced generalization. Their novel Decoupled Uncertainty Learning (DUL) algorithm harmonizes OOD detection and generali…
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
·4275 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
StrategyLLM uses four LLM agents to generate consistent, generalizable few-shot prompts, significantly improving LLM problem-solving performance across various tasks.
Self-playing Adversarial Language Game Enhances LLM Reasoning
·2197 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
Self-play adversarial language game boosts LLM reasoning!
SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models
·2317 words·11 mins·
Machine Learning
Continual Learning
🏢 Tencent AI Lab
SAFE, a novel parameter-efficient tuning framework, boosts pre-trained model performance in continual learning by balancing model stability and plasticity through slow and fast learning stages, signif…
RobIR: Robust Inverse Rendering for High-Illumination Scenes
·2339 words·11 mins·
Computer Vision
3D Vision
🏢 Tencent AI Lab
RobIR: Robust inverse rendering in high-illumination scenes using ACES tone mapping and regularized visibility estimation for accurate BRDF reconstruction.
RLE: A Unified Perspective of Data Augmentation for Cross-Spectral Re-Identification
·1804 words·9 mins·
Computer Vision
Face Recognition
🏢 Tencent AI Lab
RLE: A novel data augmentation strategy unifying cross-spectral re-ID, significantly boosting model performance by mimicking local linear transformations.
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
·2214 words·11 mins·
Question Answering
🏢 Tencent AI Lab
RG-SAN achieves state-of-the-art 3D referring expression segmentation by leveraging spatial awareness and rule-guided weak supervision, significantly improving accuracy and handling of ambiguous descr…
Opponent Modeling with In-context Search
·2301 words·11 mins·
Machine Learning
Reinforcement Learning
🏢 Tencent AI Lab
Opponent Modeling with In-context Search (OMIS) leverages in-context learning and decision-time search for stable and effective opponent adaptation in multi-agent environments.
On the Worst Prompt Performance of Large Language Models
·2797 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
LLMs’ performance varies drastically with prompt phrasing; this paper introduces RobustAlpacaEval to evaluate lower-bound performance via worst-case prompt analysis, revealing model inconsist…
M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation
·4046 words·19 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Tencent AI Lab
M³GPT, a novel multimodal framework, achieves superior motion comprehension and generation by integrating text, music, and motion data into a unified LLM representation.
Improving Gloss-free Sign Language Translation by Reducing Representation Density
·3386 words·16 mins·
AI Generated
Natural Language Processing
Machine Translation
🏢 Tencent AI Lab
SignCL, a novel contrastive learning strategy, significantly boosts gloss-free sign language translation by mitigating representation density, achieving substantial performance gains.
IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
·2083 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
IDGen synthesizes LLM evaluation prompts using Item Discrimination theory, creating a more challenging and discriminative dataset than previous methods.
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
·2311 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
TAGI, a novel method, generates task-specific adapters from instructions, enhancing LLM cross-task generalization by using knowledge distillation and a two-stage hypernetwork training process.
Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance
·3190 words·15 mins·
Machine Learning
Reinforcement Learning
🏢 Tencent AI Lab
Boost multi-task reinforcement learning with Cross-Task Policy Guidance (CTPG)! CTPG cleverly uses policies from already mastered tasks to guide the learning of new tasks, significantly improving effi…
DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
·2705 words·13 mins·
AI Generated
Computer Vision
Face Recognition
🏢 Tencent AI Lab
DiffusionFake enhances deepfake detection by cleverly reversing the image generation process, enabling detectors to learn more robust features and significantly improve cross-domain generalization.
Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models
·2902 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
Diffusion-of-Thought (DoT) boosts reasoning in diffusion language models by enabling parallel reasoning steps, outperforming larger autoregressive models in speed and accuracy.