🏢 Tencent AI Lab

Weight Diffusion for Future: Learn to Generalize in Non-Stationary Environments

26 September 2024·2419 words·12 mins

Machine Learning Deep Learning 🏢 Tencent AI Lab

Weight Diffusion (W-Diff) masters evolving domain generalization by using conditional diffusion models to learn classifier weight evolution patterns, enabling superior generalization to unseen future …

Visual Perception by Large Language Model’s Weights

26 September 2024·2070 words·10 mins

Multimodal Learning Vision-Language Models 🏢 Tencent AI Lab

VLoRA boosts the efficiency of multimodal LLMs by merging visual features into model weights instead of extending input sequences.

VFIMamba: Video Frame Interpolation with State Space Models

26 September 2024·2179 words·11 mins

Computer Vision Video Understanding 🏢 Tencent AI Lab

VFIMamba uses state-space models for efficient and dynamic video frame interpolation, achieving state-of-the-art results by introducing a novel Mixed-SSM Block and curriculum learning.

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

26 September 2024·2046 words·10 mins

Natural Language Processing Large Language Models 🏢 Tencent AI Lab

ALPHALLM boosts LLM performance in complex reasoning tasks by using imagination, search, and criticism to create a self-improving loop, eliminating the need for extra training data.

The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection

26 September 2024·2465 words·12 mins

Machine Learning Deep Learning 🏢 Tencent AI Lab

Researchers found that superior OOD detection performance comes at the cost of reduced generalization. Their novel Decoupled Uncertainty Learning (DUL) algorithm harmonizes OOD detection and generali…

StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving

26 September 2024·4275 words·21 mins

AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab

StrategyLLM uses four LLM agents to generate consistent, generalizable few-shot prompts, significantly improving LLM problem-solving performance across various tasks.

Self-playing Adversarial Language Game Enhances LLM Reasoning

26 September 2024·2197 words·11 mins

Natural Language Processing Large Language Models 🏢 Tencent AI Lab

Self-play in an adversarial language game boosts LLM reasoning.

SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models

26 September 2024·2317 words·11 mins

Machine Learning Continual Learning 🏢 Tencent AI Lab

SAFE, a novel parameter-efficient tuning framework, boosts pre-trained model performance in continual learning by balancing model stability and plasticity through slow and fast learning stages, signif…

RobIR: Robust Inverse Rendering for High-Illumination Scenes

26 September 2024·2339 words·11 mins

Computer Vision 3D Vision 🏢 Tencent AI Lab

RobIR: Robust inverse rendering in high-illumination scenes using ACES tone mapping and regularized visibility estimation for accurate BRDF reconstruction.

RLE: A Unified Perspective of Data Augmentation for Cross-Spectral Re-Identification

26 September 2024·1804 words·9 mins

Computer Vision Face Recognition 🏢 Tencent AI Lab

RLE: A novel data augmentation strategy unifying cross-spectral re-ID, significantly boosting model performance by mimicking local linear transformations.

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation

26 September 2024·2214 words·11 mins

Question Answering 🏢 Tencent AI Lab

RG-SAN achieves state-of-the-art 3D referring expression segmentation by leveraging spatial awareness and rule-guided weak supervision, significantly improving accuracy and handling of ambiguous descr…

Opponent Modeling with In-context Search

26 September 2024·2301 words·11 mins

Machine Learning Reinforcement Learning 🏢 Tencent AI Lab

Opponent Modeling with In-context Search (OMIS) leverages in-context learning and decision-time search for stable and effective opponent adaptation in multi-agent environments.

On the Worst Prompt Performance of Large Language Models

26 September 2024·2797 words·14 mins

AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab

LLM performance varies drastically with prompt phrasing; this paper introduces ROBUSTALPACAEVAL to evaluate lower-bound performance via worst-case prompt analysis, revealing model inconsist…

M³GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation

26 September 2024·4046 words·19 mins

AI Generated Multimodal Learning Vision-Language Models 🏢 Tencent AI Lab

M³GPT, a novel multimodal framework, achieves superior motion comprehension and generation by integrating text, music, and motion data into a unified LLM representation.

Improving Gloss-free Sign Language Translation by Reducing Representation Density

26 September 2024·3386 words·16 mins

AI Generated Natural Language Processing Machine Translation 🏢 Tencent AI Lab

SignCL, a novel contrastive learning strategy, significantly boosts gloss-free sign language translation by mitigating representation density, achieving substantial performance gains.

IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation

26 September 2024·2083 words·10 mins

Natural Language Processing Large Language Models 🏢 Tencent AI Lab

IDGen synthesizes LLM evaluation prompts using Item Discrimination theory, creating a more challenging and discriminative dataset than previous methods.

From Instance Training to Instruction Learning: Task Adapters Generation from Instructions

26 September 2024·2311 words·11 mins

Natural Language Processing Large Language Models 🏢 Tencent AI Lab

TAGI, a novel method, generates task-specific adapters from instructions, enhancing LLM cross-task generalization by using knowledge distillation and a two-stage hypernetwork training process.

Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance

26 September 2024·3190 words·15 mins

Machine Learning Reinforcement Learning 🏢 Tencent AI Lab

Boost multi-task reinforcement learning with Cross-Task Policy Guidance (CTPG)! CTPG cleverly uses policies from already mastered tasks to guide the learning of new tasks, significantly improving effi…

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion

26 September 2024·2705 words·13 mins

AI Generated Computer Vision Face Recognition 🏢 Tencent AI Lab

DiffusionFake enhances deepfake detection by cleverly reversing the image generation process, enabling detectors to learn more robust features and significantly improve cross-domain generalization.

Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models

26 September 2024·2902 words·14 mins

AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab

Diffusion-of-Thought (DoT) boosts reasoning in diffusion language models by enabling parallel reasoning steps, outperforming larger autoregressive models in speed and accuracy.