Paper Reviews by AI

2024

PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World
·3633 words·18 mins
AI Generated 🤗 Daily Papers Multimodal Learning Human-AI Interaction 🏢 Shanghai Jiao Tong University
PC Agent: While you sleep, AI works! This AI system uses human cognition transfer to perform complex digital tasks, exceeding the capabilities of existing digital agents by efficiently learning from h…
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
·2565 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Snowflake AI Research
LLM evaluation on multiple-choice questions is flawed; considering all options simultaneously, not individually, reveals much higher accuracy and challenges existing benchmark rankings.
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding
·2127 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Dialogue Systems 🏢 Peking University
Friends-MMC: A new dataset facilitates multi-modal multi-party conversation understanding by providing 24,000+ utterances with video, audio, and speaker annotations, enabling advancements in character…
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
·2203 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
FoPE enhances attention’s periodic extension for better length generalization in language models by addressing spectral damage in RoPE using Fourier series and zeroing out destructive frequencies.
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
·402 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 Tencent AI Lab
DRT-o1 leverages long chain-of-thought reasoning to significantly boost machine translation quality, particularly for complex sentences with metaphors and similes, achieving substantial improvements o…
Diving into Self-Evolving Training for Multimodal Reasoning
·3292 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 Hong Kong University of Science and Technology
M-STAR: a novel self-evolving training framework that significantly boosts multimodal reasoning in large models without human annotation, achieving state-of-the-art results.
Deliberation in Latent Space via Differentiable Cache Augmentation
·3569 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind
Frozen LLMs get a performance boost by augmenting their key-value cache with latent embeddings generated by a differentiable offline coprocessor.
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
·2172 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
B-STaR dynamically balances exploration and exploitation in self-taught reasoners, achieving superior performance in mathematical, coding, and commonsense reasoning tasks.
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
·4375 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
This study reveals that gist token-based context compression in LLMs, while effective for some tasks, suffers from key failure patterns. The authors propose fine-grained autoencoding and segment-wise…
Revisiting In-Context Learning with Long Context Language Models
·4377 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind
Long-context models surprisingly show that simple random sampling of examples is as effective as sophisticated methods for in-context learning, shifting the focus to efficient context utilization.
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
·2034 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing Jiaotong University
OpenRFT adapts generalist reasoning models for domain-specific tasks using reinforcement fine-tuning, overcoming data scarcity and lack of reasoning step data via question augmentation, synthesized re…
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
·3841 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tsinghua University
Distilled Decoding (DD) drastically speeds up image generation from autoregressive models by using flow matching to enable one-step sampling, achieving significant speedups while maintaining acceptabl…
NILE: Internal Consistency Alignment in Large Language Models
·3034 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
The NILE framework significantly boosts LLM performance by aligning instruction-tuning datasets with pre-trained internal knowledge, achieving up to 68.5% gains.
LearnLM: Improving Gemini for Learning
·4335 words·21 mins
AI Generated 🤗 Daily Papers AI Applications Education 🏢 Google DeepMind
LearnLM enhances Gemini for education by training it to follow pedagogical instructions, leading to significant preference improvements over GPT-4o, Claude 3.5, and Gemini 1.5 Pro in diverse learning …
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
·4398 words·21 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore
CLEAR: Conv-Like Linearization boosts pre-trained Diffusion Transformers, achieving 6.3x faster 8K image generation with minimal quality loss.
UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
·3351 words·16 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ETH Zurich
UIP2P: Unsupervised instruction-based image editing achieves high-fidelity edits by enforcing Cycle Edit Consistency, eliminating the need for ground-truth data.
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
·1534 words·8 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Generation 🏢 University of Illinois Urbana-Champaign
MMAudio achieves state-of-the-art video-to-audio synthesis by jointly training on audio-visual and text-audio data, enabling high-quality, semantically and temporally aligned audio generation.
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
·2508 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
RobustFT tackles noisy data in LLM fine-tuning by using multi-expert noise detection and context-enhanced relabeling, significantly boosting model performance in noisy scenarios.
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
·5664 words·27 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
ReMoE: Revolutionizing Mixture-of-Experts with fully differentiable ReLU routing, achieving superior scalability and performance.
Progressive Multimodal Reasoning via Active Retrieval
·3576 words·17 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 Gaoling School of Artificial Intelligence, Renmin University of China
AR-MCTS: a novel framework boosting multimodal large language model reasoning by actively retrieving key supporting evidence and using Monte Carlo Tree Search for improved path selection and verificat…