Paper Reviews by AI
2025
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
·1572 words·8 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance Intelligent Creation
InfU: A new framework for flexible photo re-creation while preserving identity using Diffusion Transformers (DiTs).
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
·2606 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 National University of Singapore
Coarse-to-Fine Token Prediction improves autoregressive image generation by assigning the same coarse label to similar tokens, balancing generation quality and computational efficiency.
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
·3624 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Copenhagen
GFS-VL: Enhancing few-shot 3D segmentation by synergizing vision-language models with few-shot learning for robust real-world application.
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning
·2985 words·15 mins·
AI Generated
🤗 Daily Papers
AI Applications
Finance
🏢 Shanghai University of Finance and Economics
Fin-R1: Financial reasoning via RL.
Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts
·4277 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance Seed
Expert Race: A flexible routing strategy for scaling diffusion transformers with mixture of experts.
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content
·3312 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Text Classification
🏢 IIIT Dharwad
New dataset bridges fabricated claims with humor for spotting online deception!
CLS-RL: Image Classification with Rule-Based Reinforcement Learning
·2967 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Shanghai AI Laboratory
CLS-RL: Rule-based RL tackles catastrophic forgetting in MLLM image classification, outperforming SFT with better generalization and efficiency.
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners
·3734 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
CaKE: Editing LLMs to Enhance Knowledge Generalization Across Reasoning Tasks.
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
·3405 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Hong Kong
TokenBridge bridges continuous and discrete tokens for autoregressive visual generation, achieving high-quality synthesis with simple autoregressive modeling.
AIMI: Leveraging Future Knowledge and Personalization in Sparse Event Forecasting for Treatment Adherence
·2151 words·11 mins·
AI Generated
🤗 Daily Papers
AI Applications
Healthcare
🏢 Arizona State University
AIMI: A system leveraging future knowledge & personalized data for accurate treatment adherence forecasting, paving the way for timely mobile interventions.
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
·3897 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 National University of Singapore
4DGS-1K: Achieves 1000+ FPS for dynamic scene rendering via a compact, memory-efficient framework, offering a 41x storage reduction and 9x faster speed.
TULIP: Towards Unified Language-Image Pretraining
·3271 words·16 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 UC Berkeley
TULIP enhances image-text pretraining by unifying generative data augmentation with contrastive learning, achieving state-of-the-art performance in visual understanding.
Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling
·2283 words·11 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Deep Learning
🏢 University of Science and Technology of China
UAE-3D: A unified latent space approach for efficient & high-quality 3D molecular generation, outperforming existing methods in accuracy and speed.
Temporal Regularization Makes Your Video Generator Stronger
·3350 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Hong Kong University of Science and Technology
FluxFlow: Make your video generator stronger via temporal regularization!
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
·2233 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Shanghai Artificial Intelligence Laboratory
FakeVLM: A multimodal model & artifact-annotated dataset for detecting synthetic images with interpretable explanations, setting a new benchmark.
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
·3386 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Action Recognition
🏢 Zhejiang University
MotionStreamer: Streaming motion generation with a diffusion-based autoregressive model in causal latent space.
LEGION: Learning to Ground and Explain for Synthetic Image Detection
·3727 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Shanghai Jiao Tong University
LEGION: Grounding and explaining synthetic image detection and refinement via multimodal learning.
ELTEX: A Framework for Domain-Driven Synthetic Data Generation
·2296 words·11 mins·
AI Generated
🤗 Daily Papers
AI Applications
Security
🏢 Distributed Networks Institute (DNI)
ELTEX: Domain-driven synthetic data generation framework improves LLM performance in cybersecurity with fewer resources.
Efficient Personalization of Quantized Diffusion Model without Backpropagation
·6238 words·30 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Seoul National University
Personalize diffusion models efficiently on devices without backpropagation.
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
·2721 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
DeepMesh: RL-guided auto-regressive creation of artist-quality 3D meshes, enhanced by tokenization & DPO for human-aligned aesthetics.