🏢 National University of Singapore

Efficient Inference for Large Reasoning Models: A Survey

29 March 2025·857 words·5 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore

Survey on efficient inference methods for Large Reasoning Models, focusing on mitigating token inefficiency while preserving quality.

LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation

25 March 2025·3935 words·19 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 National University of Singapore

LogQuant: Log-distributed 2-bit quantization of the KV cache with superior accuracy preservation.

MedAgent-Pro: Towards Multi-modal Evidence-based Medical Diagnosis via Reasoning Agentic Workflow

21 March 2025·1815 words·9 mins

AI Generated 🤗 Daily Papers AI Applications Healthcare 🏢 National University of Singapore

MedAgent-Pro: An evidence-based reasoning agentic system for reliable multi-modal medical diagnosis.

Ultra-Resolution Adaptation with Ease

20 March 2025·2457 words·12 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore

URA: Ultra-resolution adaptation made easy! Uses synthetic data & minor weight tuning for efficient, high-res text-to-image diffusion models.

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction

20 March 2025·2606 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore

Coarse-to-Fine Token Prediction improves autoregressive image generation by assigning the same coarse label to similar tokens, balancing generation quality and computational efficiency.

1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

20 March 2025·3897 words·19 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore

4DGS-1K: Achieves 1000+ FPS for dynamic scene rendering via a compact, memory-efficient framework, offering a 41x storage reduction and a 9x speedup.

Impossible Videos

18 March 2025·4228 words·20 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 National University of Singapore

Impossible videos expose AI limits!

TPDiff: Temporal Pyramid Video Diffusion Model

12 March 2025·2081 words·10 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 National University of Singapore

TPDiff accelerates video diffusion by progressively increasing frame rates during diffusion, optimizing computational efficiency with a novel stage-wise training strategy.

PE3R: Perception-Efficient 3D Reconstruction

10 March 2025·2061 words·10 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore

PE3R: Achieves fast and accurate 3D scene reconstruction from 2D images through enhanced perception and efficiency.

Words or Vision: Do Vision-Language Models Have Blind Faith in Text?

4 March 2025·5020 words·24 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 National University of Singapore

VLMs often disproportionately trust text over visual data, leading to performance drops and safety concerns.

Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

27 February 2025·3037 words·15 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore

EDGS: Achieves faster, high-quality dynamic scene rendering through sparse time-variant attribute modeling and intelligent static area filtering.

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

20 February 2025·1606 words·8 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore

PhotoDoodle: Mimicking artistic image editing with personalized decorative elements through learning from few-shot pairwise data.

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

20 February 2025·3063 words·15 mins

AI Generated 🤗 Daily Papers Multimodal Learning Human-AI Interaction 🏢 National University of Singapore

InterFeedback: LMMs need better human feedback to enhance AI assistants!

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

19 February 2025·2370 words·12 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore

LongPO: Self-evolve LLMs to excel in long contexts via short-to-long preference optimization, boosting performance without sacrificing short-context skills.

NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation

18 February 2025·6586 words·31 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 National University of Singapore

NExT-Mol: Combines 1D language models with 3D diffusion for molecule generation, achieving state-of-the-art performance and validity.

CoT-Valve: Length-Compressible Chain-of-Thought Tuning

13 February 2025·3429 words·17 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore

CoT-Valve dynamically adjusts reasoning chain lengths based on task difficulty, significantly reducing inference costs in large language models without substantial accuracy loss.

Enhance-A-Video: Better Generated Video for Free

11 February 2025·3320 words·16 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 National University of Singapore

Enhance-A-Video boosts video generation quality without retraining, by enhancing cross-frame correlations in diffusion transformers, resulting in improved coherence and visual fidelity.

GuardReasoner: Towards Reasoning-based LLM Safeguards

30 January 2025·5624 words·27 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore

GuardReasoner enhances LLM safety with reasoning-based guardrails, improving performance, explainability, and generalization on various benchmarks.

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

20 December 2024·4398 words·21 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore

CLEAR: Conv-Like Linearization boosts pre-trained Diffusion Transformers, achieving 6.3x faster 8K image generation with minimal quality loss.

TinyFusion: Diffusion Transformers Learned Shallow

2 December 2024·4225 words·20 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore

TinyFusion, a novel learnable depth pruning method, crafts efficient shallow diffusion transformers with superior post-fine-tuning performance, achieving a 2x speedup with less than 7% of the original…