2025-04-01s
2025
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection
·2047 words·10 mins·
AI Generated
🤗 Daily Papers
AI Applications
Security
🏢 China Mobile Internet Company Ltd.
TeleAntiFraud-28k: A new audio-text dataset for telecom fraud detection that tackles data scarcity with innovative synthesis techniques and slow-thinking annotations.
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy
·3587 words·17 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Embodied AI
🏢 Zhejiang University
RIG: Synergizes reasoning and imagination in an end-to-end generalist policy for embodied agents, improving sample efficiency and generalization.
Query and Conquer: Execution-Guided SQL Generation
·2511 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Text Generation
🏢 Snowflake AI Research
Execution-guided SQL generation enhances accuracy in text-to-SQL tasks by using execution results to select the most semantically consistent query, improving performance and reducing inference costs.
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
·3072 words·15 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Reinforcement Learning
🏢 StepFun
Open-Reasoner-Zero pioneers scalable, accessible RL training for reasoning in LLMs, achieving superior performance with a minimalist approach.
KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
·1493 words·8 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 MAUM AI Inc.
KOFFVQA: Objectively evaluates Korean VLMs with a new free-form VQA benchmark, improving evaluation reliability via detailed grading criteria.
Expanding RL with Verifiable Rewards Across Diverse Domains
·3127 words·15 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Reinforcement Learning
🏢 Tencent AI Lab
RL with Verifiable Rewards is now expanding to diverse domains like medicine!
Entropy-Based Adaptive Weighting for Self-Training
·1608 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of California Los Angeles
EAST: Prioritizing uncertainty in self-training refines reasoning of Large Language Models.
Effectively Controlling Reasoning Models through Thinking Intervention
·3981 words·19 mins·
AI Generated
🤗 Daily Papers
AI Theory
Safety
🏢 Princeton University
Thinking Intervention offers a novel paradigm for controlling reasoning in LLMs, enabling fine-grained guidance and improvements in instruction-following and safety.
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
·3146 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Westlake University
Easi3R: Training-free 4D reconstruction via attention disentanglement, enabling dynamic scene understanding from static 3D models.
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
·1952 words·10 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanjing University
TextCrafter: Precisely renders multiple texts in complex scenes, overcoming distortion and omission issues in existing visual text generation models.
MoCha: Towards Movie-Grade Talking Character Synthesis
·2382 words·12 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Generation
🏢 University of Waterloo
MoCha: Movie-Grade Talking Character Synthesis!
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
·1669 words·8 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
MeshCraft: Efficient, controllable mesh generation using flow-based DiTs, outperforming auto-regressive methods in speed and user control.
Efficient Inference for Large Reasoning Models: A Survey
·857 words·5 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 National University of Singapore
Survey on efficient inference methods for Large Reasoning Models, focusing on mitigating token inefficiency while preserving quality.
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
·2618 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Hong Kong Polytechnic University
By adapting Stable Diffusion for faster text-to-mesh generation, PRD efficiently creates high-quality 3D models without needing extensive 3D training data.
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
·3042 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Shanghai AI Laboratory
TokenHSI: Unified Transformer for Physical Human-Scene Interactions through Task Tokenization.
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
·1748 words·9 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 University of Oxford
LLMs generate Python heuristics for classical planning, outperforming traditional methods and challenging state-of-the-art planning techniques.
Decoupling Angles and Strength in Low-rank Adaptation
·3846 words·19 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Deep Learning
🏢 University of Tübingen
DeLoRA: Decoupling angles and strength in low-rank adaptation for robust & efficient finetuning of large models!
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
·3142 words·15 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Peking University
UPME: Peer review for MLLMs, minus human bias!