Machine Learning

Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation
·3963 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Recommender Systems 🏢 Gaoling School of Artificial Intelligence, Renmin University of China
ReaRec: Unleashing latent reasoning power for sequential recommendation through inference-time multi-step reasoning.
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
·3814 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 ByteDance Seed
This paper enhances Reinforcement Learning from Human Feedback (RLHF) by tackling reward hacking and response diversity issues through improved data construction methods.
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation
·3935 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 National University of Singapore
LogQuant: log-distributed 2-bit quantization of the KV cache with superior accuracy preservation.
Verbal Process Supervision Elicits Better Coding Agents
·1306 words·7 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Mindify AI, United States
CURA: Verbal process supervision improves coding agents.
Gumbel-Softmax Flow Matching with Straight-Through Guidance for Controllable Biological Sequence Generation
·3836 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Department of Biomedical Engineering, Duke University
Gumbel-Softmax Flow Matching enables controllable biological sequence generation with straight-through guidance, scaling efficiently to high-dimensional simplices.
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
·1719 words·9 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 VNU University of Science, Vietnam
RL fine-tuning enhances reasoning in small LLMs, achieving competitive performance with limited resources, despite optimization & length challenges.
Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling
·2283 words·11 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Science and Technology of China
UAE-3D: A unified latent space approach for efficient & high-quality 3D molecular generation, outperforming existing methods in accuracy and speed.
Frac-Connections: Fractional Extension of Hyper-Connections
·1945 words·10 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 ByteDance Seed
Frac-Connections: An efficient alternative to Hyper-Connections that divides hidden states into fractions.
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
·3349 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Tsinghua University
DAPO: Open-sources an LLM reinforcement learning system that achieves SOTA AIME scores, fostering reproducible research at scale.
Transformers without Normalization
·4050 words·20 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 FAIR, Meta
Transformers can achieve state-of-the-art performance without normalization layers via Dynamic Tanh (DyT), offering a simpler and more efficient alternative.
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?
·3607 words·17 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Central Florida
KArAt: Can Learnable Attention Beat Standard Attention in Vision Transformers?
Charting and Navigating Hugging Face's Model Atlas
·3697 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 School of Computer Science and Engineering
Navigating millions of models is hard. This paper charts Hugging Face, revealing model relationships and attribute predictions.
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
·4375 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Carnegie Mellon University
Meta reinforcement fine-tuning optimizes test-time compute so that LLMs reason more efficiently.
BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
·1373 words·7 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Not Available
Rimer: RWKV-7 empowers superior time series modeling, offering a simple yet effective alternative to Transformers with fewer parameters.
LoRACode: LoRA Adapters for Code Embeddings
·1678 words·8 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Max Planck Institute for Software Systems
LoRACode enhances code embeddings using LoRA, achieving SOTA in code retrieval with minimal computational cost.
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
·3624 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Delft University of Technology
This paper reviews AI4SE benchmarks, introduces BenchScout for benchmark discovery, and proposes BenchFrame for benchmark enhancement, demonstrated via HumanEvalNext.
Learning from Failures in Multi-Attempt Reinforcement Learning
·1948 words·10 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Cambridge
Multi-attempt RL refines LLMs, significantly boosting accuracy on math tasks by enabling them to learn from failures through user feedback.
Language Models can Self-Improve at State-Value Estimation for Better Search
·2765 words·13 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Georgia Institute of Technology
Self-Taught Lookahead improves LLM search via self-supervision, matching costly methods at a fraction of the compute!
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
·2938 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Microsoft GenAI
KodCode: A new synthetic coding dataset with verified solutions and tests, enabling state-of-the-art performance for coding LLMs.
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
·3403 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Illinois Urbana-Champaign
MultiAgentBench: A benchmark for evaluating collaboration and competition in LLM agents across diverse, interactive scenarios with novel metrics and protocols.