
🏢 National Key Laboratory for Novel Software Technology, Nanjing University

Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation

26 September 2024·328 words·2 mins

Machine Learning · Reinforcement Learning · 🏢 National Key Laboratory for Novel Software Technology, Nanjing University

This paper presents novel RL algorithms using multinomial logit function approximation, achieving O(1) computation and storage while nearly closing the regret gap with linear methods.

Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction

26 September 2024·1787 words·9 mins

Machine Learning · Optimization · 🏢 National Key Laboratory for Novel Software Technology, Nanjing University

Sign-based optimization gets a speed boost! This paper introduces new algorithms that significantly accelerate convergence in distributed optimization by cleverly using variance reduction and enhanced…

DiffuLT: Diffusion for Long-tail Recognition Without External Knowledge

26 September 2024·2601 words·13 mins

Computer Vision · Image Classification · 🏢 National Key Laboratory for Novel Software Technology, Nanjing University

DiffuLT uses a novel diffusion model to generate balanced training data from imbalanced datasets, achieving state-of-the-art results in long-tailed image recognition without external knowledge.

Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees

26 September 2024·1987 words·10 mins

Natural Language Processing · Large Language Models · 🏢 National Key Laboratory for Novel Software Technology, Nanjing University

TP-LLaMA boosts tool-augmented LLMs by optimizing inference trajectories using preference learning from both successful and failed attempts, achieving superior performance and efficiency.