
🏢 AIRI

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
·3290 words·16 mins
AI Generated 🤗 Daily Papers AI Theory Interpretability 🏢 AIRI
Sparse autoencoders decode LLM reasoning into interpretable features that, when steered, enhance performance. The first mechanistic account of reasoning in LLMs!
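The steering this summary describes boils down to adding an SAE feature's decoder direction back into the model's hidden states. A minimal PyTorch sketch, where the hook point, feature index, and scale are illustrative assumptions rather than the paper's actual choices:

```python
import torch

# Placeholder dimensions and weights; a real setup would load a trained SAE.
d_model, d_sae = 4096, 65536
W_dec = torch.randn(d_sae, d_model)     # SAE decoder matrix (one row per feature)
FEATURE_IDX, ALPHA = 1234, 8.0          # assumed "reasoning" feature and strength

def steering_hook(module, inputs, output):
    """Add ALPHA times the chosen feature's decoder direction to the residual stream."""
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + ALPHA * W_dec[FEATURE_IDX]
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

# Assumed HF-style usage: hook a mid-depth block, generate, then clean up.
# handle = model.model.layers[16].register_forward_hook(steering_hook)
# ...generate...
# handle.remove()
```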
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
·2005 words·10 mins
AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering 🏢 AIRI
Efficient image representation via adaptive token reduction.
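As a rough illustration of the idea in this teaser, token reduction for vision models can be sketched as scoring patch tokens by importance and keeping only the top fraction; the random scores below are stand-ins for the paper's learned, adaptive selector:

```python
import torch

def reduce_tokens(tokens: torch.Tensor, scores: torch.Tensor, keep_frac: float = 0.5):
    """Keep the highest-scoring tokens, preserving their original order.
    tokens: (seq, dim); scores: (seq,) importance per token."""
    k = max(1, int(keep_frac * tokens.shape[0]))
    idx = scores.topk(k).indices.sort().values
    return tokens[idx]

tokens = torch.randn(576, 1024)          # e.g. 576 ViT patch tokens
scores = torch.rand(576)                 # stand-in importance scores
reduced = reduce_tokens(tokens, scores, keep_frac=0.25)
print(reduced.shape)                     # torch.Size([144, 1024])
```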
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
·1710 words·9 mins
AI Generated 🤗 Daily Papers AI Theory Interpretability 🏢 AIRI
LLMs use punctuation tokens as context memory; these seemingly trivial tokens surprisingly boost performance.
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
·2645 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
Packing new knowledge into LoRA adapters can harm LLMs! A delicate balance is needed to prevent performance decline.
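For context, "packing knowledge into a LoRA adapter" means training only a low-rank update W + (alpha/r)·BA on top of frozen weights, so any new facts must fit into A and B. A generic LoRA sketch (the rank and alpha here are illustrative, not the paper's settings):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)      # new knowledge lives only in A and B
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(4096, 4096))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 131072
```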
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
·2814 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
LLMs can losslessly compress 1568 tokens into a single vector, surpassing prior methods by two orders of magnitude.
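One way to probe this kind of embedding-space capacity is to optimize a single trainable input vector so that a frozen LM reconstructs the target tokens from it via teacher forcing. A schematic sketch assuming an HF-style model that accepts `inputs_embeds`; the loop and hyperparameters are assumptions, not the paper's procedure:

```python
import torch
import torch.nn.functional as F

def fit_memory_vector(model, embed, target_ids, steps=500, lr=1e-2):
    """Optimize one input vector so the frozen LM predicts `target_ids` from it.
    target_ids: (1, seq) token ids; embed: the model's input embedding layer."""
    mem = torch.zeros(1, 1, embed.embedding_dim, requires_grad=True)
    opt = torch.optim.Adam([mem], lr=lr)
    tok_emb = embed(target_ids).detach()              # (1, seq, dim)
    for _ in range(steps):
        inputs = torch.cat([mem, tok_emb[:, :-1]], dim=1)
        logits = model(inputs_embeds=inputs).logits   # (1, seq, vocab)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                               target_ids.view(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    return mem.detach()                               # the single "compressed" vector
```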
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
·5896 words·28 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
SynthDetoxM generates high-quality multilingual parallel data for text detoxification using LLMs, outperforming existing datasets and models in few-shot settings.
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
·3632 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
SRMT: Shared Recurrent Memory Transformer boosts multi-agent coordination by implicitly sharing information via a global memory, significantly outperforming baselines in complex pathfinding tasks.
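A schematic of the shared-memory idea (not SRMT itself): each agent holds one recurrent memory vector, and every agent reads from the pool of all agents' memories via attention before writing its own update:

```python
import torch
import torch.nn as nn

class SharedMemoryLayer(nn.Module):
    """Agents cross-attend over a shared pool of per-agent memory vectors."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.read = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.write = nn.GRUCell(dim, dim)

    def forward(self, obs_emb: torch.Tensor, memories: torch.Tensor):
        # obs_emb, memories: (num_agents, dim)
        pool = memories.unsqueeze(0)                     # (1, agents, dim) shared memory
        q = obs_emb.unsqueeze(0)
        readout, _ = self.read(q, pool, pool)            # each agent reads all memories
        return self.write(readout.squeeze(0), memories)  # recurrent per-agent write

layer = SharedMemoryLayer()
mem = torch.zeros(8, 128)                                # 8 agents
mem = layer(torch.randn(8, 128), mem)
print(mem.shape)                                         # torch.Size([8, 128])
```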
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding
·3344 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 AIRI
3DGraphLLM boosts 3D scene understanding by cleverly merging semantic graphs and LLMs, enabling more accurate scene descriptions and outperforming existing methods.