
🏢 AIRI

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
·3290 words·16 mins
AI Generated 🤗 Daily Papers AI Theory Interpretability 🏢 AIRI
Sparse autoencoders decode LLM reasoning into interpretable features that, when steered, enhance performance. The first mechanistic account of reasoning in LLMs!
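The steering this summary describes boils down to adding an SAE feature's decoder direction back into the model's hidden states. A minimal PyTorch sketch, where the hook point, feature index, and scale are illustrative assumptions rather than the paper's actual choices:

```python
import torch

# Placeholder dimensions and weights; a real setup would load a trained SAE.
d_model, d_sae = 4096, 65536
W_dec = torch.randn(d_sae, d_model)     # SAE decoder matrix (one row per feature)
FEATURE_IDX, ALPHA = 1234, 8.0          # assumed "reasoning" feature and strength

def steering_hook(module, inputs, output):
    """Add ALPHA times the chosen feature's decoder direction to the residual stream."""
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + ALPHA * W_dec[FEATURE_IDX]
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

# Assumed HF-style usage: hook a mid-depth block, generate, then clean up.
# handle = model.model.layers[16].register_forward_hook(steering_hook)
# ...generate...
# handle.remove()
```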
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
·2005 words·10 mins
AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering 🏢 AIRI
Efficient image representation via adaptive token reduction.
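As a rough illustration of the idea in this teaser, token reduction for vision models can be sketched as scoring patch tokens by importance and keeping only the top fraction; the random scores below are stand-ins for the paper's learned, adaptive selector:

```python
import torch

def reduce_tokens(tokens: torch.Tensor, scores: torch.Tensor, keep_frac: float = 0.5):
    """Keep the highest-scoring tokens, preserving their original order.
    tokens: (seq, dim); scores: (seq,) importance per token."""
    k = max(1, int(keep_frac * tokens.shape[0]))
    idx = scores.topk(k).indices.sort().values
    return tokens[idx]

tokens = torch.randn(576, 1024)          # e.g. 576 ViT patch tokens
scores = torch.rand(576)                 # stand-in importance scores
reduced = reduce_tokens(tokens, scores, keep_frac=0.25)
print(reduced.shape)                     # torch.Size([144, 1024])
```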
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
·1710 words·9 mins
AI Generated 🤗 Daily Papers AI Theory Interpretability 🏢 AIRI
LLMs use punctuation tokens as context memory; these seemingly trivial tokens surprisingly boost performance.
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
·2645 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
Packing new knowledge into LoRA adapters can harm LLMs! A delicate balance is needed to prevent performance decline.
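For context, "packing knowledge into a LoRA adapter" means training only a low-rank update W + (alpha/r)·BA on top of frozen weights, so any new facts must fit into A and B. A generic LoRA sketch (the rank and alpha here are illustrative, not the paper's settings):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)      # new knowledge lives only in A and B
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(4096, 4096))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 131072
```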
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
·2814 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
LLMs can losslessly compress 1568 tokens into a single vector, surpassing prior methods by two orders of magnitude.
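One way to probe this kind of embedding-space capacity is to optimize a single trainable input vector so that a frozen LM reconstructs the target tokens from it via teacher forcing. A schematic sketch assuming an HF-style model that accepts `inputs_embeds`; the loop and hyperparameters are assumptions, not the paper's procedure:

```python
import torch
import torch.nn.functional as F

def fit_memory_vector(model, embed, target_ids, steps=500, lr=1e-2):
    """Optimize one input vector so the frozen LM predicts `target_ids` from it.
    target_ids: (1, seq) token ids; embed: the model's input embedding layer."""
    mem = torch.zeros(1, 1, embed.embedding_dim, requires_grad=True)
    opt = torch.optim.Adam([mem], lr=lr)
    tok_emb = embed(target_ids).detach()              # (1, seq, dim)
    for _ in range(steps):
        inputs = torch.cat([mem, tok_emb[:, :-1]], dim=1)
        logits = model(inputs_embeds=inputs).logits   # (1, seq, vocab)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                               target_ids.view(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    return mem.detach()                               # the single "compressed" vector
```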
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
·5896 words·28 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
SynthDetoxM generates high-quality multilingual parallel data for text detoxification using LLMs, outperforming existing datasets and models in few-shot settings.
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
·3632 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
SRMT: Shared Recurrent Memory Transformer boosts multi-agent coordination by implicitly sharing information via a global memory, significantly outperforming baselines in complex pathfinding tasks.
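A schematic of the shared-memory idea (not SRMT itself): each agent holds one recurrent memory vector, and every agent reads from the pool of all agents' memories via attention before writing its own update:

```python
import torch
import torch.nn as nn

class SharedMemoryLayer(nn.Module):
    """Agents cross-attend over a shared pool of per-agent memory vectors."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.read = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.write = nn.GRUCell(dim, dim)

    def forward(self, obs_emb: torch.Tensor, memories: torch.Tensor):
        # obs_emb, memories: (num_agents, dim)
        pool = memories.unsqueeze(0)                     # (1, agents, dim) shared memory
        q = obs_emb.unsqueeze(0)
        readout, _ = self.read(q, pool, pool)            # each agent reads all memories
        return self.write(readout.squeeze(0), memories)  # recurrent per-agent write

layer = SharedMemoryLayer()
mem = torch.zeros(8, 128)                                # 8 agents
mem = layer(torch.randn(8, 128), mem)
print(mem.shape)                                         # torch.Size([8, 128])
```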
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding
·3344 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 AIRI
3DGraphLLM boosts 3D scene understanding by cleverly merging semantic graphs and LLMs, enabling more accurate scene descriptions and outperforming existing methods.