
🏢 School of Computer Science and Technology, University of Science and Technology of China

Top-$nσ$: Not All Logits Are You Need
·2189 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models
Top-$nσ$: A novel LLM sampling method that outperforms existing approaches by applying a statistical threshold to pre-softmax logits, achieving higher accuracy while maintaining diversity, even at high temperat…
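
To make the idea of a statistical threshold on pre-softmax logits concrete, here is a minimal Python sketch. It rests on an assumption not spelled out in the summary above: that the threshold keeps every token whose logit lies within `n` standard deviations of the maximum logit. The function names `top_n_sigma_filter` and `sample_top_n_sigma` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def top_n_sigma_filter(logits, n=1.0):
    """Return indices of tokens whose logit is >= max(logits) - n * std(logits).

    Assumption: this max-minus-n-sigma rule is one plausible reading of a
    "statistical threshold on pre-softmax logits", not a confirmed spec.
    """
    threshold = max(logits) - n * statistics.pstdev(logits)
    return [i for i, x in enumerate(logits) if x >= threshold]


def sample_top_n_sigma(logits, n=1.0, temperature=1.0, rng=random):
    """Sample one token index from the filtered, temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    kept = top_n_sigma_filter(scaled, n)
    # Softmax over the kept tokens only; subtract the max for numerical stability.
    peak = max(scaled[i] for i in kept)
    weights = [math.exp(scaled[i] - peak) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


logits = [5.0, 4.5, 1.0, 0.5, 0.0, -1.0]
print(top_n_sigma_filter(logits, n=1.0))                    # [0, 1]
print(top_n_sigma_filter([x / 4 for x in logits], n=1.0))   # [0, 1]
```

Under this assumption, dividing the logits by a temperature rescales the maximum and the standard deviation by the same factor, so the set of kept tokens does not change. Temperature then only reshapes the probabilities among the surviving tokens, which is one way to read the claim about robustness at high temperatures.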