🏢 School of Computer Science and Technology, University of Science and Technology of China
Top-$nσ$: Not All Logits Are You Need
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
Top-$nσ$: A novel LLM sampling method that outperforms existing approaches by applying a statistical threshold to pre-softmax logits, achieving higher accuracy while maintaining diversity, even at high temperat…
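To make the idea of "a statistical threshold on pre-softmax logits" concrete, here is a minimal sketch in plain Python. The summary above does not spell out the exact rule, so this assumes one plausible form: keep only tokens whose logit lies within `n` standard deviations of the maximum logit, then apply softmax to the survivors. The function names, the use of the population standard deviation, and the default `n=1.0` are all illustrative assumptions, not the authors' reference implementation.

```python
import math
import random


def top_n_sigma_probs(logits, n=1.0, temperature=1.0):
    """Sketch: softmax over tokens whose logit is >= max - n * sigma.

    Assumption: sigma is the population standard deviation of the
    temperature-scaled logits. The source summary does not specify this.
    """
    scaled = [x / temperature for x in logits]
    mean = sum(scaled) / len(scaled)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in scaled) / len(scaled))
    threshold = max(scaled) - n * sigma

    # Mask everything below the threshold before the softmax.
    kept = [x if x >= threshold else float("-inf") for x in scaled]

    # Numerically stable softmax; math.exp(-inf) evaluates to 0.0.
    peak = max(kept)
    exps = [math.exp(x - peak) for x in kept]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_n_sigma(logits, n=1.0, temperature=1.0, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = top_n_sigma_probs(logits, n=n, temperature=temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_logits = [10.0, 9.5, 2.0, 1.0, 0.5, 0.0, -1.0]
    for t in (1.0, 3.0):
        probs = top_n_sigma_probs(example_logits, n=1.0, temperature=t)
        survivors = [i for i, p in enumerate(probs) if p > 0]
        print(f"T={t}: surviving token indices {survivors}")
```

Under this assumed rule, dividing the logits by the temperature rescales both the maximum and the standard deviation by the same factor, so the set of surviving tokens does not change with temperature. Only the relative probabilities among the survivors change, which fits the summary's claim that the method holds up at high temperatures.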