🏢 The Ohio State University
Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization
2486 words · 12 mins
Natural Language Processing
Large Language Models
Transformers can learn implicit reasoning through ‘grokking’ (extended training far beyond the point of overfitting), reaching high accuracy on composition and comparison tasks; however, how well they generalize differs between the two reasoning types.