1 | Literature: |
GRPO: The Key Engine Driving DeepSeek's Exceptional Performance
1 | Paper: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models |
Inspecting and Editing Knowledge Representations in Language Models
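The paper proposes REMEDI (Representation Mediation), a method that learns to map natural-language statements to fact encodings in a language model's internal representation system.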