Microsoft's rStar-Math paper claims that ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from Hugging Face!
The paper introduces rStar-Math, which claims to rival OpenAI o1's math reasoning capabilities by integrating Monte Carlo Tree Search (MCTS) with step-by-step verified reasoning trajectories.
A Process Preference Model (PPM) enables fine-grained evaluation of intermediate steps, improving training data quality.
The system underwent four rounds of self-evolution, progressively refining both the policy and reward models to tackle Olympiad-level math problems, without GPT-4-based data distillation.
While we wait for the release of code and datasets, you can already download the prompts they used from the HF Hub!
Details and links here:
Prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
Templates on the hub: MoritzLaurer/rstar-math-prompts
Prompt-templates collection: MoritzLaurer/prompt-templates-6776aa0b0b8a923957920bb4
Paper: https://arxiv.org/pdf/2501.04519
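
For intuition on the two ingredients mentioned above, here is a rough, generic sketch in Python. It is not the paper's code (which had not been released at the time of this post). It assumes a standard UCT rule for choosing the next reasoning step during tree search and a Bradley-Terry style pairwise loss for training a preference model on step scores. All names (`uct_score`, `select_step`, `pairwise_preference_loss`) and the exploration constant `c=1.4` are illustrative assumptions.

```python
import math


def uct_score(q_sum, visits, parent_visits, c=1.4):
    """Generic UCT score: mean step value plus an exploration bonus.

    Assumption: q_sum is the accumulated value of a candidate step and
    visits is how often it was explored; c is a hypothetical constant.
    """
    if visits == 0:
        return math.inf  # unvisited steps are explored first
    exploit = q_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_step(children, c=1.4):
    """Return the candidate step (a dict) with the highest UCT score.

    children: list of dicts with keys 'step', 'q_sum', 'visits'.
    """
    parent_visits = max(1, sum(ch["visits"] for ch in children))
    return max(
        children,
        key=lambda ch: uct_score(ch["q_sum"], ch["visits"], parent_visits, c),
    )


def pairwise_preference_loss(score_preferred, score_rejected):
    """Bradley-Terry style loss: -log(sigmoid(s_preferred - s_rejected)).

    The loss is small when the preferred step is scored well above the
    rejected one, and large when the ordering is reversed.
    """
    margin = score_preferred - score_rejected
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    candidates = [
        {"step": "expand the square", "q_sum": 3.0, "visits": 4},
        {"step": "guess and check", "q_sum": 1.0, "visits": 4},
    ]
    print(select_step(candidates)["step"])
    print(pairwise_preference_loss(2.0, 0.0))
```

The point of the pairwise loss is that the preference model only has to rank a better step above a worse one, rather than predict an exact numeric score for each step.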