arxiv:2412.17256
WeihaoZeng
AndrewZeng
Recent Activity
authored a paper, about 5 hours ago
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
upvoted a paper, 5 days ago
Search-o1: Agentic Search-Enhanced Large Reasoning Models
upvoted a paper, 7 days ago
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
models: 3
datasets: 60
AndrewZeng/math-bstar-sample • 11.5k • 13
AndrewZeng/bstar-math-dev • 604 • 38
AndrewZeng/prm-reward-data • 240k • 31
AndrewZeng/math-trn-format • 11.5k • 39
AndrewZeng/math_scaling • 100 • 23
AndrewZeng/random_syn • 108k • 12
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_4 • 38.9k • 31
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_3 • 38.9k • 33
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_2 • 38.9k • 29
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_1 • 38.9k • 28