Could you clarify whether the PRM800K deduplication was performed using the original 5000-test set from MATH or the MATH500 dataset?
#2, opened by masterLan
Could you clarify whether the PRM800K deduplication was performed using the original 5000-test set from MATH or the MATH500 dataset?
The original 5000-test set from MATH.
So your MATH scores are on the full MATH test set?
Yes. We follow the setting of Qwen2.5-Math as described in the paper.