Could you clarify whether the PRM800K deduplication was performed using the original 5000-test set from MATH or the MATH500 dataset?

#2
by masterLan - opened

Could you clarify whether the PRM800K deduplication was performed using the original 5000-test set from MATH or the MATH500 dataset?

The original 5000-test set from MATH.

So your MATH scores are on the full MATH test set?

Yes. We follow the setting of Qwen2.5-Math as described in the paper.
