Zhiheng Xi

WooooDyy

AI & ML interests

None yet

Recent Activity

authored a paper 8 days ago

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

updated a dataset about 2 months ago

MathCritique/MathCritique-76k

authored a paper 3 months ago

TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

Organizations

Papers 17

arxiv:2501.02506

arxiv:2410.18798

arxiv:2408.14874

arxiv:2406.04151

Models

None public yet

Datasets

None public yet