Updated precision to bfloat16 and use_chat_template to false for pankajmathur/orca_mini_v8_0_70b and pankajmathur/orca_mini_v8_1_70b
Hi @alozowski and Team,
First of all, great work on the new UI of the Open LLM LB. It looks stunning.
I submitted two of the new Orca_Mini_v8_* series models, fine-tuned on Llama-3.3-70B-Instruct, for evaluation via the UI, but initially used the wrong precision and chat_template flag.
I have now opened 2 MRs for these 2 models to fix these mistakes. Could you please have a look and let me know if you need additional details on this:
- https://huggingface.co/datasets/open-llm-leaderboard/requests/discussions/74/
- https://huggingface.co/datasets/open-llm-leaderboard/requests/discussions/75/
Regards,
Pankaj
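For illustration, the correction requested above amounts to changing two fields in each eval request. A minimal sketch, assuming the request is a JSON object with `precision` and `use_chat_template` keys; the exact file layout and the original wrong precision value are assumptions here, not taken from the requests dataset:

```python
import json


def correct_request(request: dict) -> dict:
    """Return a copy of the request with the two fields from this thread fixed."""
    fixed = dict(request)  # leave the caller's dict untouched
    fixed["precision"] = "bfloat16"
    fixed["use_chat_template"] = False
    return fixed


# Hypothetical request contents; only the model name and the two
# corrected target values come from the discussion itself.
example_request = {
    "model": "pankajmathur/orca_mini_v8_0_70b",
    "precision": "float16",  # placeholder for the wrong value
    "use_chat_template": True,
}

if __name__ == "__main__":
    print(json.dumps(correct_request(example_request), indent=2))
```

The function copies the input rather than mutating it, so the original request stays available for comparison.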
Hi @pankajmathur ,
Thanks for opening the issue! I corrected both of your requests manually, so it should be fine now.
I'm closing this discussion. Feel free to open a new one if you have any questions.
Thank you for the swift turnaround, appreciated.
Hi @alozowski ,
Happy Monday. I'm just reaching out to make sense of the following eval request commits for model "pankajmathur/orca_mini_v8_0_70b". The commit below shows a file rename and a change from the wrong "params": 35.277:
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/5660c4c4b9156fa0f15d99be7eee061d5de24764#d2h-741276
Did the model fail to evaluate, and do these changes reflect a resubmission for evaluation?
If so, can we submit "pankajmathur/orca_mini_v8_1_70b" again too? It shows as failed as well:
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/8b40ba212c48dc470be4f661b67cc085ed456477#d2h-702908
Is there any reason they are failing? For background, I successfully evaluated both of them on my own servers before submitting them to the HF Open LLM LB, following:
https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about#reproducibility
lm_eval --model hf --model_args pretrained=pankajmathur/orca_mini_v8_1_70b,dtype=bfloat16,parallelize=True --tasks leaderboard --output_path lm_eval_results/leaderboard --batch_size auto
and the results are updated on both model cards:
https://huggingface.co/pankajmathur/orca_mini_v8_0_70b
https://huggingface.co/pankajmathur/orca_mini_v8_1_70b
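For anyone reproducing this, the lm_eval command above differs between the two models only in the `pretrained=` value. A small sketch that builds the same command line for both models; the helper name `build_lm_eval_command` is hypothetical, and every flag is copied from the command quoted above rather than added:

```python
import shlex

MODELS = [
    "pankajmathur/orca_mini_v8_0_70b",
    "pankajmathur/orca_mini_v8_1_70b",
]


def build_lm_eval_command(model_id: str,
                          output_path: str = "lm_eval_results/leaderboard") -> list[str]:
    """Build the argv list for the reproducibility command used in this thread."""
    model_args = f"pretrained={model_id},dtype=bfloat16,parallelize=True"
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", model_args,
        "--tasks", "leaderboard",
        "--output_path", output_path,
        "--batch_size", "auto",
    ]


if __name__ == "__main__":
    # Print one shell-ready command per model; nothing is executed here.
    for model_id in MODELS:
        print(shlex.join(build_lm_eval_command(model_id)))
```

The list form can also be passed directly to `subprocess.run` on a machine where lm_eval is installed.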
Again, thanks for helping out on this, really appreciated.
Regards,
Pankaj
Hi @pankajmathur ,
Apologies for the delayed response, it was the New Year holidays, but I’m glad to get back to discussions!
> Does the model failed to evaluate and these changes reflect re submission for evaluation again?
Yes, unfortunately, the model failed due to an incorrect detection of the number of parameters. On the next attempt, it failed again due to a network error. I’ve resubmitted it and I hope it will work fine this time.
Closing this discussion again. Feel free to ping me if you have any questions about this model, or open a new discussion!
No worries at all, Happy New Year to you too :)
I saw "https://huggingface.co/pankajmathur/orca_mini_v8_1_70b" in the queue, so thank you! I think we need to reopen the thread, as "https://huggingface.co/pankajmathur/orca_mini_v8_0_70b" also failed. Could we please resubmit that one to the queue too? Please close this thread once it is submitted.
Thanks again for all the cool work on the LB filters, it is becoming my go-to tool for various LLM analyses.
Regards,
Pankaj