Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 8 days ago • 40
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 8 days ago • 61
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 12 days ago • 79
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models Paper • 2501.03124 • Published 9 days ago • 13
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 19 days ago • 78
M-STAR Collection • Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated 21 days ago • 2
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 23 days ago • 42