SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound • Paper • 2405.00233 • Published Apr 30, 2024 • 15
The Lessons of Developing Process Reward Models in Mathematical Reasoning • Paper • 2501.07301 • Published 2 days ago • 62
VideoRAG: Retrieval-Augmented Generation over Video Corpus • Paper • 2501.05874 • Published 5 days ago • 56
Efficiently Serving LLM Reasoning Programs with Certaindex • Paper • 2412.20993 • Published 16 days ago • 33
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization • Paper • 2412.18525 • Published 22 days ago • 68
Perceiver: General Perception with Iterative Attention • Paper • 2103.03206 • Published Mar 4, 2021 • 1
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • Paper • 2501.04519 • Published 7 days ago • 218
Cosmos World Foundation Model Platform for Physical AI • Paper • 2501.03575 • Published 8 days ago • 61
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models • Paper • 2501.03262 • Published 12 days ago • 78
High-Fidelity Audio Compression with Improved RVQGAN • Paper • 2306.06546 • Published Jun 11, 2023 • 10
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning • Paper • 1907.04448 • Published Jul 9, 2019 • 1
SDPO: Segment-Level Direct Preference Optimization for Social Agents • Paper • 2501.01821 • Published 12 days ago • 18
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction • Paper • 2501.01957 • Published 12 days ago • 38
Fewer-token Neural Speech Codec with Time-invariant Codes • Paper • 2310.00014 • Published Sep 15, 2023 • 2
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning • Paper • 2412.15797 • Published 26 days ago • 17
PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models • Paper • 2412.18608 • Published 22 days ago • 14