MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published about 19 hours ago • 174
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 4 days ago • 19
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 5 days ago • 16
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding Paper • 2501.05452 • Published 6 days ago • 12
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models Paper • 2501.05767 • Published 5 days ago • 24
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published 5 days ago • 56
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 5 days ago • 51
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model Paper • 2501.05122 • Published 6 days ago • 18
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 6 days ago • 74
Multi-task retriever fine-tuning for domain-specific and efficient RAG Paper • 2501.04652 • Published 7 days ago • 9
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 6 days ago • 63
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 7 days ago • 47
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 7 days ago • 218
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 7 days ago • 78
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 7 days ago • 75
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 8 days ago • 48
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models Paper • 2501.00874 • Published 14 days ago • 12
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published 12 days ago • 31
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published 13 days ago • 20