- Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
  Paper • 2401.10529 • Published • 1
- ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
  Paper • 2311.12793 • Published • 18
- Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
  Paper • 2311.06783 • Published • 26
- SVIT: Scaling up Visual Instruction Tuning
  Paper • 2307.04087 • Published • 6
Sulabh
sulabh-research
AI & ML interests
None yet
Recent Activity
upvoted a collection 4 days ago
Synthetic Data Generation
upvoted a collection 8 months ago
🤖 Agents
upvoted a collection 8 months ago
🚀 Spinning Up in LLMs
Organizations
None yet
Collections: 10
Models: None public yet
Datasets: None public yet