Hugging Face
Models: 422
Sort: Trending
Active filters: multimodal
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • Updated 4 days ago • 1.69M • 1.05k
erax-ai/EraX-VL-7B-V2.0-Preview • Visual Question Answering • Updated 3 days ago • 256 • 15
Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated 4 days ago • 1.85M • 369
Qwen/Qwen2-VL-72B-Instruct • Image-Text-to-Text • Updated 4 days ago • 139k • 258
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • Updated Oct 10, 2024 • 598k • 494
GoodiesHere/Apollo-LMMs-Apollo-7B-t32 • Video-Text-to-Text • Updated 28 days ago • 1.08k • 50
jinaai/jina-clip-v2 • Zero-Shot Image Classification • Updated 2 days ago • 27.6k • 160
lmms-lab/LLaVA-Video-7B-Qwen2 • Video-Text-to-Text • Updated Oct 25, 2024 • 70.7k • 60
nvidia/NVLM-D-72B-mcore • Image-Text-to-Text • Updated about 20 hours ago • 7
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • Updated about 8 hours ago • 174 • 4
Qwen/Qwen2-VL-7B • Image-Text-to-Text • Updated 3 days ago • 11.6k • 32
nvidia/NVLM-D-72B • Image-Text-to-Text • Updated about 20 hours ago • 11.2k • 763
NexaAIDev/OmniVLM-968M • Updated 29 days ago • 1.36k • 495
osunlp/UGround-V1-7B • Image-Text-to-Text • Updated 8 days ago • 683 • 4
OpenGVLab/VideoChat-Flash-Qwen2-7B_res448 • Video-Text-to-Text • Updated about 8 hours ago • 249 • 3
robotics-diffusion-transformer/rdt-1b • Robotics • Updated Oct 17, 2024 • 3.23k • 64
Qwen/Qwen2-VL-2B-Instruct-AWQ • Image-Text-to-Text • Updated Sep 21, 2024 • 5.73k • 20
rhymes-ai/Aria • Image-Text-to-Text • Updated 29 days ago • 27.8k • 604
erax-ai/EraX-VL-2B-V1.5 • Visual Question Answering • Updated 6 days ago • 1.06k • 5
unsloth/Pixtral-12B-2409 • Image-Text-to-Text • Updated Nov 21, 2024 • 907 • 2
AI-Safeguard/Ivy-VL-llava • Visual Question Answering • Updated 16 days ago • 2.8k • 57
bartowski/Qwen2-VL-7B-Instruct-GGUF • Image-Text-to-Text • Updated 29 days ago • 14.8k • 26
bartowski/Qwen2-VL-72B-Instruct-GGUF • Image-Text-to-Text • Updated 29 days ago • 3.97k • 8
osunlp/UGround-V1-2B • Image-Text-to-Text • Updated 8 days ago • 685 • 6
osunlp/UGround-V1-72B-Preview • Image-Text-to-Text • Updated 3 days ago • 705 • 2
osunlp/UGround-V1-72B • Image-Text-to-Text • Updated 3 days ago • 102 • 2
imageomics/bioclip • Zero-Shot Image Classification • Updated May 17, 2024 • 2.63k • 44
marcosv/InstructIR • Image-to-Image • Updated Jan 31, 2024 • 29
nielsr/imagebind-huge • Updated Apr 28, 2024 • 619 • 9
qnguyen3/nanoLLaVA • Text Generation • Updated Oct 27, 2024 • 16.4k • 153