Vision - a diwank Collection


Vision

updated 1 day ago

apple/DepthPro

Depth Estimation • Updated Oct 9, 2024 • 1.95k • 386
rhymes-ai/Aria

Image-Text-to-Text • Updated 29 days ago • 27.8k • 604
mit-han-lab/hart-0.7b-1024px

Unconditional Image Generation • Updated Nov 17, 2024 • 9
deepseek-ai/Janus-1.3B

Any-to-Any • Updated Nov 14, 2024 • 10.8k • 503
neulab/PangeaInstruct

Updated Oct 25, 2024 • 391 • 78
genmo/mochi-1-preview

Text-to-Video • Updated 28 days ago • 42k • 1.14k
stabilityai/stable-diffusion-3.5-large

Text-to-Image • Updated Oct 22, 2024 • 123k • 1.89k
Freepik/flux.1-lite-8B-alpha

Text-to-Image • Updated 16 days ago • 4.14k • 403
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 1.11k • 1.52k
mistralai/Pixtral-12B-Base-2409

Updated Oct 30, 2024 • 70
neulab/Pangea-7B

Updated Oct 24, 2024 • 7.73k • 122
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • Updated 7 days ago • 445 • 60
OpenGVLab/InternVL2-1B

Image-Text-to-Text • Updated 28 days ago • 59.6k • 59
OpenGVLab/InternVL2-2B

Image-Text-to-Text • Updated 28 days ago • 69.3k • 65
OpenGVLab/Mono-InternVL-2B

Image-Text-to-Text • Updated Nov 21, 2024 • 3.89k • 30
OpenGVLab/OmniCorpus-YT

Updated Nov 17, 2024 • 527 • 9
OpenGVLab/OmniCorpus-CC-210M

Viewer • Updated Nov 17, 2024 • 208M • 381 • 19
OpenGVLab/OmniCorpus-CC

Viewer • Updated Nov 17, 2024 • 986M • 20.5k • 12
OpenGVLab/InternVideo2_chat_8B_HD

Video-Text-to-Text • Updated 28 days ago • 684 • 17
OpenGVLab/ViCLIP

Updated Jun 7, 2024 • 33
OpenGVLab/ASMv2

Text Generation • Updated Feb 29, 2024 • 70 • 17
OpenGVLab/VideoChat2-IT

Viewer • Updated Jun 29, 2024 • 1.82M • 666 • 47
NimVideo/cogvideox-2b-img2vid

Image-to-Video • Updated Oct 28, 2024 • 260 • 63
BAAI/Infinity-MM

Updated Dec 13, 2024 • 17.6k • 87
nvidia/RADIO-H

Updated Dec 2, 2024 • 2.26k • 9
Spawning/PD12M

Viewer • Updated 6 days ago • 12.4M • 1.43k • 146
Shitao/OmniGen-v1

Text-to-Image • Updated Nov 7, 2024 • 9.43k • 280
InstantX/InstantIR

Image-to-Image • Updated Nov 7, 2024 • 5 • 163
nvidia/Cosmos-0.1-Tokenizer-DI8x8

Updated 21 days ago • 293 • 9
BAAI/Emu3-Chat

Text Generation • Updated Oct 24, 2024 • 829 • 71
briaai/RMBG-2.0

Image Segmentation • Updated 23 days ago • 309k • 577
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published Nov 11, 2024 • 20
rain1011/pyramid-flow-miniflux

Text-to-Video • Updated Nov 13, 2024 • 160
OpenGVLab/InternVL2-8B-MPO

Image-Text-to-Text • Updated 26 days ago • 1.45k • 34
mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated 20 days ago • 4 • 388
briaai/BRIA-2.3

Text-to-Image • Updated Nov 19, 2024 • 395 • 31
microsoft/Reducio-VAE

Updated Nov 21, 2024 • 10 • 15
Lightricks/LTX-Video

Image-to-Video • Updated 27 days ago • 88.4k • 860
apple/aimv2-3B-patch14-448

Image Feature Extraction • Updated Nov 28, 2024 • 391 • 8
THUdyh/Insight-V-Reason

Text Generation • Updated Nov 22, 2024 • 31 • 9
black-forest-labs/FLUX.1-Fill-dev

Updated Nov 25, 2024 • 37.5k • 468
Efficient-Large-Model/Sana_1600M_512px

Text-to-Image • Updated 5 days ago • 602 • 37
Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated 5 days ago • 4.78k • 169
AIDC-AI/Ovis1.6-Gemma2-27B

Image-Text-to-Text • Updated Dec 10, 2024 • 991 • 58
HuggingFaceTB/SmolVLM-Base

Image-Text-to-Text • Updated Nov 28, 2024 • 8.14k • 51
THUDM/glm-edge-v-5b

Image-Text-to-Text • Updated 13 days ago • 131 • 11
rhymes-ai/Aria-Base-64K

Image-Text-to-Text • Updated Dec 1, 2024 • 3.62k • 12
allenai/pixmo-point-explanations

Viewer • Updated Dec 5, 2024 • 79.6k • 244 • 6
tencent/HunyuanVideo

Text-to-Video • Updated 28 days ago • 8.59k • 1.42k
tencent/HunyuanVideo-PromptRewrite

Updated Dec 6, 2024 • 79 • 41
google/paligemma2-28b-pt-896

Image-Text-to-Text • Updated Dec 5, 2024 • 880 • 45
OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 28 days ago • 6.95k • 160
MAmmoTH-VL/MAmmoTH-VL-8B

Updated Dec 9, 2024 • 457 • 16
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M

Viewer • Updated 10 days ago • 37M • 5.05k • 40
OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text • Updated 29 days ago • 64 • 9
BGLab/BioTrove

Viewer • Updated Dec 13, 2024 • 163M • 602 • 7
TencentARC/NVComposer

Image-to-3D • Updated about 1 month ago • 182 • 7
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated 28 days ago • 2.37k • 133
FastVideo/FastHunyuan

Text-to-Video • Updated 8 days ago • 792 • 148
BAAI/nova-d48w1536-sdxl1024

Text-to-Image • Updated 25 days ago • 53 • 7
IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated 21 days ago • 17.8k • 579
Infinigence/Megrez-3B-Omni

Updated 29 days ago • 744 • 124
microsoft/VidTok

Updated 1 day ago • 27
TIGER-Lab/Mantis-8B-siglip-llama3

Image-Text-to-Text • Updated Nov 15, 2024 • 12k • 32
OpenGVLab/HoVLE-HD

Image-Text-to-Text • Updated 22 days ago • 100 • 7
nyu-visionx/cambrian-34b

Text Generation • Updated Jun 28, 2024 • 53 • 28
nyu-visionx/cambrian-phi3-3b

Text Generation • Updated Jul 6, 2024 • 47 • 11
nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 1.19k • 32
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World

Updated 5 days ago • 719 • 26
nvidia/Cosmos-1.0-Diffusion-14B-Video2World

Updated 5 days ago • 1.73k • 42
nvidia/Cosmos-1.0-Diffusion-14B-Text2World

Updated 5 days ago • 40.3k • 36
nvidia/Cosmos-1.0-Autoregressive-12B

Updated 5 days ago • 643 • 24
StephanST/WALDO30

Object Detection • Updated Oct 9, 2024 • 203
ByteDance/Sa2VA-8B

Image-Text-to-Text • Updated 1 day ago • 754 • 35
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448

Video-Text-to-Text • Updated about 8 hours ago • 174 • 4
OpenGVLab/VideoMAEv2-giant

Video Classification • Updated 1 day ago • 6 • 1
