-
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Paper • 2407.15841 • Published • 40 -
Stable Audio Open
Paper • 2407.14358 • Published • 25 -
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
Paper • 2407.13976 • Published • 5 -
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Paper • 2407.14329 • Published • 5
Collections
Discover the best community collections!
Collections including paper arxiv:2407.13976
-
YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals
Paper • 2406.16273 • Published • 41 -
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
Paper • 2407.06191 • Published • 12 -
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
Paper • 2407.13976 • Published • 5 -
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
Paper • 2411.17945 • Published • 23
-
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
Paper • 2403.01807 • Published • 8 -
TripoSR: Fast 3D Object Reconstruction from a Single Image
Paper • 2403.02151 • Published • 13 -
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Paper • 2403.01779 • Published • 29 -
MagicClay: Sculpting Meshes With Generative Neural Fields
Paper • 2403.02460 • Published • 6
-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 11 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 13 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 21 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 7