Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 23, 2024 • 28
CAT3D: Create Anything in 3D with Multi-View Diffusion Models Paper • 2405.10314 • Published May 16, 2024 • 46
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels Paper • 2405.07526 • Published May 13, 2024 • 18
LogoMotion: Visually Grounded Code Generation for Content-Aware Animation Paper • 2405.07065 • Published May 11, 2024 • 17
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published Apr 30, 2024 • 15
Stylus: Automatic Adapter Selection for Diffusion Models Paper • 2404.18928 • Published Apr 29, 2024 • 15
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models Paper • 2404.17672 • Published Apr 26, 2024 • 19
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Apr 25, 2024 • 17
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published Apr 25, 2024 • 56
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Paper • 2404.16994 • Published Apr 25, 2024 • 36
HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections Paper • 2404.16845 • Published Feb 14, 2024 • 7
Interactive3D: Create What You Want by Interactive 3D Generation Paper • 2404.16510 • Published Apr 25, 2024 • 19
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Paper • 2404.14507 • Published Apr 22, 2024 • 22