arxiv:2412.04445
Yixiao Ge
yxgeee
AI & ML interests
Computer Vision, Foundation Models
Recent Activity
authored a paper about 1 month ago
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
authored a paper about 1 month ago
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
authored a paper 4 months ago
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Organizations
Papers: 18
Models: None public yet
Datasets: None public yet