Hugo Laurençon

HugoLaurencon

Articles

Docmatix - a huge dataset for Document Visual Question Answering

Jul 18, 2024

• 72

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15, 2024

• 171

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Mar 15, 2024

• 7

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

Aug 22, 2023

• 28

Putting ethical principles at the core of research lifecycle

May 19, 2022

HugoLaurencon's activity

upvoted a paper about 23 hours ago

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published 4 days ago • 46

upvoted a paper 12 days ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 14 days ago • 93

liked a dataset 12 days ago

DAMO-NLP-SG/multimodal_textbook

Updated 4 days ago • 7.21k • 103

New activity in HuggingFaceM4/idefics2-8b 13 days ago

Seems like the user prompt is ignored

#80 opened 26 days ago by

jlmeunier

New activity in OS-Copilot/OS-Genesis-7B-AC 13 days ago

Permission error to access data

#1 opened 13 days ago by

HugoLaurencon

upvoted 2 papers 23 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 27 days ago • 340

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 124

New activity in HuggingFaceM4/idefics2-8b 23 days ago

Seems like the user prompt is ignored

#80 opened 26 days ago by

jlmeunier

upvoted 2 papers about 1 month ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 138

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 102

commented on a paper about 1 month ago

CompCap: Improving Multimodal Large Language Models with Composite Captions

Paper • 2412.05243 • Published Dec 6, 2024 • 18

upvoted 2 papers about 1 month ago

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Paper • 2412.04626 • Published Dec 5, 2024 • 13

HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing

Paper • 2412.04280 • Published Dec 5, 2024 • 13

liked a model about 2 months ago

Qwen/QwQ-32B-Preview

Text Generation • Updated 4 days ago • 145k • 1.55k