nanoLLaVA-1.5 is here! Same size (1B), better performance, and much more powerful than v1.0. Try it out now on HF Spaces: qnguyen3/nanoLLaVA | Model: qnguyen3/nanoLLaVA-1.5
Introducing nanoLLaVA, a powerful multimodal AI model that packs the capabilities of a 1B-parameter vision language model into just 5 GB of VRAM. This makes it an ideal choice for edge devices, bringing cutting-edge visual understanding and generation to resource-constrained hardware like never before.
Under the hood, nanoLLaVA is based on the powerful vilm/Quyen-SE-v0.1 (my Qwen1.5-0.5B finetune) and Google's impressive google/siglip-so400m-patch14-384. The model is trained using a data-centric approach to ensure optimal performance.
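For reference, here is a minimal loading sketch. It assumes the checkpoint ships a custom LLaVA-style implementation loaded via `trust_remote_code=True`; the image-preprocessing helper and the `images=` argument to `generate()` are assumptions, so check the model card for the canonical calls.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qnguyen3/nanoLLaVA-1.5"

# Standard transformers loading; the multimodal plumbing (SigLIP vision tower
# + Qwen1.5-0.5B language model) lives in the repo's remote code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 keeps the footprint within a few GB of VRAM
    device_map="auto",
    trust_remote_code=True,
)

image = Image.open("example.jpg")
prompt = "Describe this image."

# Assumed LLaVA-style interface exposed by the remote code: an image
# preprocessing helper plus an `images=` kwarg on generate(). The exact
# signatures may differ; see the model card for the official snippet.
image_tensor = model.process_images([image], model.config).to(dtype=model.dtype)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
output_ids = model.generate(input_ids, images=image_tensor, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```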
In the spirit of transparency and collaboration, all code and model weights are open-sourced under the Apache 2.0 license.
Current LLMs are highly susceptible to generating toxic, harmful, and even dangerous content; they can also produce outputs with gender or racial biases. The Biden-Harris Executive Order (https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence) sets forth guidelines on what constitutes a safe AI system. Following these guidelines, we present the world's first open-source, Biden-Harris Executive Order red-teamed multilingual language model: Aurora-M. Inspired by BigScience, the model is trained on five languages: English, Hindi, Japanese, Vietnamese, and Finnish.
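Since Aurora-M is a standard causal language model, the usual transformers text-generation flow applies. A minimal sketch follows; the Hub repository id below is a placeholder assumption, substitute the actual Aurora-M checkpoint name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with the actual Aurora-M checkpoint on the Hub.
model_id = "aurora-m/aurora-m-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The model is trained on English, Hindi, Japanese, Vietnamese, and Finnish,
# so prompts in any of those languages follow the same generation flow.
prompt = "Kirjoita lyhyt tervehdys suomeksi:"  # "Write a short greeting in Finnish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```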