Minimal single-script implementation of knowledge distillation for LLMs. In this implementation, GPT-2 (124M) is the student model and GPT-2 Medium (355M) is the teacher; the student is trained with a reverse Kullback-Leibler (KL) divergence objective on a small chunk of OpenWebText.
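
For reference, a minimal sketch of the reverse KL objective in PyTorch is shown below. The function name, temperature handling, and model loading are illustrative assumptions, not necessarily the exact code in the script:

```python
import torch
import torch.nn.functional as F

def reverse_kl_loss(student_logits: torch.Tensor,
                    teacher_logits: torch.Tensor,
                    temperature: float = 1.0) -> torch.Tensor:
    """Reverse KL, i.e. KL(p_student || p_teacher), averaged over tokens.

    student_logits, teacher_logits: (batch, seq_len, vocab_size)
    """
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits / temperature, dim=-1)
    student_probs = student_log_probs.exp()
    # KL(p || q) = sum_x p(x) * (log p(x) - log q(x)), summed over the vocab
    kl = (student_probs * (student_log_probs - teacher_log_probs)).sum(dim=-1)
    # Scale by T^2 so gradients stay comparable across temperatures (common convention)
    return kl.mean() * (temperature ** 2)


# Hypothetical usage with Hugging Face models (assumes `transformers` is installed):
# from transformers import AutoModelForCausalLM
# student = AutoModelForCausalLM.from_pretrained("gpt2")          # 124M
# teacher = AutoModelForCausalLM.from_pretrained("gpt2-medium")   # 355M
# with torch.no_grad():
#     teacher_logits = teacher(input_ids).logits
# loss = reverse_kl_loss(student(input_ids).logits, teacher_logits)
```

Reverse KL (student distribution first) is mode-seeking: it pushes the student to concentrate probability where the teacher is confident, rather than spreading mass over all teacher modes as forward KL does.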