
Joseph

Joseph717171

AI & ML interests

None yet

Organizations

Joseph717171's activity

reacted to merve's post with ❤️ about 20 hours ago

Post

3498

there's a new multimodal retrieval model in town 🤠
LlamaIndex released vdr-2b-multi-v1
> uses 70% fewer image tokens, yet outperforms other dse-qwen2 based models
> 3x faster inference with less VRAM 💨
> shrinkable with matryoshka 🪆
> can do cross-lingual retrieval!
Collection: https://huggingface.co/collections/llamaindex/visual-document-retrieval-678151d19d2758f78ce910e1 (with models and datasets)
Demo: https://huggingface.co/spaces/llamaindex/multimodal_vdr_demo
Learn more from their blog post here https://huggingface.co/blog/vdr-2b-multilingual 📖
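
For readers unfamiliar with the "shrinkable with matryoshka" point: Matryoshka-style embeddings are trained so that a leading slice of the vector is still usable on its own. The sketch below shows the generic idea only (keep the first `dim` values, then re-normalize). The function name `shrink_embedding` and the example vector are hypothetical, and nothing here is taken from the vdr-2b-multi-v1 code.

```python
import math
from typing import List


def shrink_embedding(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` values of an embedding and L2-normalize them.

    Generic Matryoshka-style truncation; assumes the model was trained so
    that leading dimensions carry the most information.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [x / norm for x in head]


# Hypothetical 3-dimensional embedding shrunk to 2 dimensions.
small = shrink_embedding([3.0, 4.0, 12.0], 2)
```

Shorter vectors cost less to store and compare, usually at some loss in retrieval quality; how much depends on the model.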

updated a model 4 days ago

Joseph717171/Hermes-3-Llama-3.1-8B-OQ8_0-F32.EF32.IQ4_K-Q8_0-GGUF

Updated 4 days ago • 750 • 2

updated a model 5 days ago

Joseph717171/Models

Updated 5 days ago • 485 • 3

New activity in Undi95/Phi4-abliterated 6 days ago

Awesome work, Undi95! This looks great!

#1 opened 6 days ago by

Joseph717171

liked a model 6 days ago

Undi95/Phi4-abliterated

Updated 6 days ago • 390 • 7

reacted to Tonic's post with 🚀🔥 6 days ago

Post

1598

Microsoft just released Phi-4, check it out here: Tonic/Phi-4

hope you like it :-)

liked a model 6 days ago

microsoft/phi-4

Text Generation • Updated 7 days ago • 72.3k • 1.3k

upvoted a paper 6 days ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published 12 days ago • 78

liked 2 models 8 days ago

nomic-ai/nomic-embed-text-v1.5

nomic-ai/modernbert-embed-base

New activity in cognitivecomputations/Dolphin3.0-Llama3.1-8B 10 days ago

Great Model Base for ERP!

#1 opened 10 days ago by

Joseph717171

liked a model 10 days ago

cognitivecomputations/Dolphin3.0-Llama3.1-8B

Updated 10 days ago • 2.05k • 112

liked a model 14 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 2 days ago • 15.6k • 1.43k

liked a model 21 days ago

NyxKrage/Microsoft_Phi-4

Updated Dec 13, 2024 • 7.06k • 54

upvoted a paper 21 days ago

Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published 23 days ago • 29

liked 2 models 23 days ago

black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16, 2024 • 628k • 3.23k

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Aug 16, 2024 • 1.27M • 8.02k

reacted to singhsidhukuldeep's post with 🚀🧠 24 days ago

Post

3638

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
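
A rough sketch of the dynamic-patching idea, assuming a simple rule where a new patch starts whenever the estimated next-byte entropy crosses a fixed threshold. The per-byte entropies would come from a small byte-level model in practice; here they are hand-written, hypothetical numbers, and `split_into_patches` is an illustrative name rather than anything from the BLT codebase.

```python
from typing import List


def split_into_patches(data: bytes, entropies: List[float], threshold: float) -> List[bytes]:
    """Group bytes into variable-sized patches.

    A new patch begins at every byte (after the first) whose entropy
    estimate exceeds `threshold`, so hard-to-predict regions end up in
    more, smaller patches.
    """
    if len(data) != len(entropies):
        raise ValueError("need exactly one entropy value per byte")
    patches: List[bytes] = []
    start = 0
    for i in range(1, len(data)):
        if entropies[i] > threshold:
            patches.append(data[start:i])
            start = i
    if data:
        patches.append(data[start:])
    return patches


# Hypothetical entropy estimates: high at the space and at the start of "world".
text = b"hello world"
ent = [3.0, 0.5, 0.4, 0.3, 0.2, 2.5, 2.8, 0.4, 0.3, 0.2, 0.1]
patches = split_into_patches(text, ent, threshold=2.0)
```

Because the expensive global model runs once per patch rather than once per byte, longer patches over predictable text are where the compute savings would come from.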

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
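
To make the "hash n-gram embeddings" remark concrete, here is a minimal sketch, assuming each byte position looks up an embedding row chosen by hashing the n-gram that ends at that position. The hash function (`zlib.crc32`), the table size, and the name `hash_ngram_ids` are illustrative assumptions, not the scheme used in the BLT paper.

```python
import zlib
from typing import List


def hash_ngram_ids(data: bytes, n: int, table_size: int) -> List[int]:
    """Return one embedding-table index per byte position.

    The index is a hash of the n-gram ending at that position, reduced
    modulo `table_size`. Positions near the start use a shorter n-gram.
    """
    if n < 1 or table_size < 1:
        raise ValueError("n and table_size must be positive")
    ids: List[int] = []
    for i in range(len(data)):
        gram = data[max(0, i - n + 1): i + 1]
        ids.append(zlib.crc32(gram) % table_size)  # crc32 is deterministic across runs
    return ids


# The trigram "abc" ends at positions 2 and 5, so both map to the same row.
ids = hash_ngram_ids(b"abcabc", n=3, table_size=1000)
```

Hashing keeps the table at a fixed size no matter how many distinct n-grams occur; the trade-off is that unrelated n-grams can collide on the same row.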

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!

3 replies