Joseph

Joseph717171

AI & ML interests

None yet

Organizations

Hugging Face Discord Community

Joseph717171's activity

reacted to merve's post with ā¤ļø about 20 hours ago
there's a new multimodal retrieval model in town šŸ¤ 
LlamaIndex released vdr-2b-multi-v1
> uses 70% fewer image tokens, yet outperforms other dse-qwen2-based models
> 3x faster inference with less VRAM šŸ’Ø
> shrinkable with matryoshka šŸŖ†
> can do cross-lingual retrieval!
Collection: llamaindex/visual-document-retrieval-678151d19d2758f78ce910e1 (with models and datasets)
Demo: llamaindex/multimodal_vdr_demo
Learn more from their blog post here https://huggingface.co/blog/vdr-2b-multilingual šŸ“–
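
For intuition on the matryoshka shrinking mentioned above, here is a minimal, framework-free sketch: keep only a prefix of the embedding and re-normalize before scoring. The 1536-dim size and the random vectors are placeholders, not vdr-2b-multi-v1's actual outputs or API.

```python
# Minimal sketch of "matryoshka" shrinking: truncate an embedding to a smaller
# prefix and re-normalize, then compare similarities at each size.
# The 1536-dim size and random vectors are stand-ins, not real model outputs --
# swap in embeddings from vdr-2b-multi-v1 to use this for real.
import numpy as np

rng = np.random.default_rng(0)
query_emb = rng.standard_normal(1536)   # stand-in for a query embedding
doc_emb = rng.standard_normal(1536)     # stand-in for a document/page embedding

def shrink(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep only the first `dim` components and re-normalize to unit length."""
    v = vec[:dim]
    return v / np.linalg.norm(v)

for dim in (1536, 512, 128):
    q, d = shrink(query_emb, dim), shrink(doc_emb, dim)
    print(f"dim={dim:4d}  cosine={float(q @ d):+.4f}")
```
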
reacted to Tonic's post with šŸš€šŸ”„ 6 days ago
Microsoft just released Phi-4, check it out here: Tonic/Phi-4

hope you like it :-)
reacted to singhsidhukuldeep's post with šŸš€šŸ§  24 days ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
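
A toy sketch of the entropy-driven patching idea, assuming a crude sliding-window entropy estimate in place of BLT's small learned byte-level model; the threshold and window size below are illustrative only.

```python
# Toy illustration of entropy-driven patching: start a new patch whenever the
# estimated next-byte entropy crosses a threshold, so unpredictable regions get
# more, smaller patches. BLT uses a small learned byte LM for these entropies;
# the unigram sliding-window estimate and threshold here are stand-ins.
import math
from collections import Counter

def byte_entropies(data: bytes, window: int = 16) -> list[float]:
    """Crude per-position entropy estimate from a window of preceding bytes."""
    ents = []
    for i in range(len(data)):
        ctx = data[max(0, i - window):i] or b"\x00"
        counts = Counter(ctx)
        total = len(ctx)
        ents.append(-sum(c / total * math.log2(c / total) for c in counts.values()))
    return ents

def dynamic_patches(data: bytes, threshold: float = 2.0) -> list[bytes]:
    """Split a byte sequence into variable-sized patches at high-entropy positions."""
    ents = byte_entropies(data)
    patches, start = [], 0
    for i, h in enumerate(ents):
        if h > threshold and i > start:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

text = "aaaaaaaa entropy spikes where bytes get unpredictable!".encode()
print([p.decode(errors="replace") for p in dynamic_patches(text)])
```
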

Three-Component Architecture:
ā€¢ Lightweight Local Encoder that converts bytes to patch representations
ā€¢ Powerful Global Latent Transformer that processes patches
ā€¢ Local Decoder that converts patches back to bytes

>> Technical Advantages
ā€¢ Matches performance of Llama 3 at 8B parameters while being more efficient
ā€¢ Superior handling of non-English languages and rare character sequences
ā€¢ Remarkable 99.9% accuracy on spelling tasks
ā€¢ Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
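
A small sketch of the hash n-gram embedding idea: each byte position looks up extra embeddings for the n-grams ending there, hashed into a fixed-size table, and adds them to its plain byte embedding. The table size, n-gram lengths, and use of Python's built-in hash() are placeholders for whatever hashing scheme the paper actually uses.

```python
# Hash n-gram embeddings, sketched: for each position, hash the trailing
# n-grams into buckets of a fixed-size embedding table and add those vectors
# to the byte embedding. Python's salted hash() and the sizes below are
# placeholders, not BLT's actual hashing scheme.
import torch
import torch.nn as nn

class HashNGramEmbedding(nn.Module):
    def __init__(self, dim=64, table_size=2**16, ngram_sizes=(3, 4, 5)):
        super().__init__()
        self.byte_embed = nn.Embedding(256, dim)
        self.ngram_embed = nn.Embedding(table_size, dim)
        self.table_size = table_size
        self.ngram_sizes = ngram_sizes

    def forward(self, data: bytes) -> torch.Tensor:
        ids = torch.tensor(list(data))
        base = self.byte_embed(ids)                   # (T, dim) plain byte embeddings
        rows = []
        for i in range(len(data)):
            vec = base[i]
            for n in self.ngram_sizes:
                if i + 1 >= n:
                    bucket = hash(data[i + 1 - n:i + 1]) % self.table_size
                    vec = vec + self.ngram_embed(torch.tensor(bucket))
            rows.append(vec)
        return torch.stack(rows)                      # (T, dim)

emb = HashNGramEmbedding()
print(emb(b"byte latent transformer").shape)  # torch.Size([23, 64])
```
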

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
Ā·