✨ MiniMax-Text-01:
- 456B parameters, with 45.9B activated per token
- Combines Lightning Attention, Softmax Attention, and MoE for optimal performance
- Training context up to 1M tokens; inference handles up to 4M tokens
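To put the MoE figures in perspective, here is a small arithmetic sketch using only the two parameter counts quoted in the post. The variable names are illustrative, not anything from the model's codebase.

```python
# Parameter counts quoted in the post, in billions.
total_params_b = 456.0
active_params_b = 45.9

# Fraction of the model's parameters used for any single token.
active_fraction = active_params_b / total_params_b

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

In other words, roughly one tenth of the weights take part in each token's forward pass, which is where the MoE design saves compute.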
✨ MiniMax-VL-01:
- ViT-MLP-LLM framework (non-transformer 👀)
- Handles image inputs from 336×336 to 2016×2016
- 694M image-caption pairs + 512B tokens processed across 4 stages
MiniCPM-o 2.6 🔥 an end-side multimodal LLM released by OpenBMB from the Chinese community
Model: openbmb/MiniCPM-o-2_6
✨ Real-time English/Chinese conversation, emotion control and ASR/STT
✨ Real-time video/audio understanding
✨ Processes up to 1.8M pixels, leads OCRBench & supports 30+ languages
🎯 The Space documents the content of an input image alongside standardized plain text. It includes adjustment tools with over 30 font styles, export to PDF and DOCX, text alignment, font size adjustment, and line spacing controls.
📄 PDFs are rendered with the ReportLab toolkit.
🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.
The Hugging Face Download Tool is a sophisticated graphical user interface application designed to simplify the process of downloading resources from Hugging Face repositories. This tool addresses common challenges in model and file downloads through its intelligent features and user-friendly interface.
✨ Key Features
- 🖥️ Intuitive graphical interface for easy operation
- 🔄 Advanced retry mechanism with smart error handling
- ⏸️ Resume capability for interrupted downloads
- 📊 Real-time download status monitoring
- 🔐 Secure access to private repositories via token authentication
🛠️ Technical Highlights
The tool implements several advanced features to ensure reliable downloads:
- 📦 Chunk-based downloading with 1MB segments
- ⚡ Adaptive retry intervals (5-300 seconds) based on error types
- 🔌 Connection pooling for optimized performance
- 🛡️ Built-in rate limiting protection
- 🔑 Secure token handling for private repository access
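As a rough illustration of the first two highlights, the sketch below shows one way 1MB chunking with a resume offset and a 5-300 second retry window could fit together. It is not the tool's actual code: `BASE_DELAY`, `retry_delay`, and `copy_in_chunks` are hypothetical names, the per-error base delays are invented for the example, and in-memory streams stand in for the network.

```python
import io

CHUNK_SIZE = 1024 * 1024       # 1 MB segments, as stated in the post
MIN_DELAY, MAX_DELAY = 5, 300  # retry interval bounds in seconds, from the post

# Hypothetical base delays per error type; the post does not give the real mapping.
BASE_DELAY = {"timeout": 5, "connection": 10, "rate_limit": 60}


def retry_delay(error_type: str, attempt: int) -> int:
    """Exponential backoff from a per-error base, clamped to [MIN_DELAY, MAX_DELAY]."""
    base = BASE_DELAY.get(error_type, MIN_DELAY)
    return max(MIN_DELAY, min(MAX_DELAY, base * 2 ** attempt))


def copy_in_chunks(src, dst, offset: int = 0) -> int:
    """Copy src to dst in CHUNK_SIZE pieces, starting at offset (the resume point).

    Returns the position reached in src, which a caller could store
    and pass back as offset after an interruption.
    """
    src.seek(offset)
    position = offset
    while True:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:
            break
        dst.write(chunk)
        position += len(chunk)
    return position


# Simulate resuming a download after the first 1 MB was already saved.
remote = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 10))
local = io.BytesIO()
reached = copy_in_chunks(remote, local, offset=CHUNK_SIZE)
```

A real downloader would replace the in-memory source with an HTTP request that asks for bytes from the stored offset onward, and would sleep for `retry_delay(...)` seconds between failed attempts.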
This tool is ideal for researchers, developers, and AI practitioners who regularly work with Hugging Face resources and need a reliable, user-friendly download solution. 💻 It supports all major operating systems and requires minimal setup, making it accessible to users of all technical levels. 🚀
🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.
QvQ-72B-Preview 🎄 an open-weight model for visual reasoning just released by the Alibaba_Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving
Megrez-3B-Omni 🔥 an on-device multimodal LLM by Infinigence AI, another startup emerging from the Tsinghua University ecosystem.
Model: Infinigence/Megrez-3B-Omni
Demo: Infinigence/Megrez-3B-Omni
✨ Supports analysis of image, text, and audio modalities
✨ Leads in bilingual speech (English & Chinese) input, multi-turn conversations, and voice-based queries
✨ Outperforms in scene understanding and OCR across major benchmarks
reacted to di-zhang-fdu's post with 🔥 about 1 month ago
LLaMA-O1-PRM and LLaMA-O1-Reinforcement will be released this weekend. We have implemented a novel reinforcement fine-tuning (RFT) pipeline that teaches models reasoning and reward labeling without human annotation.