Krinal Joshi

krinal

kjdeveloper8

AI & ML interests

NLP, Speech

krinal's activity

reacted to alibabasglab's post with 👍 about 8 hours ago

Post

1677

We are thrilled to present the improved "ClearerVoice-Studio", an open-source platform designed to make speech processing easy to use for everyone! Whether you’re working on speech enhancement, speech separation, speech super-resolution, or target speaker extraction, this unified platform has you covered.

**Why Choose ClearerVoice-Studio?**

- Pre-Trained Models: Includes cutting-edge pre-trained models, fine-tuned on extensive, high-quality datasets. No need to start from scratch!
- Ease of Use: Designed for seamless integration with your projects, offering a simple yet flexible interface for inference and training.

**Where to Find Us?**

- GitHub Repository: ClearerVoice-Studio (https://github.com/modelscope/ClearerVoice-Studio)
- Try Our Demo: Hugging Face Space (https://huggingface.co/spaces/alibabasglab/ClearVoice)

**What Can You Do with ClearerVoice-Studio?**

- Enhance noisy speech recordings to achieve crystal-clear quality.
- Separate speech from complex audio mixtures with ease.
- Transform low-resolution audio into high-resolution audio. A full upscaled LJSpeech-1.1-48kHz dataset is available for download (https://huggingface.co/datasets/alibabasglab/LJSpeech-1.1-48kHz).
- Extract target speaker voices with precision using audio-visual models.

**Join Us in Growing ClearerVoice-Studio!**

We believe in the power of open-source collaboration. By starring our GitHub repository and sharing ClearerVoice-Studio with your network, you can help us grow this community-driven platform.

**Support us by:**

- Starring it on GitHub.
- Exploring and contributing to our codebase.
- Sharing your feedback and use cases to make the platform even better.
- Joining our community discussions to exchange ideas and innovations.

Together, let’s push the boundaries of speech processing! Thank you for your support! 💖

reacted to AdinaY's post with 🔥 about 8 hours ago

Post

1483

MiniCPM-o2.6 🔥 an end-side multimodal LLM released by OpenBMB, from the Chinese community
Model: https://huggingface.co/openbmb/MiniCPM-o-2_6
✨ Real-time English/Chinese conversation, emotion control and ASR/STT
✨ Real-time video/audio understanding
✨ Processes up to 1.8M pixels, leads OCRBench & supports 30+ languages

liked a model about 8 hours ago

openbmb/MiniCPM-o-2_6

Any-to-Any • Updated 39 minutes ago • 1.46k • 303

reacted to davidberenstein1957's post with 👍 about 8 hours ago

Post

1084

🔦 What? The Hub as a vector search backend!

code: https://gist.github.com/davidberenstein1957/f0157a471ec59d9dd44ae6957f1d52ec
built on DuckDB: https://huggingface.co/docs/hub/en/datasets-duckdb

reacted to lamhieu's post with 👍 about 8 hours ago

Post

904

Unlock seamless document conversion with Docsifer, powered by MarkItDown at its core! 🚀 Effortlessly transform PDFs, Word, Excel, images, audio, HTML, and more into clean, structured Markdown—perfect for developers, writers, and content creators. With optional LLM-enhanced extraction and robust format support, Docsifer ensures accuracy, speed, and privacy.
🌟 Try it now and experience professional-grade Markdown conversion: lamhieu/docsifer