lowres (That Time I got Reincarnated as a Hugging Face Organization)

Tonic

posted an update 1 day ago

Post

1427

🙋🏻‍♂️Hey there folks , Open LLM Europe just released Lucie 7B-Instruct model , a billingual instruct model trained on open data ! You can check out my unofficial demo here while we wait for the official inference api from the group : Tonic/Lucie-7B hope you like it 🚀

Tonic

posted an update 7 days ago

Post

1597

microsoft just released Phi-4 , check it out here : Tonic/Phi-4

hope you like it :-)

not-lain

updated a Space 10 days ago

Running

1

🌘W🌖

SigLIP Tagger

s3nh

posted an update 27 days ago

Post

1783

Welcome back,

Small Language Models Enthusiasts and GPU Poor oss enjoyers lets connect.
Just created an organization which main target is to have fun with smaller models tuneable on consumer range GPUs, feel free to join and lets have some fun, much love ;3

https://huggingface.co/SmolTuners

3 replies

·

peaceAsh

authored a paper 29 days ago

Maya: An Instruction Finetuned Multilingual Multimodal Model

Paper • 2412.07112 • Published Dec 10, 2024 • 26

eienmojiki

posted an update about 1 month ago

Post

1464

👀 Introducing 2048 Game API: A RESTful API for the Classic Puzzle Game 🧩

I'm excited to share my latest project, 2048 Game API, a RESTful API that allows you to create, manage, and play games of 2048, a popular puzzle game where players slide numbered tiles to combine them and reach the goal of getting a tile with the value of 2048.

⭐ Features
Create new games with customizable board sizes (3-8)
Make moves (up, down, left, right) and get the updated game state
Get the current game state, including the board, score, and game over status
Delete games
Generate images of the game board with customizable themes (light and dark)

🔗 API Endpoints
POST /api/games - Create a new game
GET /api/games/:gameId - Get the current game state
POST /api/games/:gameId/move - Make a move (up, down, left, right)
DELETE /api/games/:gameId - Delete a game
GET /api/games/:gameId/image - Generate an image of the game board

🧩 Example Use Cases
- Create a new game with a 4x4 board:

curl -X POST -H "Content-Type: application/json" -d '{"size": 4}' http://localhost:3000/api/games

- Make a move up:

curl -X POST -H "Content-Type: application/json" -d '{"direction": "up"}' http://localhost:3000/api/games/:gameId/move

- Get the current game state:

curl -X GET http://localhost:3000/api/games/:gameId

💕 Try it out!
- Demo: eienmojiki/2048
- Source: https://github.com/kogakisaki/koga-2048
- You can try out the API by running the server locally or using a tool like Postman to send requests to the API. I hope you enjoy playing 2048 with this API!

Let me know if you have any questions or feedback!

🐧 Mouse1 is our friend🐧

lunarflu

posted an update about 1 month ago

Post

1623

great blogpost! 🔥@wolfram
https://huggingface.co/blog/wolfram/llm-comparison-test-2024-12-04

spedrox-sac

updated a Space about 1 month ago

Sleeping

🦀

Ayanokoji Kiyokata

Boku wa Ayanokoji Kiyokata des.

cappuch

posted an update about 2 months ago

Post

1593

p104-100s are beasts. 8 gigs of VRAM, 12 tok/s on qwen 14b at q4, and 18 tok/s on 7b at q6. best thing - 20 euros each.

https://furry.engineer/@cappuch/113500349547803802

3 replies

·

louisbrulenaudet

posted an update 2 months ago

Post

1829

I’ve published a new dataset to simplify model merging 🤗

This dataset facilitates the search for compatible architectures for model merging with @arcee_ai’s mergekit, streamlining the automation of high-performance merge searches 📖

Dataset : louisbrulenaudet/mergekit-configs

1 reply

·

Tonic

posted an update 2 months ago

Post

3546

🙋🏻‍♂️hey there folks,

periodic reminder : if you are experiencing ⚠️500 errors ⚠️ or ⚠️ abnormal spaces behavior on load or launch ⚠️

we have a thread 👉🏻 https://discord.com/channels/879548962464493619/1295847667515129877

if you can record the problem and share it there , or on the forums in your own post , please dont be shy because i'm not sure but i do think it helps 🤗🤗🤗

2 replies

·

louisbrulenaudet

posted an update 3 months ago

Post

1217

Introducing Lemone-router, a series of classification models designed to produce an optimal multi-agent system for different branches of tax law.

Trained on a base of 49k lines comprising a set of synthetic questions generated by GPT-4 Turbo and Llama 3.1 70B, which have been further refined through evol-instruction tuning and manual curation and authority documents, these models are based on an 8-category decomposition of the classification scheme derived from the Bulletin officiel des finances publiques - impôts :

label2id = {
    "Bénéfices professionnels": 0,
    "Contrôle et contentieux": 1,
    "Dispositifs transversaux": 2,
    "Fiscalité des entreprises": 3,
    "Patrimoine et enregistrement": 4,
    "Revenus particuliers": 5,
    "Revenus patrimoniaux": 6,
    "Taxes sur la consommation": 7
}
	
id2label = {
    0: "Bénéfices professionnels",
    1: "Contrôle et contentieux",
    2: "Dispositifs transversaux",
    3: "Fiscalité des entreprises",
    4: "Patrimoine et enregistrement",
    5: "Revenus particuliers",
    6: "Revenus patrimoniaux",
    7: "Taxes sur la consommation"
}

It achieves the following results on the evaluation set:
- Loss: 0.4734
- Accuracy: 0.9191

Link to the collection: louisbrulenaudet/lemone-router-671cce21d6410f3570514762

Tonic

posted an update 3 months ago

Post

1155

boomers still pick zenodo.org instead of huggingface ??? absolutely clownish nonsense , my random datasets have 30x more downloads and views than front page zenodos ... gonna write a comparison blog , but yeah... cringe.

1 reply

·

Tonic

posted an update 3 months ago

Post

833

🙋🏻‍♂️ hey there folks ,

really enjoying sharing cool genomics and protein datasets on the hub these days , check out our cool new org : https://huggingface.co/seq-to-pheno

scroll down for the datasets, still figuring out how to optimize for discoverability , i do think on that part it will be better than zenodo[dot}org , it would be nice to write a tutorial about that and compare : we already have more downloads than most zenodo datasets from famous researchers !

Tonic

posted an update 3 months ago

Post

1456

hey there folks,

twitter is aweful isnt it ? just getting into the habbit of using hf/posts for shares 🦙🦙

Tonic/on-device-granite-3.0-1b-a400m-instruct

new granite on device instruct model demo , hope you like it 🚀🚀

Tonic

posted an update 3 months ago

Post

995

if you're encountering 500 errors on spaces that seem to work otherwise , kindly consider screenshotting and sharing the link here : https://discord.com/channels/879548962464493619/1295847667515129877

7 replies

·

louisbrulenaudet

posted an update 3 months ago

Post

3119

🚨 I have $3,500 in Azure credits, including access to an H100 (96 Go), expiring on November 12, 2024.

I won’t be able to use it all myself, so I’m reaching out to the @huggingface community: Are there any open-source projets with data ready for some compute power?

Let’s collaborate and make the most of it together 🔗

9 replies

·

Tonic

posted an update 3 months ago

Post

2741

🙋🏻‍♂️hey there folks ,

did you know that https://huggingface.co/lmms-lab released a new version of 🌋🌋Llava on thursday ? Now it has 🎥video understanding !
check it out 👇🏻

collection : lmms-lab/llava-video-661e86f5e8dabc3ff793c944
demo : Tonic/Llava-Video

Tonic

posted an update 3 months ago

Post

1858

🙋🏻‍♂️ Hey there folks ,

🦎Salamandra release by @mvillegas and team
@BSC_CNS https://huggingface.co/BSC-LT is absolutely impressive so far !

perhaps the largest single training dataset of high quality text to date of 7.8 trillion tokens in 35 European languages and code.

the best part : the data was correctly licenced so it's actually future-proof!

the completions model is really creative and instruct fine tuned version is very good also.

now you can use such models for multi-lingual enterprise applications with further finetunes , long response generation, structured outputs (coding) also works.

check out 👇🏻
the collection : BSC-LT/salamandra-66fc171485944df79469043a
the repo : https://github.com/langtech-bsc/salamandra
7B-Instruct demo : Tonic/Salamandra-7B

louisbrulenaudet

posted an update 3 months ago

Post

2113

My biggest release of the year: a series of 7 specialized embedding models for information retrieval within tax documents, is now available for free on Hugging Face 🤗

These new models aim to offer an open source alternative for in-domain semantic search from large text corpora and will improve RAG systems and context addition for large language models.

Trained on more than 43 million tax tokens derived from semi-synthetic and raw-synthetic data, enriched by various methods (in particular MSFT's evol-instruct by @intfloat ), and corrected by humans, this project is the fruit of hundreds of hours of work and is the culmination of a global effort to open up legal technologies that has only just begun.

A big thank you to Microsoft for Startups for giving me access to state-of-the-art infrastructure to train these models, and to @julien-c , @clem 🤗, @thomwolf and the whole HF team for the inference endpoint API and the generous provision of Meta LLama-3.1-70B. Special thanks also to @tomaarsen for his invaluable advice on training embedding models and Loss functions ❤️

Models are available on my personal HF page, into the Lemone-embed collection: louisbrulenaudet/lemone-embed-66fdc24000df732b395df29b

1 reply

·

That Time I got Reincarnated as a Hugging Face Organization

AI & ML interests

Recent Activity

lowres's activity

SigLIP Tagger

Maya: An Instruction Finetuned Multilingual Multimodal Model

Ayanokoji Kiyokata

AI & ML interests

Recent Activity

Team members 192

lowres's activity

SigLIP Tagger

Ayanokoji Kiyokata