🎉 ...And we're live! 🎉 Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents" https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next! With @evijit, @giadap, and @sasha
Speaking of AI agents ... ...is easier with the right words ;)
My colleagues @meg, @evijit, @sasha, and @giadap just published a wonderful blog post outlining some of the main relevant notions with their signature blend of value-informed analysis and risk-benefit framing. Go have a read!
The paper includes a lot of experiments (they trained 84 models!) on what makes video LMs work ▶️
Try the demo for the best setup here: https://huggingface.co/spaces/Apollo-LMMs/Apollo-3B
They evaluate sampling strategies, scaling laws for models and datasets, video representations, and more!
> The authors find that design decisions made on small models also hold when the model and dataset are scaled up; scaling the dataset has diminishing returns for smaller models
> They evaluate frame-sampling strategies and find that FPS sampling beats uniform sampling, with 8-32 tokens per frame being optimal
> They also compare image encoders, trying a range of models from shape-optimized SigLIP to DINOv2, and find google/siglip-so400m-patch14-384 to be the most powerful
> They also compare freezing different parts of the model; training all stages with some parts frozen gives the best yield
They eventually release three models, with Apollo-3B outperforming most 7B models and Apollo-7B outperforming 30B models 🔥
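The FPS-vs-uniform comparison is easy to picture. A minimal sketch of the two frame-sampling strategies (index math only — an illustration, not the authors' implementation):

```python
def fps_sample(num_frames, video_fps, target_fps):
    """FPS sampling: pick frame indices at a fixed temporal rate.

    The number of sampled frames grows with video length."""
    step = video_fps / target_fps
    return [round(i * step) for i in range(int(num_frames / step))]

def uniform_sample(num_frames, num_samples):
    """Uniform sampling: pick a fixed number of frames spread evenly
    over the whole video, regardless of its length."""
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]

# A 10 s clip at 30 fps: FPS sampling at 2 fps yields 20 frames,
# 15 frame-indices apart; uniform sampling always yields the 8 we ask for.
print(fps_sample(300, 30, 2))
print(uniform_sample(300, 8))
```

FPS sampling preserves temporal density (motion looks the same regardless of clip length), which is one intuition for why it beats uniform sampling in the paper's evaluation.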
Did a fun experiment: What are the main themes emerging from the 100+ Nieman Journalism Lab predictions for 2025?
I used natural language processing to cluster and map them โ really helps spot patterns that weren't obvious when reading predictions one by one. So what will shape journalism next year? A lot of AI and US politics (surprise!), but there's also this horizontal axis that spans from industry strategies to deep reflections on how to talk to the public.
Click any dot to explore the original prediction. What themes surprise/interest you the most?
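The embed-and-cluster step can be sketched with a toy nearest-seed assignment. A real pipeline would use sentence embeddings plus k-means or UMAP, but the idea is the same; all texts, seeds, and the bag-of-words "embedding" below are illustrative assumptions:

```python
from collections import Counter
import math

def embed(text, vocab):
    """Bag-of-words vector over a fixed vocabulary (a stand-in for a
    real sentence embedding)."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def cluster(texts, seeds, vocab):
    """Assign each text to the nearest seed theme by cosine similarity."""
    centers = [embed(s, vocab) for s in seeds]
    return [max(range(len(centers)), key=lambda c: cosine(embed(t, vocab), centers[c]))
            for t in texts]

predictions = [
    "AI will reshape newsroom workflows",
    "Generative AI tools for reporters",
    "Trust in news and how outlets talk to the public",
]
vocab = sorted({w for t in predictions for w in t.lower().split()})
labels = cluster(predictions, ["AI newsroom tools", "trust public"], vocab)
print(labels)  # the two AI predictions group together; the trust one separates
```

Mapping each cluster into 2D (e.g. with UMAP or PCA over the embeddings) is what produces the clickable dot layout.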
🇪🇺 Policy Thoughts on EU AI Act Implementation 🇪🇺
There is a lot to like in the first draft of the EU GPAI Code of Practice, especially as regards transparency requirements. The Systemic Risks section, on the other hand, is concerning both for smaller developers and for external stakeholders.
I wrote more on this topic ahead of the next draft. TLDR: more attention to immediate large-scale risks and to collaborative solutions supported by evidence can help everyone - as long as developers disclose sufficient information about their design choices and deployment contexts.
🌍 Announcing Global-MMLU: an improved, open MMLU dataset with evaluation coverage across 42 languages, built with Argilla and the Hugging Face community.
Global-MMLU is the result of months of work with the goal of advancing Multilingual LLM evaluation. It's been an amazing open science effort with collaborators from Cohere For AI, Mila - Quebec Artificial Intelligence Institute, EPFL, Massachusetts Institute of Technology, AI Singapore, National University of Singapore, KAIST, Instituto Superior Técnico, Carnegie Mellon University, CONICET, and University of Buenos Aires.
🏷️ 200+ contributors used Argilla to flag the MMLU questions where regional, dialect, or cultural knowledge was required to answer correctly. 85% of the questions required Western-centric knowledge!
Thanks to this annotation process, the open dataset contains two subsets:
1. Culturally Agnostic: no specific regional or cultural knowledge is required.
2. Culturally Sensitive: requires dialect, cultural, or geographic knowledge to answer correctly.
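Splitting the annotated rows into the two subsets is a simple filter. A sketch over toy rows — the `cultural_sensitivity_label` field name and the example questions are assumptions for illustration, not the actual dataset schema:

```python
# Hypothetical annotated rows; field name and labels are illustrative.
rows = [
    {"question": "What is the boiling point of water at sea level?",
     "cultural_sensitivity_label": "CA"},   # Culturally Agnostic
    {"question": "Which US constitutional amendment protects free speech?",
     "cultural_sensitivity_label": "CS"},   # Culturally Sensitive
]

culturally_agnostic = [r for r in rows if r["cultural_sensitivity_label"] == "CA"]
culturally_sensitive = [r for r in rows if r["cultural_sensitivity_label"] == "CS"]
print(len(culturally_agnostic), len(culturally_sensitive))
```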
Moreover, we provide high-quality translations for 25 of the 42 languages, thanks again to the community and to professional annotators leveraging Argilla on the Hub.
I hope this will ensure a better understanding of the limitations and challenges for making open AI useful for many languages.
📊 Just dropped: a visualization mapping Hugging Face's most liked & downloaded models from 2022 to now. Small models are clearly on the rise - a fascinating shift in both likes and download patterns.
The cleaning process consists of:
- Joining the separate splits together and adding a split column
- Converting string messages into lists of structs
- Removing empty system prompts
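A minimal sketch of those three steps over toy rows — the split names and message format below are illustrative, not the actual dataset schema:

```python
import json

# Toy splits standing in for the dataset's separate splits.
splits = {
    "train": [{"messages": '[{"role": "system", "content": ""},'
                           ' {"role": "user", "content": "Hi"}]'}],
    "test":  [{"messages": '[{"role": "user", "content": "Hello"}]'}],
}

cleaned = []
for split_name, rows in splits.items():
    for row in rows:
        messages = json.loads(row["messages"])  # string -> list of structs
        # Drop system messages whose content is empty.
        messages = [m for m in messages
                    if not (m["role"] == "system" and not m["content"].strip())]
        # Join splits into one table, tagging each row with a split column.
        cleaned.append({"split": split_name, "messages": messages})

print(len(cleaned))  # 2 rows, each tagged with its original split
```

With the Hugging Face `datasets` library, the same steps would typically be done with `concatenate_datasets`, `map`, and `filter` instead of plain loops.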
Fascinating point from @thomwolf at Web Summit: AI misuse (deepfakes, fake news) is actually easier to make with closed models, not with open-source ones.
This challenges the common narrative that open-source AI is inherently more dangerous. The reality is more nuanced - while we may think open source is technically easier to misuse, closed models' accessibility and product-focused design appear to be driving more actual harm.
Important context for current AI safety discussions and regulation debates.
Anthropic publishes the "system prompts" that make Claude tick
- "In its continued effort to paint itself as a more ethical, transparent AI vendor, Anthropic has published the system prompts for its latest models"
- They specify that "Claude cannot open URLs, links, or videos, perform facial recognition or identify or name any humans in photos"
- "Anthropic is exerting pressure on competitors to publish the same. We'll have to see if the gambit works."
https://techcrunch.com/2024/08/26/anthropic-publishes-the-system-prompt-that-makes-claude-tick/
China's tech giants splash out on AI despite US restrictions (paywall)
- "Alibaba, Tencent and Baidu had combined capital expenditure of Rmb50bn ($7bn) in the first half, compared with Rmb23bn a year earlier. TikTok parent ByteDance (which is private) has also increased AI-related spending"
- Nvidia's H100 and upcoming Blackwell series are under US restrictions, but China's tech giants can buy the H20
- Analysts expect Nvidia to ship more than 1mn of the processors to Chinese tech groups in the coming months.
https://www.ft.com/content/31bffc48-2ca7-472b-9d53-3deaad2d86ce
Mark Zuckerberg "said it was improper for the Biden administration to have pressured Facebook to censor content in 2021 related to the coronavirus pandemic"
- "At the time, Facebook's publicly stated goal was to push millions of people toward Covid-19 vaccines. In his letter, Zuckerberg didn't indicate whether he had changed his mind about that goal"
https://www.wsj.com/tech/mark-zuckerberg-neutral-politics-letter-election-2024-02b86372
Just crossed 200,000 free public AI datasets shared by the community on Hugging Face! Text, image, video, audio, time-series & many more... Thanks everyone!
- AI math olympiad winner NuminaMath is here!
- Announcing new Hugging Face and Keras NLP integration
- UI overhaul for HF tokens!
- Embed our dataset viewer on any webpage!
Small models, BIG impact: SmolLM is here! 🚀
We're launching a series of small but mighty language models:
- Super fast - runs on laptops, phones, you name it!
- 3 sizes: 135M, 360M, and 1.7B parameters
- Outperforms same-size models from Meta, Microsoft, and Qwen
- Fully open-source: datasets, training code, models
Key features:
- Trained on FineWeb-Edu and Cosmopedia v2 (the largest synthetic pre-training dataset)
- No cloud needed - run locally for privacy and energy efficiency
- Everything is public, from data curation to training steps
Potential use cases:
- On-device autocomplete
- Local request parsing
- Custom fine-tuning for specific needs without expensive GPUs