Models: 336

Active filters: fp8

comaniac/Meta-Llama-3-8B-Instruct-FP8-v1

Text Generation • Updated May 24, 2024 • 19

comaniac/Mixtral-8x22B-Instruct-v0.1-FP8-v1

Text Generation • Updated May 28, 2024 • 21

neuralmagic/Meta-Llama-3-70B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 37.6k • 11

comaniac/Meta-Llama-3-70B-Instruct-FP8-v1

Text Generation • Updated May 26, 2024 • 18

comaniac/Mixtral-8x7B-Instruct-v0.1-FP8-v1

Text Generation • Updated May 26, 2024 • 20

comaniac/Mixtral-8x7B-Instruct-v0.1-FP8-v2

Text Generation • Updated Jun 10, 2024 • 23

Skywork/Skywork-MoE-Base-FP8

Text Generation • Updated Jul 31, 2024 • 24 • 6

comaniac/Meta-Llama-3-70B-Instruct-FP8-v2

Text Generation • Updated Jun 10, 2024 • 21

comaniac/Mixtral-8x7B-Instruct-v0.1-FP8-v3

Text Generation • Updated Jun 10, 2024 • 19

comaniac/Mixtral-8x22B-Instruct-v0.1-FP8-v2

Text Generation • Updated Jun 10, 2024 • 23

neuralmagic/Mixtral-8x22B-Instruct-v0.1-FP8

Text Generation • Updated Aug 12, 2024 • 415 • 2

nm-testing/granite-20b-code-base-FP8

Text Generation • Updated Jun 12, 2024 • 27

nm-testing/granite-3b-code-base-FP8

Text Generation • Updated Jun 12, 2024 • 19

fr00000/dolp-fp8

Text Generation • Updated Jun 13, 2024 • 21

neuralmagic/Qwen2-0.5B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 1.61k • 2

nm-testing/opt-125m-fp8-static-kv

Text Generation • Updated Jun 14, 2024 • 23

neuralmagic/Qwen2-1.5B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 63

neuralmagic/Qwen2-7B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 763 • 1

anyisalin/L3-70B-Euryale-v2.1-FP8

Text Generation • Updated Jun 18, 2024 • 168

nm-testing/Qwen2-0.5B-Instruct-FP8-KV

Text Generation • Updated Jun 18, 2024 • 20

yentinglin/Llama-3-Taiwan-70B-Instruct-FP8

Text Generation • Updated Jun 20, 2024 • 13

kuotient/llama3-instrucTrans-enko-8b-FP8

Text Generation • Updated Jun 20, 2024 • 18 • 2

nm-testing/SparseLlama-3-8B-pruned_50.2of4-FP8

Text Generation • Updated Jun 25, 2024 • 27

FlorianJc/Hermes-2-Pro-Mistral-7B-vllm-fp8

Text Generation • Updated Jul 17, 2024 • 23

FlorianJc/openchat-3.6-8b-20240522-vllm-fp8

Text Generation • Updated Jul 17, 2024 • 29

FlorianJc/Llama3-ChatQA-1.5-8B-vllm-fp8

Text Generation • Updated Jul 17, 2024 • 20

TechxGenus/Codestral-22B-v0.1-FP8

Text Generation • Updated Jun 21, 2024 • 159

Model-SafeTensors/Meta-Llama-3-70B-FP8-Dynamic

Text Generation • Updated Jun 23, 2024 • 8.62k

Model-SafeTensors/Qwen-Qwen2-72B-FP8-Dynamic

Text Generation • Updated Sep 4, 2024 • 11.6k

Rallio67/magnum-72B-FP8

Text Generation • Updated Jun 26, 2024 • 20