neuralmagic/Meta-Llama-3-8B-Instruct-quantized.w4a16

Tags: Text Generation, text-generation-inference, Inference Endpoints, 4-bit precision


2 contributors

History: 7 commits

Latest commit: Upload tokenizer_config.json with huggingface_hub (abhinavnmagic, 4dd86b8, verified, 2 months ago)

File                      Size        Last commit                                            Updated
.gitattributes            1.52 kB     initial commit                                         2 months ago
config.json               1.05 kB     Upload config.json with huggingface_hub                2 months ago
model.safetensors (LFS)   5.73 GB     Upload model.safetensors with huggingface_hub          2 months ago
quantize_config.json      269 Bytes   Upload quantize_config.json with huggingface_hub       2 months ago
special_tokens_map.json   296 Bytes   Upload special_tokens_map.json with huggingface_hub    2 months ago
tokenizer.json            9.09 MB     Upload tokenizer.json with huggingface_hub             2 months ago
tokenizer_config.json     51 kB       Upload tokenizer_config.json with huggingface_hub      2 months ago
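
To estimate the total download size before fetching the repository, the sizes in the listing can be summed. The following is a minimal sketch, not part of the repository: the helper names (`parse_size`, `total_bytes`) are hypothetical, and it assumes the page reports sizes in decimal units (1 kB = 1000 bytes), so the result is only as precise as the rounded figures shown.

```python
# Sizes copied verbatim from the file listing.
LISTED_SIZES = {
    ".gitattributes": "1.52 kB",
    "config.json": "1.05 kB",
    "model.safetensors": "5.73 GB",
    "quantize_config.json": "269 Bytes",
    "special_tokens_map.json": "296 Bytes",
    "tokenizer.json": "9.09 MB",
    "tokenizer_config.json": "51 kB",
}

# Assumption: decimal (SI) multipliers; the listing does not state the unit base.
UNIT_BYTES = {"Bytes": 1, "kB": 1_000, "MB": 1_000_000, "GB": 1_000_000_000}


def parse_size(text: str) -> int:
    """Convert a size string such as '1.52 kB' to an approximate byte count."""
    value, unit = text.split()
    return round(float(value) * UNIT_BYTES[unit])


def total_bytes(sizes: dict[str, str]) -> int:
    """Sum the approximate byte counts of all listed files."""
    return sum(parse_size(size) for size in sizes.values())


if __name__ == "__main__":
    total = total_bytes(LISTED_SIZES)
    largest = max(LISTED_SIZES, key=lambda name: parse_size(LISTED_SIZES[name]))
    print(f"approximate total: {total / 1e9:.2f} GB; largest file: {largest}")
```

Nearly all of the total comes from model.safetensors; the tokenizer and configuration files together add only about 9 MB.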