ScriptAI Infrastructure v1.0

Model Quantization (GPTQ)

by AaaS · open-source · Last verified 2026-03-01

Quantizes language models with GPTQ for efficient inference on consumer hardware. Performs calibration-based quantization, evaluates quality against the original model, and exports the result in formats compatible with vLLM, llama.cpp, and other inference engines.
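To illustrate the core operation GPTQ builds on, here is a minimal, dependency-free sketch of per-group symmetric 4-bit weight quantization. This is an assumption-laden illustration, not the script's actual implementation: the real script uses auto-gptq, which additionally applies Hessian-based error correction derived from calibration batches.

```python
# Sketch of per-group symmetric 4-bit quantization (illustration only).
# GPTQ refines this basic scheme using calibration data; the function
# and variable names here are hypothetical, not from the script.

def quantize_group(weights, bits=4):
    """Quantize a group of float weights to signed ints with one shared scale."""
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    """Map quantized ints back to approximate float weights."""
    return [x * scale for x in q]

# Smaller groups track local weight ranges better, at the cost of
# storing more scales -- the usual group-size trade-off in GPTQ configs.
weights = [0.12, -0.5, 0.33, 0.07, -0.21, 0.49, -0.02, 0.18]
q, scale = quantize_group(weights)
restored = dequantize_group(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.4f} max_abs_error={max_err:.4f}")
```

The round-trip error is bounded by half the group's scale, which is why quality evaluation against the original model (as this script performs) matters: per-layer error compounds across the network.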

https://aaas.blog/script/model-quantization-gptq
Overall Grade: C+ (Average)
Adoption: C+ · Quality: A · Freshness: B+ · Citations: C+ · Engagement: F

Specifications

License
MIT
Pricing
open-source
Capabilities
gptq-quantization, calibration, quality-evaluation, format-export, benchmarking
Integrations
auto-gptq, transformers, datasets, torch
Use Cases
model-compression, edge-deployment, cost-reduction, consumer-gpu-inference
API Available
No
Language
python
Dependencies
auto-gptq, transformers, datasets, torch, safetensors
Environment
Python 3.11+ with CUDA 12 and 16GB+ VRAM
Est. Runtime
30-120 minutes depending on model size
Tags
script, automation, quantization, gptq, optimization
Added
2026-03-17
Completeness
100%

Index Score

Overall: 52.2
Adoption: 58
Quality: 80
Freshness: 78
Citations: 52
Engagement: 0
