Model Quantization (GPTQ)
by AaaS · open-source · Last verified 2026-03-01
Quantizes language models using GPTQ for efficient inference on consumer hardware. Performs calibration-based quantization, quality evaluation against the original model, and exports in formats compatible with vLLM, llama.cpp, and other inference engines.
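At the weight level, GPTQ-style quantization stores each group of weights as low-bit integers plus a per-group scale factor, which is where the memory savings come from. A minimal pure-Python sketch of that storage idea follows; it is illustrative only, showing simple round-to-nearest per-group quantization rather than the script's implementation or GPTQ's actual error-compensating algorithm:

```python
def quantize_group(weights, bits=4):
    # Symmetric min-max quantization of one weight group to signed integers.
    # For 4-bit, quantized values land in [-7, 7].
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # guard all-zero groups
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    # Reconstruct approximate float weights from integers + scale.
    return [v * scale for v in q]

# Toy weight group (illustrative values, not from a real model)
weights = [0.12, -0.07, 0.33, -0.21]
q, scale = quantize_group(weights)
restored = dequantize_group(q, scale)
err = max(abs(a - b) for a, b in zip(weights, restored))
```

GPTQ improves on this naive rounding by quantizing columns one at a time and updating the remaining weights to compensate for the error just introduced, which is why it needs the calibration data the script collects.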
https://aaas.blog/script/model-quantization-gptq

Overall grade: C+ (Average)
Adoption: C+ · Quality: A · Freshness: B+ · Citations: C+ · Engagement: F

Specifications
- License: MIT
- Pricing: open-source
- Capabilities: gptq-quantization, calibration, quality-evaluation, format-export, benchmarking
- Integrations: auto-gptq, transformers, datasets, torch
- Use Cases: model-compression, edge-deployment, cost-reduction, consumer-gpu-inference
- API Available: No
- Language: python
- Dependencies: auto-gptq, transformers, datasets, torch, safetensors
- Environment: Python 3.11+ with CUDA 12 and 16GB+ VRAM
- Est. Runtime: 30-120 minutes depending on model size
- Tags: script, automation, quantization, gptq, optimization
- Added: 2026-03-17
- Completeness: 100%
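The quality-evaluation step typically compares the quantized model against the original by perplexity on a held-out text set. A hedged sketch of that comparison, given per-token negative log-likelihoods (the NLL values and the `perplexity` helper here are illustrative, not taken from the script):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative NLLs; in practice these come from running the original and
# quantized models over the same evaluation corpus.
base_ppl = perplexity([2.10, 1.85, 2.40, 2.05])
quant_ppl = perplexity([2.18, 1.92, 2.47, 2.11])

# Small positive degradation is expected after 4-bit quantization.
relative_degradation = (quant_ppl - base_ppl) / base_ppl
```

A modest relative perplexity increase (a few percent) is the usual signal that quantization preserved model quality; large jumps suggest the calibration set or group size needs revisiting.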
Index Score: 52.2
- Adoption: 58
- Quality: 80
- Freshness: 78
- Citations: 52
- Engagement: 0