Edge Model Optimization
by Community · free · Last verified 2026-03-17
Optimizes PyTorch and TensorFlow models for edge hardware by applying INT8/FP16 quantization and converting them to ONNX or TFLite formats. The script provides platform-specific tuning for ARM and NPU targets, benchmarks latency and memory usage, and generates a report on accuracy trade-offs.
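A rough sketch of the quantization-and-export flow described above, using standard PyTorch and ONNX Runtime APIs; the model, input shape, and file names are placeholders rather than anything defined by this listing:

```python
import torch
import torch.nn as nn
import onnx
from onnxruntime.quantization import QuantType, quantize_dynamic

# Placeholder network standing in for the user's model (assumption).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Export the FP32 model to ONNX.
dummy_input = torch.randn(1, 128)
torch.onnx.export(model, dummy_input, "model_fp32.onnx", opset_version=17)

# Validate the exported graph before quantizing it.
onnx.checker.check_model(onnx.load("model_fp32.onnx"))

# Post-training dynamic INT8 quantization of the ONNX weights.
quantize_dynamic("model_fp32.onnx", "model_int8.onnx", weight_type=QuantType.QInt8)
```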
https://github.com/NVIDIA/TensorRT
Overall: C+ (Average)
Adoption: B · Quality: A · Freshness: A · Citations: C+ · Engagement: F
Specifications
- License: Apache-2.0
- Pricing: free
- Capabilities: INT8 post-training quantization, FP16 quantization, ONNX model export and validation, TensorFlow Lite (TFLite) conversion, hardware-specific tuning for ARM and NPU targets, latency and memory footprint benchmarking, model accuracy degradation analysis, automated deployment report generation (conversion and benchmarking sketches follow this list)
- Integrations: PyTorch, TensorFlow, ONNX, ONNX Runtime, TensorFlow Lite
- API Available: No
- Language: Python
- Dependencies: torch, onnx, onnxruntime, tensorflow-lite-runtime, numpy
- Environment: Python 3.10+
- Est. Runtime: 5-30 minutes per model
- Tags: edge-deployment, onnx, quantization, tflite, model-compression, model-optimization, embedded-ml, tinyml, pytorch, tensorflow, arm-processors, npu
- Added: 2026-03-17
- Completeness: 0.9%
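For the TFLite conversion capability, a minimal sketch of FP16 post-training quantization, assuming a full TensorFlow install and a placeholder Keras model (the layers and file name are illustrative):

```python
import tensorflow as tf

# Placeholder Keras model standing in for the user's network (assumption).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

# FP16 post-training quantization during TFLite conversion.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```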
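For the latency benchmarking capability, a rough sketch with ONNX Runtime, assuming the hypothetical model_int8.onnx from the earlier sketch and a 1×128 input:

```python
import time

import numpy as np
import onnxruntime as ort

# Hypothetical model path and input shape; adjust to the exported artifact (assumption).
session = ort.InferenceSession("model_int8.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
sample = np.random.rand(1, 128).astype(np.float32)

# Warm-up runs so one-time initialization does not skew the measurement.
for _ in range(10):
    session.run(None, {input_name: sample})

# Average latency over repeated runs.
runs = 100
start = time.perf_counter()
for _ in range(runs):
    session.run(None, {input_name: sample})
latency_ms = (time.perf_counter() - start) / runs * 1000
print(f"Average latency: {latency_ms:.2f} ms")
```

For edge deployment the numbers that matter come from the target ARM or NPU device rather than a development machine, so measurements like these would normally be taken on-device.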
Index Score: 58.7
- Adoption: 68
- Quality: 85
- Freshness: 82
- Citations: 58
- Engagement: 0