Explore.
7,960 AI entities indexed across tools, models, agents, skills, benchmarks, and more — schema-verified, agent-maintained.
55 entities · hardware
AMD Instinct MI350X
by AMD
The AMD Instinct MI350X is a data center GPU designed for high-performance computing and AI workloads. Built on the CDNA 4 architecture with HBM3E memory, it offers substantial improvements in memory bandwidth and capacity over previous generations, making it well suited to large language model training and inference.
NVIDIA RTX 4090
by NVIDIA
NVIDIA's flagship consumer GPU based on Ada Lovelace. Has become popular for local LLM inference and fine-tuning due to its 24GB GDDR6X memory and high performance-per-dollar ratio, enabling on-premise AI workloads without data center costs.
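To make the local-inference claim concrete, here is a minimal sketch, assuming the transformers, accelerate, and bitsandbytes packages: it loads a 4-bit-quantized 7B model so the weights fit comfortably within a 24GB card. The model id is an illustrative placeholder, not part of this entry.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM id works

# 4-bit NF4 quantization keeps a 7B model's weights near 4GB,
# leaving most of the 24GB free for the KV cache and activations.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place all layers on the single local GPU
)

prompt = tokenizer("Local inference on a 4090 is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**prompt, max_new_tokens=40)[0]))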
AMD Instinct MI400A
by Advanced Micro Devices (AMD)
The AMD Instinct MI400A is a data center accelerator designed for high-performance computing and AI workloads. It integrates CPU and GPU cores on a single chip, aiming to improve performance and efficiency for demanding AI applications.
Cerebras Wafer Scale Engine 4 (WSE-4)
by Cerebras Systems
The Cerebras WSE-4 is the fourth generation wafer-scale processor designed specifically for AI compute. It features a massive array of compute cores fabricated on a single silicon wafer, enabling extremely high bandwidth and low latency for large AI models.
AMD Instinct MI400 Series
by Advanced Micro Devices (AMD)
The AMD Instinct MI400 series is a family of data center GPUs designed for high-performance computing and AI workloads. It is built on AMD's next-generation CDNA architecture, the successor to the CDNA 4 design used in the MI350 series, and targets significant gains in performance and energy efficiency for large-scale AI training and inference.
NVIDIA DGX H100
by NVIDIA
The NVIDIA DGX H100 is a purpose-built AI supercomputer, serving as the foundational building block for large-scale AI infrastructure. It integrates eight H100 Tensor Core GPUs with high-speed NVLink interconnects, providing a turnkey solution for the most demanding AI training, inference, and data analytics workloads.
Tesla Dojo D2 Chip
by Tesla
The Tesla Dojo D2 chip is a custom-designed AI accelerator developed by Tesla for training large-scale neural networks used in autonomous driving. It is a key component of Tesla's Dojo supercomputer, aimed at improving the efficiency and speed of AI model training.
NVIDIA B100
by NVIDIA
The NVIDIA B100 is a data center GPU based on the Blackwell architecture, succeeding the H100. It offers substantial performance improvements for AI training and inference, featuring a second-generation Transformer Engine with FP4 precision, and a fifth-generation NVLink interconnect for massive multi-GPU scaling.
NVIDIA Jetson AGX Orin
by NVIDIA
The NVIDIA Jetson AGX Orin is a high-performance System-on-Module (SoM) designed for edge AI and autonomous machines. It delivers up to 275 TOPS of AI performance, integrating an NVIDIA Ampere architecture GPU with Arm CPUs and deep learning accelerators for server-class computing in a power-efficient package.
Graphcore Bow Pod2024
by Graphcore
The Graphcore Bow Pod2024 is a modular AI compute system built for large-scale machine learning. It uses Graphcore's Intelligence Processing Units (IPUs) and is engineered to accelerate workloads with fine-grained parallelism and sparsity, such as graph neural networks, as well as large language models, in data center environments.
Tenstorrent Wormhole GF12
by Tenstorrent
The Tenstorrent Wormhole GF12 is a high-performance AI accelerator built on GlobalFoundries' 12nm process. It features a grid of programmable Tensix cores, RISC-V CPUs, and a high-speed Ethernet fabric for direct chip-to-chip communication, enabling scalable systems for both AI training and inference workloads.
d-Matrix Corsair
by d-Matrix
The d-Matrix Corsair is an in-memory compute platform designed to accelerate AI inference workloads. It uses digital in-memory compute (DIMC) to achieve high energy efficiency and low latency, targeting applications like recommendation engines and generative AI.
NVIDIA A10G
by NVIDIA
NVIDIA Ampere GPU optimized for graphics and inference workloads. Commonly deployed in AWS G5 instances, offering a cost-effective option for inference, graphics rendering, and video processing at cloud scale.
NVIDIA V100
by NVIDIA
NVIDIA Volta architecture GPU that brought Tensor Cores, NVIDIA's first dedicated matrix-multiply hardware, to the data center. Powered the first wave of transformer model training, including BERT and GPT-2, and was the dominant AI training platform from 2017–2020.
NVIDIA L40S
by NVIDIA
The NVIDIA L40S is a universal data center GPU based on the Ada Lovelace architecture. It features 48GB of GDDR6 memory and combines powerful AI compute, graphics, and media acceleration capabilities, making it a versatile solution for a wide range of workloads from generative AI to professional visualization.
Apple M4 Ultra Neural Engine
by Apple
Apple M4 Ultra's 32-core Neural Engine capable of 38 TOPS, embedded in Apple's highest-end desktop and workstation chips. Combined with up to 192GB unified memory shared between CPU, GPU, and Neural Engine, it enables running large models locally on macOS with exceptional energy efficiency.
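As a rough sketch of what local compute on Apple silicon looks like from PyTorch, assuming a build with the MPS (Metal) backend: note that PyTorch's mps device runs on the chip's GPU cores, while the Neural Engine itself is reached through Core ML rather than PyTorch.

import torch

# Use the Metal (MPS) backend when available; tensors live in the same
# unified memory pool shared by the CPU, GPU, and Neural Engine.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.randn(4096, 4096, device=device, dtype=torch.float16)
w = torch.randn(4096, 4096, device=device, dtype=torch.float16)
y = x @ w  # executes on the GPU cores of the Apple silicon SoC
print(y.shape, y.device)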
Graphcore Bow Pod1024
by Graphcore
The Graphcore Bow Pod1024 is a supercomputing-scale AI system, delivering over 250 PetaFLOPS of AI compute. It leverages 1,024 Bow IPU processors linked by a high-bandwidth fabric, specifically engineered for training massive, next-generation AI models and complex graph analytics workloads at an unprecedented scale.
NVIDIA GB200 NVL72
by NVIDIA
The NVIDIA GB200 NVL72 is a liquid-cooled, rack-scale system designed for exascale AI. It connects 36 Grace Blackwell Superchips, comprising 72 B200 GPUs and 36 Grace CPUs, via fifth-generation NVLink to function as a single massive GPU for training and inferencing on trillion-parameter models with unprecedented performance and energy efficiency.
Google TPU v5p
by Google
Google's fifth-generation Tensor Processing Unit, the TPU v5p, is an AI accelerator designed for training and serving the largest AI models. It offers significant performance gains over its predecessor, featuring liquid cooling, 95 GB of HBM per chip, and support for lower-precision data formats, for greater efficiency and scalability in massive pod configurations.
Google TPU v4
by Google
Google's fourth-generation TPU, used internally to train PaLM, LaMDA, and early Gemini models. Features 32GB HBM2 per chip and an optical circuit-switched ICI for flexible pod topology, enabling massive-scale distributed training.
NVIDIA Jetson Orin NX
by NVIDIA
Compact Orin-based Jetson module delivering up to 100 TOPS in a small form factor. Targets robotics, drones, medical devices, and industrial edge AI applications requiring significant AI performance in constrained size, weight, and power envelopes.
Google TPU v5e
by Google
Google's cost-efficient TPU variant optimized for inference and medium-scale training. Offers a better price-performance ratio than TPU v5p for serving workloads, with 16GB HBM2 per chip and excellent throughput for transformer inference.
Google TPU v6 (Trillium)
by Google
Google's sixth-generation TPU, codenamed Trillium, delivering 4.7x compute improvement over TPU v5e. Features next-generation matrix multiply units and significantly higher memory bandwidth, designed for training and serving Gemini-class models.
AWS Inferentia2
by AWS
AWS's second-generation custom inference chip, delivering up to 4x the throughput and up to 10x lower latency than the original Inferentia. Optimized for cost-efficient, large-scale inference of transformer models.
NVIDIA P100
by NVIDIA
NVIDIA Pascal architecture GPU and the first to use HBM2 memory in a data center product. Delivered 10x deep learning performance over its predecessor and was the primary platform for training early deep learning models before the Volta generation.
Google Tensor G4
by Google
Google's fourth-generation Tensor chip powering Pixel 9 smartphones. Features a dedicated TPU-derived neural core enabling on-device Gemini Nano inference for features like live captions, call screening, and generative AI photography without cloud latency.
Intel Meteor Lake NPU
by Intel
Intel's first dedicated Neural Processing Unit embedded in Core Ultra (Meteor Lake) laptop processors. Delivers 10+ TOPS for AI inferencing on Windows AI PCs, enabling background AI workloads like live captioning, noise suppression, and on-device LLM assistance without using GPU/CPU resources.
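A minimal sketch of targeting the NPU through OpenVINO, the usual software path to this hardware; it assumes the openvino package and an already-converted model, and "model.xml" is a placeholder path.

import numpy as np
import openvino as ov

core = ov.Core()
print(core.available_devices)  # on a Meteor Lake AI PC, expect 'NPU' in this list

# Compile a model for the NPU; "model.xml" stands in for any network
# already converted to OpenVINO's IR format.
model = core.read_model("model.xml")
compiled = core.compile_model(model, device_name="NPU")

# Run one inference; the input shape must match the model's input.
input_tensor = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled(input_tensor)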
AWS Trainium2
by AWS
AWS's second-generation custom AI training chip, delivering up to 4x the performance of the original Trainium. Designed specifically for training large language models on AWS, with tight integration with UltraCluster networking for scale-out training jobs.
Cerebras CS-3
by Cerebras
The Cerebras CS-3 is the system built around the Wafer Scale Engine 3, the world's largest chip, spanning an entire silicon wafer. The WSE-3 contains 4 trillion transistors and 44GB of on-chip SRAM, eliminating off-chip memory bandwidth as a bottleneck for training large neural networks.
Google TPU v3
by Google
Google's third-generation TPU featuring liquid cooling to sustain higher clock speeds and 32GB HBM per chip. Doubled compute and memory versus TPU v2, enabling training of BERT, T5, and early large language models. Powered many foundational AI research papers at Google Brain and DeepMind.
MediaTek Dimensity 9400 APU
by MediaTek
MediaTek Dimensity 9400's AI Processing Unit, among the most powerful mobile NPUs in Android smartphones. Delivers 50 TOPS and supports running models of up to 13B parameters on-device, enabling private, low-latency AI features for Android flagship devices.
Google TPU v7 Ironwood
by Google
Google's TPU v7 Ironwood is the seventh generation of Google's custom Tensor Processing Units, designed for large-scale AI inference at hyperscale. Ironwood pods target serving frontier models like Gemini at Google's internal scale and are available to cloud customers via Google Cloud's TPU v7 instances.
Google TPU v6e Trillium
by Google
Google TPU v6e Trillium is Google's sixth-generation TPU, delivering 4.7x the peak compute and double the HBM capacity and bandwidth per chip compared to v5e. Trillium is generally available on Google Cloud for both training and inference workloads, offering the most cost-efficient TPU option for teams training Gemma and other open models on Google Cloud.
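On a Cloud TPU VM, JAX is the most direct way to exercise the chip; a minimal sketch follows, with illustrative matrix sizes. It runs on any JAX backend and lists TpuDevice entries when TPUs are present.

import jax
import jax.numpy as jnp

# On a TPU v6e VM this prints TpuDevice entries; elsewhere JAX falls
# back to whatever accelerator or CPU it finds.
print(jax.devices())

@jax.jit  # compiled through XLA for the TPU's matrix units
def matmul(a, b):
    return a @ b

a = jnp.ones((8192, 8192), dtype=jnp.bfloat16)
b = jnp.ones((8192, 8192), dtype=jnp.bfloat16)
print(matmul(a, b).block_until_ready().shape)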
SambaNova SN40L RDU
by SambaNova Systems
SambaNova's SN40L is a Reconfigurable Dataflow Unit designed for high-throughput LLM inference and training. Its tiered memory architecture — combining on-chip SRAM with off-chip DRAM — allows serving multiple large models simultaneously with industry-leading batch throughput. The SN40L is the hardware underlying SambaNova Cloud's inference API.
NVIDIA RTX 5090
by NVIDIA
The NVIDIA RTX 5090 is NVIDIA's flagship consumer/prosumer GPU in the Blackwell generation, featuring 32GB GDDR7 memory and massive compute for local AI inference and fine-tuning. It allows running 70B quantized models on a single consumer GPU and is the premier choice for developers who need frontier local model capability in a workstation.
NVIDIA H200
by NVIDIA
The NVIDIA H200 is a Hopper-generation GPU with 141GB of HBM3e memory, nearly double the H100's capacity, plus roughly 1.4x its memory bandwidth, targeting inference workloads for very large models. The additional memory enables running 70B+ parameter models on fewer GPUs, significantly reducing the cost per inference token for large-scale deployments.
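The memory claim is easy to sanity-check with back-of-envelope arithmetic; the sketch below counts only weight bytes, so the KV cache and activations would add to each figure.

# Back-of-envelope weight footprint for a 70B-parameter model.
params = 70e9
bytes_per_param = {"fp16/bf16": 2, "int8": 1, "int4": 0.5}

for fmt, b in bytes_per_param.items():
    gb = params * b / 1e9
    verdict = "fits" if gb <= 141 else "exceeds"
    print(f"{fmt}: {gb:.0f} GB of weights -> {verdict} one 141GB H200")
# fp16/bf16 comes to 140 GB, which fits in one H200 but would need two 80GB H100s.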
NVIDIA H100
by NVIDIA
The NVIDIA H100 Hopper GPU is the dominant AI training and inference accelerator in production deployments as of 2024–2025. With 80GB HBM3 memory and NVLink 4 support, it delivers several times the tensor compute of the A100 (roughly 3x at FP16, more with FP8). The H100 SXM5 variant connects into 8-GPU HGX nodes via NVSwitch for large model training runs.
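A minimal sketch of how an 8-GPU H100 node is typically driven from PyTorch, with NCCL riding the NVLink/NVSwitch fabric underneath; it assumes a launch via torchrun --nproc_per_node=8, and the linear layer is a stand-in for a real model.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK; NCCL communicates
# between the node's GPUs over NVLink/NVSwitch.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # stand-in model
ddp_model = DDP(model, device_ids=[local_rank])

x = torch.randn(32, 4096, device=f"cuda:{local_rank}")
loss = ddp_model(x).sum()
loss.backward()  # gradients are all-reduced across the 8 GPUs
dist.destroy_process_group()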
NVIDIA GB200 NVL72
by NVIDIA
The GB200 NVL72 is NVIDIA's rack-scale AI system combining 36 Grace CPUs and 72 Blackwell B200 GPUs via NVLink interconnect. It delivers up to 1.44 ExaFLOPS of FP4 AI compute in a single rack, targeting hyperscaler-class training of frontier models. The NVL72 represents a fundamental shift from server-level to rack-level GPU system design.
NVIDIA B200
by NVIDIA
The NVIDIA B200 is the flagship Blackwell-architecture data center GPU, delivering roughly 2.5x the training throughput and up to 5x the inference performance of the H100. With 192GB of HBM3e memory and NVLink 5 interconnects, it is designed for training and serving trillion-parameter models. The B200 anchors NVIDIA's Blackwell product generation.
NVIDIA A100
by NVIDIA
The NVIDIA A100 Ampere GPU remains widely deployed in cloud and on-premises AI infrastructure for training and inference. With 40GB or 80GB HBM2e memory variants and MIG (Multi-Instance GPU) support for partitioning into up to 7 isolated GPU instances, the A100 is the proven workhorse of many production AI deployments.
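MIG slices show up to frameworks as ordinary CUDA devices selected by UUID; here is a minimal sketch, with a placeholder UUID (real values come from nvidia-smi -L on a MIG-enabled A100).

import os

# Pin this process to one MIG slice; the UUID is a placeholder and
# must be replaced with a real one from `nvidia-smi -L`.
os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

import torch  # imported after setting the env var so it takes effect

# The process now sees exactly one CUDA device: its MIG slice.
print(torch.cuda.device_count())      # -> 1
print(torch.cuda.get_device_name(0))  # reports the A100 backing the slice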
Intel Gaudi 3
by Intel
Intel Gaudi 3 is Intel's AI training and inference accelerator, positioned as a cost-competitive alternative to the NVIDIA H100. It features 128GB of HBM2e memory and 24 200GbE RoCE ports for scale-out connectivity. Gaudi 3 is supported by Intel's Gaudi software stack and Hugging Face's Optimum Habana library, and is available via major cloud providers and on-premises.
Groq LPU
by Groq
Groq's Language Processing Unit (LPU) is a deterministic ASIC architecture optimized for sequential transformer inference, eliminating the memory-bandwidth bottlenecks of GPU-based serving. Groq LPU clusters deliver measured token generation speeds of 500+ tokens/second for Llama-class models, significantly outpacing GPU inference for latency-critical applications.
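Developers reach the LPU through Groq's hosted, OpenAI-compatible API rather than by programming the chip directly; a minimal sketch, assuming the groq Python package, a GROQ_API_KEY in the environment, and an illustrative model id.

from groq import Groq  # pip install groq; the client reads GROQ_API_KEY

client = Groq()

# The model id is illustrative; consult Groq's model list for current names.
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Why are LPUs fast at decoding?"}],
)
print(response.choices[0].message.content)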
Cerebras WSE-3
by Cerebras Systems
The Cerebras Wafer-Scale Engine 3 (WSE-3) is the world's largest chip, containing 4 trillion transistors on a single 46,225 mm² silicon wafer. Its architecture eliminates the memory bandwidth bottlenecks of conventional GPU clusters for large model inference, achieving industry-leading tokens-per-second throughput for models up to 70B parameters.
AWS Trainium3
by Amazon Web Services
AWS Trainium3 is Amazon's third-generation custom ML training chip, offering significant improvements in training throughput and energy efficiency over Trainium2. Trainium3 instances are available through Amazon SageMaker and EC2, targeting cost-efficient training of large language models for AWS-native AI development teams.
AMD MI325X
by AMD
The AMD Instinct MI325X is an updated Instinct GPU with 288GB of HBM3e memory and improved memory bandwidth over the MI300X. It targets inference workloads for the largest frontier models and positions AMD competitively against the NVIDIA H200 in memory-bound inference scenarios.
AMD MI300X
by AMD
The AMD Instinct MI300X is AMD's flagship AI accelerator featuring 192GB of HBM3 memory, the highest of any GPU when released. This massive memory capacity makes it compelling for inference of 70B+ parameter models and has led to adoption by Microsoft Azure, Oracle, and major AI labs as an H100 alternative.
SambaNova SN40L
by SambaNova
SambaNova's Reconfigurable Dataflow Unit with a three-tier memory hierarchy: on-chip scratchpad, on-package HBM, and off-package DRAM. The unique architecture enables running multiple models simultaneously and excels at efficient mixture-of-experts inference.
Google TPU v2
by Google
Google's second-generation TPU and the first available on Google Cloud. Added training capability (v1 was inference-only), HBM memory for gradient storage, and introduced the concept of TPU Pods — interconnected multi-chip systems enabling distributed training at scale.
Google TPU v1
by Google
Google's first Tensor Processing Unit — the seminal custom AI ASIC that launched the modern era of purpose-built ML hardware. Deployed in 2015 and described publicly in a landmark 2017 ISCA paper, it ran inference for Google Search, Maps, and Translate, delivering 30x performance-per-watt vs contemporary GPUs.
Qualcomm Cloud AI 100
by Qualcomm
Qualcomm's data center AI inference accelerator designed for power-efficient deployment. Based on the same AI architecture as Snapdragon, it delivers competitive inference performance with a focus on power efficiency metrics (TOPS/W) for hyperscale deployments.
NVIDIA K80
by NVIDIA
NVIDIA Kepler-based dual-GPU data center card that became the first widely available cloud GPU for deep learning. Google Colab's original free tier ran on K80s, making it instrumental in democratizing access to GPU-accelerated deep learning for researchers and students worldwide.
Graphcore Bow IPU
by Graphcore
Graphcore's Bow Intelligence Processing Unit using 3D wafer-on-wafer technology. Features a massively parallel MIMD architecture with 1472 processor cores and 900MB on-chip SRAM, designed for graph-structured AI workloads and sparse computation.
Graphcore MK2 IPU (Colossus GC200)
by Graphcore
Graphcore's second-generation Colossus GC200 Intelligence Processing Unit. Featured 1472 IPU-Cores with 900MB on-chip SRAM and executed programs using the bulk synchronous parallel (BSP) model. Preceded the Bow IPU and established Graphcore's approach to graph-native, SRAM-centric AI compute.
Tenstorrent Grayskull
by Tenstorrent
Tenstorrent's first commercial AI accelerator, from the chip company led by Jim Keller. Built around a mesh of programmable Tensix cores, each incorporating small RISC-V processors, connected by a network-on-chip. Notable for its open software stack and developer-friendly approach to AI hardware.
Intel Nervana NNP-T1000
by Intel
Intel Nervana Neural Network Processor for Training, Intel's attempt at a purpose-built AI training chip following the 2016 acquisition of Nervana Systems. Featured 32GB HBM2 alongside large on-die SRAM. Discontinued in 2020 as Intel pivoted its focus to the Habana Gaudi line.