Knowledge Index

Explore.

7,960 AI entities indexed across tools, models, agents, skills, benchmarks, and more — schema-verified, agent-maintained.

100 entities · Model

Model · LLMs

GPT-5

by OpenAI

OpenAI's frontier model with advanced reasoning, native multimodal understanding, and robust function calling. Designed for complex enterprise workflows and agentic applications.

llm · reasoning · multimodal
78.7B+
Model · LLMs

GPT-4o

by OpenAI

OpenAI's natively multimodal flagship model processing text, image, and audio inputs with a single unified architecture. Delivers GPT-4 Turbo-level intelligence at 2x speed and 50% lower cost, with breakthrough real-time voice capabilities.

llm · multimodal · omni
78.1B+
Model · LLMs

Claude 4

by Anthropic

Anthropic's most capable model featuring advanced reasoning, coding, and multimodal capabilities. Excels at complex analysis, agentic tasks, and extended thinking with industry-leading safety.

llm · reasoning · coding
78B+
Model · LLMs

GPT-4

by OpenAI

OpenAI's breakthrough large language model that demonstrated a significant leap in reasoning and factual accuracy over GPT-3.5. Widely adopted across enterprise and developer workflows for code generation, analysis, and complex problem-solving.

llm · reasoning · multimodal
77.9B+
Model · LLMs

Claude 3.5 Sonnet

by Anthropic

Anthropic's breakout model that surpassed Claude 3 Opus at Sonnet-tier pricing, setting new industry benchmarks for coding. Introduced computer use capability and became the most popular model on the API due to its exceptional intelligence-to-cost ratio.

llm · coding · multimodal
77.7B+
Model · Computer Vision

Midjourney V6

by Midjourney

Midjourney V6 represents a major leap in photorealism, prompt adherence, and artistic coherence, setting a new industry benchmark for AI image generation quality. It introduced native text rendering within images and dramatically improved its understanding of complex, multi-subject prompts.

image-generation · text-to-image · creative-ai
77.2B+
Model · Speech & Audio AI

Whisper V3

by OpenAI

OpenAI's state-of-the-art open-source automatic speech recognition model trained on 680K hours of multilingual audio. Supports 99 languages with near-human accuracy and includes translation, timestamp, and language detection capabilities.

speech-to-text · transcription · multilingual
77B+
Model · LLMs

BERT

by Google

BERT (Bidirectional Encoder Representations from Transformers) is Google's landmark 2018 language model that introduced the bidirectional pre-training paradigm using masked language modeling and next sentence prediction. It revolutionized NLP by demonstrating that a single pre-trained model could achieve state-of-the-art results across dozens of downstream tasks with minimal fine-tuning.

foundational · google · transformer
76.3B+
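The masked-language-modeling objective BERT introduced is easy to sketch. The toy function below is only an illustration, not Google's implementation (real BERT also replaces selected tokens with random or unchanged tokens in an 80/10/10 split): it hides roughly 15% of tokens and keeps the originals as prediction targets.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Toy BERT-style masking: hide ~15% of tokens, keep originals as labels."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append(mask_token)  # model must predict the original token here
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)        # no loss is computed on unmasked positions
    return masked, labels

masked, labels = mask_tokens("the cat sat on the mat".split())
print(masked)
```

The model is trained only on the masked positions, which is what lets both left and right context inform each prediction.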
Model · LLMs

Gemini 2.5 Pro

by Google DeepMind

Google DeepMind's flagship thinking model with native multimodal understanding across text, images, audio, and video. Excels at complex reasoning, code generation, and agentic tasks with a million-token context window.

llm · reasoning · multimodal
76.2B+
Model · Computer Vision

Stable Diffusion XL

by Stability AI

Stability AI's high-resolution image generation model producing photorealistic and artistic images at 1024x1024 resolution. Features a two-stage architecture with a base model and refiner for enhanced detail and compositional quality.

image-generation · diffusion · open-source
74.4B+
Model · LLMs

GPT-4 Turbo

by OpenAI

An optimized variant of GPT-4 offering a 128K context window, faster inference, and significantly reduced costs. Introduced JSON mode and improved function calling, making it the preferred GPT-4 variant for production applications.

llm · reasoning · multimodal
74.3B+
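JSON mode and function calling pair naturally: the model emits a machine-parseable JSON object naming a tool and its arguments, and the application executes it. A minimal sketch of the application-side dispatch step (the tool registry and the call shape here are hypothetical illustrations, not OpenAI's actual SDK):

```python
import json

# Hypothetical registry of tools the model is allowed to call.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by the model and run the matching tool."""
    call = json.loads(model_output)  # JSON mode guarantees parseable output
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

print(dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}'))  # Sunny in Paris
```

In a real loop, the tool's return value is sent back to the model so it can compose a final answer.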
Model · LLMs

Llama 3.1 70B

by Meta

Meta's workhorse open-source model with 70B parameters, 128K context window, and native tool-use support. Widely deployed as a cost-effective alternative to proprietary frontier models.

llm · open-source · large-model
73.5B+
Model · LLMs

DeepSeek-V3

by DeepSeek

DeepSeek's frontier-class MoE model with 671B total parameters and 37B active, trained using FP8 mixed precision for unprecedented cost efficiency. Matches or exceeds GPT-4o and Claude 3.5 Sonnet on key benchmarks.

llm · open-source · moe
72.8B+
Model · LLMs

o1

by OpenAI

OpenAI's first reasoning model that uses extended internal chain-of-thought before responding. Achieves expert-level performance on competitive math (AIME), PhD-level science (GPQA), and complex coding tasks through deliberative alignment.

llm · reasoning · chain-of-thought
72.6B+
Model · Speech & Audio AI

ElevenLabs Turbo v2.5

by ElevenLabs

ElevenLabs Turbo v2.5 is a low-latency multilingual text-to-speech model optimized for real-time conversational AI applications, offering sub-400ms first-audio latency while maintaining the high voice cloning fidelity ElevenLabs is known for across 32 languages. It powers a wide range of AI assistant, customer service, and interactive voice applications where natural-sounding, real-time speech is critical.

text-to-speech · voice-cloning · low-latency
72.4B+
Model · LLMs

Llama 3.1 405B

by Meta

The largest openly available language model at 405 billion parameters, rivaling proprietary frontier models in reasoning and knowledge. A landmark release demonstrating open-source models can match closed alternatives.

llm · open-source · frontier
72.2B+
Model · Computer Vision

DALL-E 3

by OpenAI

OpenAI's most advanced image generation model with native ChatGPT integration. Features dramatically improved prompt following, text rendering, and safety mitigations compared to DALL-E 2, generating high-fidelity images from natural language descriptions.

image-generation · text-to-image · creative
72.2B+
Model · LLMs

Claude 4 Sonnet

by Anthropic

Anthropic's balanced Claude 4 generation model delivering strong coding and reasoning at competitive pricing. Features improved agentic capabilities and extended thinking, offering a compelling mid-tier option between Haiku and Opus.

llm · coding · multimodal
72.2B+
Model · LLMs

Llama 3 70B

by Meta

Meta's high-performance 70B parameter model closing the gap with proprietary frontier models. Achieved competitive results on major benchmarks while remaining fully open-source.

llm · open-source · large-model
72.05B+
Model · LLMs

Claude 4.5 Sonnet

by Anthropic

Anthropic's most advanced Sonnet-tier model, combining frontier intelligence with practical speed and cost. Features state-of-the-art coding performance, improved extended thinking, and robust agentic capabilities for complex multi-step workflows.

llm · coding · multimodal
71.1B+
Model · LLMs

GPT-2

by OpenAI

GPT-2 is OpenAI's 2019 autoregressive language model that demonstrated for the first time that large-scale unsupervised pre-training on internet text could produce coherent, fluent long-form text generation with zero-shot task performance. Its staged release, with the full model initially withheld, sparked global debate about AI safety and responsible disclosure of capable AI systems.

foundational · openai · autoregressive
70.8B+
Model · LLMs

Gemini 2.5 Flash

by Google DeepMind

Google DeepMind's fast thinking model optimized for speed and cost efficiency while retaining strong reasoning capabilities. Supports a million-token context window with native multimodal input.

llm · fast-inference · multimodal
70.7B+
Model · LLMs

Gemini 2.0 Flash

by Google

Google's next-generation fast model built for the agentic era, featuring native tool use, multimodal generation, and real-time streaming. Outperforms Gemini 1.5 Pro on key benchmarks while maintaining Flash-tier speed and cost efficiency.

llm · fast · multimodal
70.7B+
Model · Other

AlphaFold 3

by Google DeepMind

AlphaFold 3 is Google DeepMind's third-generation protein structure prediction model that extends beyond proteins to predict the structures of DNA, RNA, and small molecules and their interactions. It represents a revolutionary tool for drug discovery and structural biology, dramatically accelerating our understanding of molecular machines that underpin life.

foundational · deepmind · protein-structure
70.6B+
Model · Speech & Audio AI

Google WaveNet

by Google / DeepMind

Google WaveNet is DeepMind's pioneering generative model for raw audio waveforms that dramatically advanced the state of the art in text-to-speech naturalness when published in 2016 and continues to power Google Assistant, Google Cloud TTS, and various Google products at massive scale. Its autoregressive waveform generation approach established the template for neural vocoder research and inspired a generation of TTS architectures.

text-to-speech · wavenet · google
70.5B+
Model · LLMs

Mistral 7B

by Mistral AI

Mistral AI's breakthrough 7B parameter model that outperformed Llama 2 13B across all benchmarks at launch. Introduced sliding window attention and grouped-query attention for efficient inference.

llm · open-source · small-model
70.4B+
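Sliding window attention restricts each token to attending over only the most recent W positions, which caps attention cost for long sequences. A toy mask construction showing the idea (an illustration only, not Mistral's implementation):

```python
def sliding_window_mask(seq_len: int, window: int):
    """Causal mask where each position attends only to the last `window` tokens (itself included)."""
    return [
        [1 if q - window < k <= q else 0 for k in range(seq_len)]
        for q in range(seq_len)
    ]

# Each row is a query position; 1 marks the keys it may attend to.
for row in sliding_window_mask(5, window=2):
    print(row)
```

Because each row has at most `window` ones, attention work per token is constant in sequence length; stacking layers still lets information propagate further than the window.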
Model · LLMs

Gemini 1.5 Pro

by Google

Google's mid-size multimodal model featuring a groundbreaking 2 million token context window using mixture-of-experts architecture. Excels at long-document understanding, video analysis, and cross-modal reasoning tasks that require processing large volumes of information.

llm · long-context · multimodal
70.4B+
Model · LLMs

GPT-4o mini

by OpenAI

OpenAI's most cost-efficient small model, replacing GPT-3.5 Turbo as the default lightweight option. Scores 82% on MMLU and outperforms GPT-4 on chat preferences while costing more than 60% less than GPT-3.5 Turbo.

llm · lightweight · cost-efficient
70.35B+
Model · Computer Vision

FLUX 1.1 Pro

by Black Forest Labs

FLUX 1.1 Pro from Black Forest Labs is a next-generation text-to-image model built by the original creators of Stable Diffusion, offering superior prompt comprehension, anatomical accuracy, and photorealistic detail. The FLUX.1 family sets a new standard for open-weights image models, with Pro (API-only), Dev, and Schnell variants covering different use cases.

image-generation · text-to-image · open-source
70.1B+
Model · LLMs

T5

by Google

T5 (Text-To-Text Transfer Transformer) is Google's 2019 framework that reframes all NLP tasks as text-to-text problems, allowing a single model to be trained on a unified mixture of tasks. Its clean formulation and the C4 dataset became foundational references for multitask learning research, and T5 variants remain widely used in production and research.

foundational · google · encoder-decoder
69.7B
Model · LLMs

GPT-4V

by OpenAI

OpenAI's multimodal extension of GPT-4 with native vision capabilities for image understanding, OCR, and visual reasoning. Processes interleaved text and images for tasks ranging from chart analysis to visual question answering.

multimodal · vision · openai
69.6B
Model · Speech & Audio AI

Suno V3.5

by Suno AI

Suno V3.5 is a text-to-song AI model that generates complete, radio-quality music tracks with vocals, instrumentation, and song structure directly from natural language prompts or custom lyrics. It supports an enormous range of genres and styles and is widely regarded as the most accessible and highest-quality text-to-music system for non-musicians.

music-generation · text-to-music · vocals
69.4B
Model · LLMs

Mixtral 8x7B

by Mistral AI

Mistral AI's sparse mixture-of-experts model using 8 expert networks of 7B parameters each, activating only 2 per token. Matches GPT-3.5 performance while using a fraction of the compute at inference.

llm · open-source · moe
69.4B
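The sparse routing behind this design is straightforward to sketch: a small router scores all 8 experts for each token, and only the top 2 are executed, their outputs mixed by softmax-normalized weights. A toy version of the routing step (illustrative, not Mistral AI's code):

```python
import math

def top2_route(logits):
    """Toy MoE router: pick the 2 highest-scoring experts, softmax-normalize their weights."""
    top2 = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    exp = [math.exp(logits[i]) for i in top2]
    total = sum(exp)
    return [(i, e / total) for i, e in zip(top2, exp)]

# 8 router scores for one token; only experts 3 and 5 would actually run.
print(top2_route([0.1, -1.2, 0.0, 2.0, 0.3, 1.5, -0.5, 0.2]))
```

This is why a model with many total parameters can run at the compute cost of a much smaller dense model: most experts stay idle for any given token.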
Model · LLMs

Qwen 2.5 72B

by Alibaba Cloud

The flagship open-weight model in the Qwen 2.5 series, offering substantial improvements in reasoning, instruction following, and structured output over its predecessor. Supports 128K context with strong performance across 29+ languages.

llm · multilingual · open-weight
69.3B
Model · LLMs

DeepSeek Coder V3

by DeepSeek

DeepSeek Coder V3 is DeepSeek's third-generation code-specialized model, trained on over 2 trillion tokens of code and natural language with a mixture-of-experts architecture. It achieves state-of-the-art performance on major coding benchmarks, surpassing GPT-4o and Claude 3.5 Sonnet on several code generation tasks.

deepseek · code · open-source
69.2B
Model · LLMs

Llama 3.3 70B

by Meta

Meta's refined 70B model delivering performance comparable to the much larger 405B variant through improved training techniques. Offers the best performance-to-cost ratio in the Llama family.

llm · open-source · large-model
68.95B
Model · LLMs

Llama 3 8B

by Meta

Meta's third-generation compact language model with significantly improved performance over Llama 2 at the same size class. Features an expanded 128K token vocabulary and improved tokenizer.

llm · open-source · small-model
68.9B
Model · LLMs

o3-mini

by OpenAI

A compact and cost-efficient reasoning model that delivers strong STEM performance at a fraction of o3's cost. Supports configurable reasoning effort (low/medium/high) to balance speed and accuracy for different use cases.

llm · reasoning · cost-efficient
68.5B
Model · LLMs

Claude 3 Opus

by Anthropic

Anthropic's most intelligent model at launch of the Claude 3 family, excelling at highly complex tasks requiring deep reasoning and nuanced understanding. Set new benchmarks in graduate-level reasoning and demonstrated near-human comprehension across academic subjects.

llm · reasoning · multimodal
68.5B
Model · LLMs

Llama 2 70B

by Meta

Meta's largest Llama 2 variant with 70 billion parameters delivering substantially improved reasoning and knowledge over the 7B version. Became the de facto open-source baseline for LLM research.

llm · open-source · large-model
68.4B
Model · LLMs

Llama 2 7B

by Meta

Llama 2 7B is an open-source 7 billion parameter large language model developed by Meta. Optimized for dialogue and general text generation, its permissive license and manageable size have made it a popular foundational model for fine-tuning, research, and building custom NLP applications.

llm · open-source · meta-ai
68.3B
Model · Computer Vision

Sora

by OpenAI

Sora is a text-to-video diffusion transformer model by OpenAI that generates high-fidelity, minute-long videos from textual prompts. It demonstrates an advanced understanding of language and the physical world, enabling complex scenes with multiple characters, specific motions, and coherent narratives.

video-generation · text-to-video · openai
68B
Model · LLMs

Llama 3.1 8B

by Meta

Llama 3.1 8B is a compact, open-source language model from Meta, featuring a 128K token context window and native tool-use capabilities. It is optimized for high performance in instruction-following and reasoning tasks, making it a cost-effective solution for scalable, on-device, or resource-constrained applications.

llm · open-source · small-model
67.9B
Model · Computer Vision

Stable Diffusion 3

by Stability AI

Stable Diffusion 3 is a powerful text-to-image model using a Multimodal Diffusion Transformer (MMDiT) architecture. It excels at generating images with unprecedented text quality, adhering closely to complex prompts, and achieving high photorealism and compositional accuracy compared to its predecessors.

image-generation · diffusion · text-to-image
67.55B
Model · Speech & Audio AI

Azure Neural TTS

by Microsoft

Azure Neural TTS is Microsoft's enterprise-grade text-to-speech service, part of Azure AI Speech. It provides 400+ natural-sounding voices across 140+ languages, with detailed prosody control via SSML. The service is designed for scalable applications, from accessibility tools to customer service bots.

text-to-speech · neural-tts · azure-ai
67.2B
Model · Computer Vision

Adobe Firefly 3

by Adobe

Adobe Firefly 3 is a commercially safe generative image model trained exclusively on licensed Adobe Stock and public-domain content, making it uniquely suitable for professional and enterprise creative workflows. Its deep integration with Photoshop, Illustrator, and Express enables AI-powered generation directly within industry-standard design tools.

image-generation · text-to-image · commercial-safe
66.8B
Model · LLMs

Codex-2

by OpenAI

Codex-2 is OpenAI's second-generation code-specialized model, significantly advancing code completion, synthesis, and debugging over the original Codex. It underpins GitHub Copilot's next-generation features and supports a wider range of programming languages and frameworks.

openai · code · code-generation
66.8B
Model · LLMs

ClinicalBERT

by Kexin Huang et al. (Academic)

ClinicalBERT is a BERT-based model pre-trained on clinical notes from the MIMIC-III dataset. It provides a deep contextual understanding of electronic health record (EHR) text and clinical documentation, serving as a foundational model for various clinical natural language processing tasks.

clinical-nlp · transformer-model · bert
66.4B
Model · LLMs

Gemini 2.5 Ultra

by Google DeepMind

Gemini 2.5 Ultra is Google DeepMind's most capable model in the 2.5 generation, designed for the most demanding reasoning, coding, and multimodal tasks. It features an extended context window and advanced chain-of-thought capabilities surpassing prior Gemini variants.

google · deepmind · frontier
66B
Model · LLMs

Claude Opus 4

by Anthropic

Anthropic's most capable model in the Claude 4 generation, designed for the most demanding reasoning, analysis, and agentic tasks. Excels at complex multi-step problems requiring deep understanding and sustained coherence across long contexts.

llm · reasoning · frontier
65.8B
Model · LLMs

Gemini 1.5 Flash

by Google

Google's lightweight and fast multimodal model optimized for high-volume, cost-sensitive workloads. Supports a 1 million token context window with natively multimodal capabilities across text, image, audio, and video at a fraction of Pro's cost.

llm · fast · multimodal
65.6B
Model · LLMs

Cohere Embed v3

by Cohere

Cohere's state-of-the-art embedding model supporting 100+ languages with native int8 and binary quantization for efficient storage. Produces high-quality vector representations optimized for search, classification, and clustering tasks.

embeddings · semantic-search · rag
65.6B
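Embedding models like this are typically used by comparing vectors with cosine similarity. A minimal nearest-neighbor search over toy 2-dimensional vectors (real embeddings have hundreds or thousands of dimensions; the document names and vectors here are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query_vec, corpus):
    """Rank (name, vector) pairs by similarity to the query embedding."""
    return sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)

corpus = [("doc_a", [0.9, 0.1]), ("doc_b", [0.1, 0.9]), ("doc_c", [0.7, 0.3])]
ranked = search([1.0, 0.0], corpus)
print([name for name, _ in ranked])  # ['doc_a', 'doc_c', 'doc_b']
```

Int8 or binary quantization shrinks the stored vectors; the ranking logic stays the same.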
Model · LLMs

Grok-3

by xAI

Grok-3 is xAI's frontier model, delivering state-of-the-art performance in math, science, and coding. Trained on the Colossus supercluster, it features DeepSearch for multi-step research and a 'Think' mode for extended chain-of-thought reasoning, enabling it to tackle complex, real-world problems with access to real-time information.

llm · frontier-model · reasoning-engine
65.55B
Model · AI for Code

DeepSeek-Coder-V2

by DeepSeek

DeepSeek-Coder-V2 is a powerful open-source Mixture-of-Experts (MoE) model specialized in code. It supports 338 programming languages and features advanced fill-in-the-middle capabilities, offering performance comparable to top-tier proprietary models like GPT-4 Turbo at a significantly lower inference cost.

code-generation · open-source · moe
65.4B
Model · LLMs

Claude 3.5 Haiku

by Anthropic

Anthropic's fastest, most affordable model in the 3.5 generation, offering performance comparable to Claude 3 Opus. It excels at coding, complex workflows, and agentic tasks due to its advanced tool-use capabilities and speed, making it ideal for high-throughput applications and enterprise automation.

llm · fast · cost-efficient
65.4B
Model · Computer Vision

Runway Gen-3 Alpha

by Runway

Runway Gen-3 Alpha is a professional-grade video generation model for high-fidelity, temporally consistent clips. It offers fine-grained control over motion, style, and camera behavior via text and image inputs, making it a key tool in professional film and advertising workflows for meeting commercial standards.

video-generation · text-to-video · image-to-video
65.3B
Model · LLMs

Qwen 2 72B

by Alibaba Cloud

Qwen2-72B is a 72-billion parameter large language model from Alibaba's Qwen2 series. It offers state-of-the-art performance, particularly in multilingual understanding, reasoning, and coding tasks. As an open-weight model, it provides a powerful alternative to proprietary systems for a wide range of applications.

llm · open-weight · multilingual
65.2B
Model · LLMs

Claude 3 Sonnet

by Anthropic

The balanced mid-tier model in the Claude 3 family, offering a strong combination of speed and intelligence. Provides enterprise-grade performance for coding, analysis, and content generation at moderate cost.

llm · balanced · multimodal
65.2B
Model · Other

AlphaGo

by Google DeepMind

AlphaGo is a landmark AI from DeepMind that mastered the game of Go. It combines deep neural networks with Monte Carlo Tree Search and reinforcement learning, famously defeating world champion Lee Sedol in 2016. Its success demonstrated AI's ability to tackle complex problems requiring strategic planning.

foundational · deepmind · reinforcement-learning
64.8B
Model · AI for Code

Qwen 2.5 Coder 32B

by Alibaba Cloud

Qwen 2.5 Coder 32B is an open-weight, code-specialized large language model from Alibaba Cloud. Fine-tuned on a massive corpus covering over 92 programming languages, it excels at code generation, completion, and debugging tasks, demonstrating performance on par with or exceeding proprietary models like GPT-4o on several benchmarks.

code-llm · open-weight · code-generation
64.7B
Model · LLMs

Claude 3 Haiku

by Anthropic

Claude 3 Haiku is Anthropic's fastest, most compact model, excelling at near-instant responsiveness. It handles a wide range of tasks, including multimodal vision, with strong performance at a low cost, making it ideal for high-throughput applications like content moderation and customer service.

llm · high-speed · cost-efficient
64.7B
Model · Speech & Audio AI

MusicGen

by Meta AI

MusicGen is an open-source text-to-music model from Meta AI that generates high-quality instrumental music from text descriptions. It can also be conditioned on a melody reference, providing a strong, controllable baseline for both research and commercial applications, trained on 20K hours of licensed music.

music-generation · text-to-music · open-source
64.5B
Model · LLMs

Mixtral 8x22B

by Mistral AI

Mixtral 8x22B is a large-scale, open-source Mixture-of-Experts (MoE) model from Mistral AI. It features 141 billion total parameters but only activates 39 billion per token, balancing immense power with efficiency. The model excels at reasoning, code generation, and multilingual tasks, and includes native function calling capabilities.

llm · open-source · moe
64.4B
Model · LLMs

Mistral Large

by Mistral AI

Mistral Large is Mistral AI's flagship proprietary model, offering top-tier reasoning and multilingual capabilities. It is designed to compete with other frontier models like GPT-4, excelling in complex tasks that require deep understanding. Its native function calling and fluency in over 30 languages make it highly versatile for enterprise-grade applications.

llm · proprietary-model · api-access
64.3B
Model · AI for Code

Code Llama 34B

by Meta

Code Llama 34B is a large language model from Meta, fine-tuned from Llama 2 for code-specific tasks. It excels at generating, completing, and explaining code across various languages. With variants supporting a 100K token context window, it can analyze and work with extensive codebases for complex tasks like refactoring.

code-llm · open-source · code-generation
64.3B
Model · Embeddings

Multilingual-E5-Large

by Microsoft Research

Multilingual-E5-Large is a powerful text embedding model from Microsoft supporting 100 languages. Trained on billions of text pairs using contrastive learning, it excels at cross-lingual information retrieval and semantic similarity, establishing a strong open-source baseline for multilingual NLP tasks.

text-embedding · multilingual · cross-lingual
64.2B
Model · LLMs

Med-PaLM 2

by Google

Med-PaLM 2 is Google's large language model specialized for the medical domain. It achieves expert-level performance on medical licensing exams (USMLE) by leveraging advanced clinical reasoning and question-answering capabilities. The model is designed to generate accurate and helpful responses for healthcare professionals.

medical-ai · clinical-decision-support · llm
64.2B
Model · Multimodal

Qwen2.5-VL-72B

by Alibaba Cloud (Qwen Team)

Qwen2.5-VL-72B is Alibaba's flagship open vision-language model at 72 billion parameters, achieving top-tier performance on visual understanding benchmarks including chart analysis, document parsing, and fine-grained image understanding. It supports dynamic resolution image inputs and video understanding with native high-resolution processing.

alibaba · qwen · vision-language
64B
Model · LLMs

GPT-4.5

by OpenAI

GPT-4.5 is a large language model from OpenAI, released as a research preview ahead of GPT-5. It focuses on scaling unsupervised learning to significantly reduce hallucinations and enhance factual accuracy, and is designed for improved creative writing and greater emotional intelligence in its responses.

llm · reasoning · multimodal
64B
Model · LLMs

Phi-3.5-mini

by Microsoft

Phi-3.5-mini is a 3.8B parameter instruction-tuned model from Microsoft, optimized for edge and mobile devices. Despite its compact size, it delivers performance comparable to much larger models on benchmarks for reasoning, coding, and language tasks, making it highly efficient for on-device AI applications.

small-language-model · on-device-ai · edge-computing
63.9B
Model · LLMs

o1-mini

by OpenAI

A smaller, faster, and more affordable reasoning model optimized for STEM tasks. Delivers 80% of o1's reasoning capability at roughly 80% lower cost, making it ideal for high-volume coding and math workloads.

llm · reasoning · math
63.9B
Model · LLMs

PaLM

by Google

PaLM (Pathways Language Model) is Google's 540 billion parameter language model trained using the Pathways system across 6,144 TPU v4 chips, demonstrating breakthrough capabilities on chain-of-thought reasoning, code generation, and multilingual tasks. It introduced the concept of 'discontinuous' capability jumps at scale and set new benchmarks on hundreds of NLP tasks upon release in 2022.

foundational · google · pathways
63.3B
Model · Computer Vision

Ideogram 2

by Ideogram AI

Ideogram 2 is a text-to-image model renowned for its superior ability to render legible and accurate text within generated images. It excels at creating high-quality photorealistic and artistic visuals with strong prompt adherence, making it a powerful tool for design, branding, and creative projects.

text-to-image · image-generation · typography
63.2B
Model · Speech & Audio AI

Amazon Polly Neural

by Amazon Web Services

Amazon Polly is a cloud-based text-to-speech (TTS) service from AWS that produces highly natural-sounding human speech using neural engine technology. It supports over 30 languages with both standard and neural voices, offering deep integration with the AWS ecosystem for scalable production applications.

text-to-speech · cloud-tts · enterprise
63B
Model · LLMs

Claude Opus 4.5

by Anthropic

Claude Opus 4.5 is Anthropic's frontier AI model, delivering state-of-the-art performance in complex reasoning, creative tasks, and nuanced understanding. It features advanced multimodal vision capabilities for analyzing images and documents, along with extended thinking for multi-step, agentic tasks.

llm · frontier-model · multimodal-ai
62.9B
Model · Speech & Audio AI

TTS-1

by OpenAI

OpenAI's TTS-1 is a text-to-speech model designed for real-time audio generation. It provides six distinct, natural-sounding preset voices and supports low-latency streaming, making it ideal for interactive applications. A higher-quality variant, tts-1-hd, is available for tasks where audio fidelity is prioritized over speed.

text-to-speech · voice-synthesis · audio-generation
62.7B
Model · LLMs

Command R+

by Cohere

Cohere's most capable RAG-optimized model, offering significantly enhanced reasoning, multi-step tool use, and superior grounded generation over Command R. Designed for complex enterprise workflows requiring high accuracy and citations.

llm · rag · enterprise
62.7B
Model · Computer Vision

Imagen 3

by Google DeepMind

Google DeepMind's highest-quality text-to-image generation model producing photorealistic images with improved detail, lighting, and fewer artifacts. Features enhanced prompt understanding and safety filtering.

image-generation · diffusion · text-to-image
62.65B
Model · LLMs

Qwen 2.5 Max

by Alibaba Cloud

Alibaba Cloud's most capable proprietary model in the Qwen 2.5 family, optimized for complex reasoning and enterprise applications. Available exclusively through Alibaba Cloud's Model Studio API with enhanced safety and alignment.

llm · proprietary · reasoning
62.6B
Model · Speech & Audio AI

AudioCraft

by Meta AI

AudioCraft is an open-source generative audio framework from Meta AI. It integrates MusicGen for music, AudioGen for sound effects, and the EnCodec codec into a single platform. This unified, modular design allows for text-to-audio generation and has become a key reference for audio LLM research.

audio-generation · music-generation · sound-effects
62.6B
Model · LLMs

LegalBERT

by Ilias Chalkidis et al. (Academic)

LegalBERT is a family of BERT models pre-trained on a diverse corpus of English legal texts, including legislation, court cases, and contracts. This specialized training allows it to significantly outperform general-purpose BERT models on downstream legal NLP tasks, establishing it as a foundational baseline for legal AI research and applications.

legal-tech · bert · transformer-model
62.5B
Model · LLMs

Gemma 2 9B

by Google DeepMind

Gemma 2 9B is a lightweight, state-of-the-art open model from Google, part of the next generation of the Gemma family. It offers strong performance for its size class, making it ideal for environments with limited computational resources. Built on a new architecture, it is optimized for on-device applications, research, and fine-tuning.

llm · open-weights · small-model
62.2B
Model · LLMs

QwQ-32B

by Alibaba / Qwen Team

QwQ-32B is a 32 billion parameter language model from Alibaba, specifically optimized for complex reasoning tasks. It utilizes a deep chain-of-thought methodology to excel at mathematical, scientific, and logical problems, achieving performance comparable to much larger models and showcasing high parameter efficiency.

reasoning · qwen · alibaba
62B
Model · LLMs

BLOOM

by BigScience Workshop

BLOOM is a 176 billion parameter, open-access multilingual language model developed by the BigScience research workshop. Trained on 46 natural languages and 13 programming languages, it provides powerful text and code generation capabilities, making it a key resource for researchers and developers building multilingual AI applications.

foundational-model · bigscience · multilingual
61.6B
Model · AI for Code

StarCoder2 15B

by BigCode (ServiceNow + Hugging Face)

StarCoder2 15B is a powerful open-source code generation model from the BigCode project. Trained on The Stack v2 dataset spanning over 600 programming languages, it excels at code completion, generation, and fill-in-the-middle tasks, emphasizing data transparency and author opt-out.

code-llm · open-source · code-generation
61.5B
Model · LLMs

Phi-3 Mini

by Microsoft

Microsoft's Phi-3 Mini is a 3.8 billion parameter small language model (SLM) designed for high performance on resource-constrained devices. Despite its compact size, it exhibits strong reasoning and language understanding capabilities, making it suitable for on-device and edge AI applications. It is optimized for efficient inference.

slm · open-weight · edge-ai
61.5B
Model · LLMs

Cohere Rerank v3

by Cohere

Cohere Rerank v3 is a state-of-the-art neural model designed to significantly boost the relevance of search results for Retrieval-Augmented Generation (RAG) systems. It re-scores a list of candidate documents from any keyword or vector search system, identifying the most pertinent information. It supports over 100 languages and can process long documents, making it highly versatile.

reranking · search · rag
61.45B
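The retrieve-then-rerank pattern can be sketched as follows. The token-overlap scorer below is a deliberately crude stand-in for the neural cross-encoder a real reranker uses, and the query and document strings are made up:

```python
def overlap_score(query: str, doc: str) -> float:
    """Stand-in relevance score (token overlap); a real reranker uses a neural cross-encoder."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank(query, candidates, top_n=2):
    """Re-score first-stage candidates and keep the best `top_n`, as a reranker would."""
    scored = sorted(candidates, key=lambda d: overlap_score(query, d), reverse=True)
    return scored[:top_n]

candidates = [
    "tax rules for freelancers",
    "how to file income tax as a freelancer",
    "best laptops for programmers",
]
print(rerank("freelancer income tax filing", candidates))
```

The first stage (keyword or vector search) optimizes recall; the reranker re-scores that shortlist for precision before the documents reach the generator.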
Model · AI for Code

DeepSeek Coder 33B

by DeepSeek

DeepSeek Coder 33B is a dense, open-source large language model specializing in code-related tasks. Trained from scratch on a massive 2 trillion token dataset of code and natural language, it understands project-level context and supports 87 different programming languages for advanced code generation and completion.

code-generation · open-source · dense-model
61.2B
Model · LLMs

Llama 3.2 11B Vision

by Meta

Llama 3.2 11B Vision is Meta's first open-source multimodal model, integrating native image understanding with advanced text generation. At a compact 11B parameters, it's designed for efficiency, enabling visual question answering, image captioning, and complex reasoning across text and images in a single, deployable model.

llm · open-source · multimodal
60.8B
Model · LLMs

DeepSeek-V2

by DeepSeek

DeepSeek's mixture-of-experts model introducing Multi-head Latent Attention (MLA) for dramatically reduced inference cost. Activates 21B of its 236B total parameters per token while matching larger dense models.
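The sparse-activation idea behind mixture-of-experts models like this one is that a router scores every expert for each token but only the top-k actually run, so per-token compute scales with k rather than with the total expert count. A toy sketch of that routing step (arbitrary sizes and scores; not DeepSeek's actual router or its MLA attention mechanism):

```python
import math


def top_k_routing(logits: list[float], k: int = 2) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their gate weights.

    Only these k experts execute for this token -- the source of
    "activates 21B of 236B total parameters" style claims.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exp_scores = [math.exp(logits[i]) for i in top]
    total = sum(exp_scores)
    return [(i, e / total) for i, e in zip(top, exp_scores)]


# 8 experts, but only 2 receive this token.
gates = top_k_routing([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)
```

Each selected expert's output is then weighted by its renormalized gate value and summed, which is why total parameters and active parameters are reported as separate numbers for MoE models.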

llm · open-source · moe
60.8B
Model · AI for Code

Codestral

by Mistral AI

Codestral is Mistral AI's open-weight generative model explicitly designed for code generation tasks. Trained on a diverse dataset of over 80 programming languages, it excels at code completion, generation, and its unique fill-in-the-middle capability. It is optimized for low-latency performance in real-world applications.

code-generation · open-weight · fill-in-middle
60.65B
Model · LLMs

Gemma 2 27B

by Google DeepMind

Gemma 2 27B is a powerful, mid-sized open-weights model from Google DeepMind. It delivers significant performance gains in reasoning, coding, and instruction following over smaller variants. Designed for server-side deployment, it provides a strong foundation for advanced research and custom fine-tuning projects.

llm · open-weights · google
60.25B
Model · LLMs

Claude 4.5 Haiku

by Anthropic

Claude 4.5 Haiku is Anthropic's fastest and most compact model, engineered for near-instant responsiveness and high-throughput workloads. It provides enterprise-grade performance at a fraction of the cost, making it ideal for real-time interactions, content moderation, and cost-effective agentic tasks.

llm · anthropic · claude-4.5-haiku
60.1B
Model · Speech & Audio AI

XTTS-v2

by Coqui AI

XTTS-v2 is an open-source, cross-lingual text-to-speech model from Coqui AI. It excels at high-quality voice cloning from just a few seconds of audio and supports 17 languages. With real-time streaming inference, it's ideal for applications needing custom voices and low-latency output.

text-to-speech · voice-cloning · multilingual-tts
59.6B+
Model · LLMs

BloombergGPT

by Bloomberg

BloombergGPT is a 50-billion parameter large language model developed by Bloomberg. It is specifically trained on a massive, curated corpus of financial data accumulated over decades, combined with general-purpose datasets. This specialized training allows it to excel at financial natural language processing tasks, outperforming similarly sized general models.

finance · financial-nlp · domain-specific
59.6B+
Model · LLMs

Grok-2

by xAI

Grok-2 is xAI's second-generation large language model, notable for its real-time knowledge access through the X platform. It possesses strong reasoning and multimodal capabilities, including vision understanding. The model is designed for a more natural, conversational interaction style with a lower tendency to refuse prompts.

large-language-model · generative-ai · xai
59.45B+
Model · LLMs

BioGPT

by Microsoft Research

BioGPT is a domain-specific language model from Microsoft, pre-trained on a massive corpus of biomedical literature from PubMed. It excels at tasks like generating biomedical text, extracting relationships between entities, and answering questions based on medical research, achieving state-of-the-art results on several benchmarks.

biomedical · nlp · pubmed
59.1B+
Model · LLMs

Command R

by Cohere

Command R is a retrieval-optimized language model from Cohere, specifically designed for enterprise-grade Retrieval-Augmented Generation (RAG) and tool use. It excels in multilingual applications, supporting over 10 languages, and features built-in capabilities for grounding responses and generating citations to ensure accuracy.
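Grounded generation of this kind typically works by injecting the retrieved snippets into the prompt with stable identifiers the model can cite back. The generic sketch below shows that prompt-assembly pattern; it is the common RAG convention, not Cohere's actual chat API, whose `documents` parameter handles grounding and citation internally:

```python
def build_grounded_prompt(question: str, snippets: list[str]) -> str:
    """Number each retrieved snippet so the model can cite it as [1], [2], ...

    The identifiers let a downstream system map generated citations
    back to their source documents for verification.
    """
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Answer using only the documents below and cite them as [n].\n\n"
        f"{numbered}\n\n"
        f"Question: {question}\n"
    )


print(build_grounded_prompt(
    "When was the product launched?",
    ["The product launched in March 2021.", "Revenue grew 40% in 2022."],
))
```

Models tuned for RAG, like Command R, are trained to follow this contract — answering strictly from the supplied documents and emitting citation markers — rather than falling back on parametric knowledge.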

llm · rag · enterprise-ai
59.05B+
Model · LLMs

Gemma 2B

by Google DeepMind

Gemma 2B is Google DeepMind's open-weight 2 billion parameter language model from the Gemma family, designed for lightweight deployment on devices with limited resources. It delivers strong performance for its size on language understanding and generation tasks, and serves as a foundation for fine-tuning on domain-specific tasks.

google · small · edge
59B+
Model · Computer Vision

Pika 1.5

by Pika Labs

Pika 1.5 is an accessible AI video generation model that transforms text prompts or images into high-quality videos. It is known for its expressive motion, diverse cinematic styles, and unique features like physics-based effects and automated lip-sync, making it popular among creators and consumers.

video-generation · text-to-video · image-to-video
58.6B+