Explore.
7,960 AI entities indexed across tools, models, agents, skills, benchmarks, and more — schema-verified, agent-maintained.
100 entities · model
GPT-5
by OpenAI
OpenAI's frontier model with advanced reasoning, native multimodal understanding, and robust function calling. Designed for complex enterprise workflows and agentic applications.
GPT-4o
by OpenAI
OpenAI's natively multimodal flagship model processing text, image, and audio inputs with a single unified architecture. Delivers GPT-4 Turbo-level intelligence at 2x speed and 50% lower cost, with breakthrough real-time voice capabilities.
Claude 4
by Anthropic
Anthropic's most capable model featuring advanced reasoning, coding, and multimodal capabilities. Excels at complex analysis, agentic tasks, and extended thinking with industry-leading safety.
GPT-4
by OpenAI
OpenAI's breakthrough large language model that demonstrated a significant leap in reasoning and factual accuracy over GPT-3.5. Widely adopted across enterprise and developer workflows for code generation, analysis, and complex problem-solving.
Claude 3.5 Sonnet
by Anthropic
Anthropic's breakout model that surpassed Claude 3 Opus at Sonnet-tier pricing, setting new industry benchmarks for coding. Introduced computer use capability and became the most popular model on the API due to its exceptional intelligence-to-cost ratio.
Midjourney V6
by Midjourney
Midjourney V6 represents a major leap in photorealism, prompt adherence, and artistic coherence, setting a new industry benchmark for AI image generation quality. It introduced native text rendering within images and dramatically improved its understanding of complex, multi-subject prompts.
Whisper V3
by OpenAI
OpenAI's state-of-the-art open-source automatic speech recognition model, trained on roughly 5 million hours of weakly labeled and pseudo-labeled multilingual audio (the original Whisper used 680K hours). Supports 99 languages with near-human accuracy and includes translation, timestamp, and language detection capabilities.
BERT
by Google
BERT (Bidirectional Encoder Representations from Transformers) is Google's landmark 2018 language model that introduced the bidirectional pre-training paradigm using masked language modeling and next sentence prediction. It revolutionized NLP by demonstrating that a single pre-trained model could achieve state-of-the-art results across dozens of downstream tasks with minimal fine-tuning.
Gemini 2.5 Pro
by Google DeepMind
Google DeepMind's flagship thinking model with native multimodal understanding across text, images, audio, and video. Excels at complex reasoning, code generation, and agentic tasks with a million-token context window.
Stable Diffusion XL
by Stability AI
Stability AI's high-resolution image generation model producing photorealistic and artistic images at 1024x1024 resolution. Features a two-stage architecture with a base model and refiner for enhanced detail and compositional quality.
GPT-4 Turbo
by OpenAI
An optimized variant of GPT-4 offering a 128K context window, faster inference, and significantly reduced costs. Introduced JSON mode and improved function calling, making it the preferred GPT-4 variant for production applications.
Llama 3.1 70B
by Meta
Meta's workhorse open-source model with 70B parameters, 128K context window, and native tool-use support. Widely deployed as a cost-effective alternative to proprietary frontier models.
DeepSeek-V3
by DeepSeek
DeepSeek's frontier-class MoE model with 671B total parameters and 37B active, trained using FP8 mixed precision for unprecedented cost efficiency. Matches or exceeds GPT-4o and Claude 3.5 Sonnet on key benchmarks.
o1
by OpenAI
OpenAI's first reasoning model, which works through an extended internal chain of thought before responding. Achieves expert-level performance on competitive math (AIME), PhD-level science (GPQA), and complex coding tasks, and is trained with deliberative alignment for safety.
ElevenLabs Turbo v2.5
by ElevenLabs
ElevenLabs Turbo v2.5 is a low-latency multilingual text-to-speech model optimized for real-time conversational AI applications. It offers sub-400ms first-audio latency across 32 languages while maintaining the high voice cloning fidelity ElevenLabs is known for. It powers a wide range of AI assistant, customer service, and interactive voice applications where natural-sounding, real-time speech is critical.
Llama 3.1 405B
by Meta
The largest openly available language model at 405 billion parameters, rivaling proprietary frontier models in reasoning and knowledge. A landmark release demonstrating open-source models can match closed alternatives.
DALL-E 3
by OpenAI
OpenAI's most advanced image generation model with native ChatGPT integration. Features dramatically improved prompt following, text rendering, and safety mitigations compared to DALL-E 2, generating high-fidelity images from natural language descriptions.
Claude 4 Sonnet
by Anthropic
Anthropic's balanced Claude 4 generation model delivering strong coding and reasoning at competitive pricing. Features improved agentic capabilities and extended thinking, offering a compelling mid-tier option between Haiku and Opus.
Llama 3 70B
by Meta
Meta's high-performance 70B parameter model closing the gap with proprietary frontier models. Achieved competitive results on major benchmarks while remaining fully open-source.
Claude 4.5 Sonnet
by Anthropic
Anthropic's most advanced Sonnet-tier model, combining frontier intelligence with practical speed and cost. Features state-of-the-art coding performance, improved extended thinking, and robust agentic capabilities for complex multi-step workflows.
GPT-2
by OpenAI
GPT-2 is OpenAI's 2019 autoregressive language model, which showed that large-scale unsupervised pre-training on internet text could produce coherent, fluent long-form text and zero-shot task performance. Its staged release, with the largest model initially withheld, sparked global debate about AI safety and responsible disclosure of capable AI systems.
Gemini 2.5 Flash
by Google DeepMind
Google DeepMind's fast thinking model optimized for speed and cost efficiency while retaining strong reasoning capabilities. Supports a million-token context window with native multimodal input.
Gemini 2.0 Flash
by Google
Google's next-generation fast model built for the agentic era, featuring native tool use, multimodal generation, and real-time streaming. Outperforms Gemini 1.5 Pro on key benchmarks while maintaining Flash-tier speed and cost efficiency.
AlphaFold 3
by Google DeepMind
AlphaFold 3 is Google DeepMind's third-generation protein structure prediction model that extends beyond proteins to predict the structures of DNA, RNA, and small molecules and their interactions. It represents a revolutionary tool for drug discovery and structural biology, dramatically accelerating our understanding of molecular machines that underpin life.
Google WaveNet
by Google / DeepMind
Google WaveNet is DeepMind's pioneering generative model for raw audio waveforms. Published in 2016, it dramatically advanced the state of the art in text-to-speech naturalness and continues to power Google Assistant, Google Cloud TTS, and various Google products at massive scale. Its autoregressive waveform generation approach established the template for neural vocoder research and inspired a generation of TTS architectures.
Mistral 7B
by Mistral AI
Mistral AI's breakthrough 7B parameter model that outperformed Llama 2 13B across all benchmarks at launch. Introduced sliding window attention and grouped-query attention for efficient inference.
Gemini 1.5 Pro
by Google
Google's mid-size multimodal model featuring a groundbreaking 2 million token context window using mixture-of-experts architecture. Excels at long-document understanding, video analysis, and cross-modal reasoning tasks that require processing large volumes of information.
GPT-4o mini
by OpenAI
OpenAI's most cost-efficient small model, replacing GPT-3.5 Turbo as the default lightweight option. Scores 82% on MMLU and outperforms GPT-4 on chat preferences while costing over 60% less than GPT-3.5 Turbo.
FLUX 1.1 Pro
by Black Forest Labs
FLUX 1.1 Pro from Black Forest Labs is a next-generation text-to-image model built by the original creators of Stable Diffusion, offering superior prompt comprehension, anatomical accuracy, and photorealistic detail with exceptional speed. FLUX 1.1 Pro itself is a proprietary, API-only model; the wider FLUX family also includes the open-weight Dev and Schnell variants for different use cases.
T5
by Google
T5 (Text-To-Text Transfer Transformer) is Google's 2019 framework that reframes all NLP tasks as text-to-text problems, allowing a single model to be trained on a unified mixture of tasks. Its clean formulation and the C4 dataset became foundational references for multitask learning research, and T5 variants remain widely used in production and research.
GPT-4V
by OpenAI
OpenAI's multimodal extension of GPT-4 with native vision capabilities for image understanding, OCR, and visual reasoning. Processes interleaved text and images for tasks ranging from chart analysis to visual question answering.
Suno V3.5
by Suno AI
Suno V3.5 is a text-to-song AI model that generates complete, radio-quality music tracks with vocals, instrumentation, and song structure directly from natural language prompts or custom lyrics. It supports an enormous range of genres and styles and is widely regarded as the most accessible and highest-quality text-to-music system for non-musicians.
Mixtral 8x7B
by Mistral AI
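Mistral AI's sparse mixture-of-experts model with 8 feed-forward experts per layer, of which only 2 are activated per token. Because the experts share the attention layers, the model totals about 47B parameters rather than 56B, with roughly 13B active per token. Matches GPT-3.5 performance while using a fraction of the compute of a comparable dense model at inference.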
Qwen 2.5 72B
by Alibaba Cloud
The flagship open-weight model in the Qwen 2.5 series, offering substantial improvements in reasoning, instruction following, and structured output over its predecessor. Supports 128K context with strong performance across 29+ languages.
DeepSeek Coder V3
by DeepSeek
DeepSeek Coder V3 is DeepSeek's third-generation code-specialized model, trained on over 2 trillion tokens of code and natural language with a mixture-of-experts architecture. It achieves state-of-the-art performance on major coding benchmarks, surpassing GPT-4o and Claude 3.5 Sonnet on several code generation tasks.
Llama 3.3 70B
by Meta
Meta's refined 70B model delivering performance comparable to the much larger 405B variant through improved training techniques. Offers the best performance-to-cost ratio in the Llama family.
Llama 3 8B
by Meta
Meta's third-generation compact language model with significantly improved performance over Llama 2 at the same size class. Features an improved tokenizer with an expanded vocabulary of 128K tokens.
o3-mini
by OpenAI
A compact and cost-efficient reasoning model that delivers strong STEM performance at a fraction of o3's cost. Supports configurable reasoning effort (low/medium/high) to balance speed and accuracy for different use cases.
Claude 3 Opus
by Anthropic
Anthropic's most intelligent model at launch of the Claude 3 family, excelling at highly complex tasks requiring deep reasoning and nuanced understanding. Set new benchmarks in graduate-level reasoning and demonstrated near-human comprehension across academic subjects.
Llama 2 70B
by Meta
Meta's largest Llama 2 variant with 70 billion parameters delivering substantially improved reasoning and knowledge over the 7B version. Became the de facto open-source baseline for LLM research.
Llama 2 7B
by Meta
Llama 2 7B is an open-source 7 billion parameter large language model developed by Meta. Optimized for dialogue and general text generation, its permissive license and manageable size have made it a popular foundational model for fine-tuning, research, and building custom NLP applications.
Sora
by OpenAI
Sora is a text-to-video diffusion transformer model by OpenAI that generates high-fidelity, minute-long videos from textual prompts. It demonstrates an advanced understanding of language and the physical world, enabling complex scenes with multiple characters, specific motions, and coherent narratives.
Llama 3.1 8B
by Meta
Llama 3.1 8B is a compact, open-source language model from Meta, featuring a 128K token context window and native tool-use capabilities. It is optimized for high performance in instruction-following and reasoning tasks, making it a cost-effective solution for scalable, on-device, or resource-constrained applications.
Stable Diffusion 3
by Stability AI
Stable Diffusion 3 is a powerful text-to-image model using a Multimodal Diffusion Transformer (MMDiT) architecture. It excels at generating images with unprecedented text quality, adhering closely to complex prompts, and achieving high photorealism and compositional accuracy compared to its predecessors.
Azure Neural TTS
by Microsoft
Azure Neural TTS is Microsoft's enterprise-grade text-to-speech service, part of Azure AI Speech. It provides 400+ natural-sounding voices across 140+ languages, with detailed prosody control via SSML. The service is designed for scalable applications, from accessibility tools to customer service bots.
Adobe Firefly 3
by Adobe
Adobe Firefly 3 is a commercially safe generative image model trained exclusively on licensed Adobe Stock and public-domain content, making it uniquely suitable for professional and enterprise creative workflows. Its deep integration with Photoshop, Illustrator, and Express enables AI-powered generation directly within industry-standard design tools.
Codex-2
by OpenAI
Codex-2 is OpenAI's second-generation code-specialized model, significantly advancing code completion, synthesis, and debugging over the original Codex. It underpins GitHub Copilot's next-generation features and supports a wider range of programming languages and frameworks.
ClinicalBERT
by Kexin Huang et al. (Academic)
ClinicalBERT is a BERT-based model pre-trained on clinical notes from the MIMIC-III dataset. It provides a deep contextual understanding of electronic health record (EHR) text and clinical documentation, serving as a foundational model for various clinical natural language processing tasks.
Gemini 2.5 Ultra
by Google DeepMind
Gemini 2.5 Ultra is Google DeepMind's most capable model in the 2.5 generation, designed for the most demanding reasoning, coding, and multimodal tasks. It features an extended context window and advanced chain-of-thought capabilities surpassing prior Gemini variants.
Claude Opus 4
by Anthropic
Anthropic's most capable model in the Claude 4 generation, designed for the most demanding reasoning, analysis, and agentic tasks. Excels at complex multi-step problems requiring deep understanding and sustained coherence across long contexts.
Gemini 1.5 Flash
by Google
Google's lightweight and fast multimodal model optimized for high-volume, cost-sensitive workloads. Supports a 1 million token context window with natively multimodal capabilities across text, image, audio, and video at a fraction of Pro's cost.
Cohere Embed v3
by Cohere
Cohere's state-of-the-art embedding model supporting 100+ languages with native int8 and binary quantization for efficient storage. Produces high-quality vector representations optimized for search, classification, and clustering tasks.
Grok-3
by xAI
Grok-3 is xAI's frontier model, delivering state-of-the-art performance in math, science, and coding. Trained on the Colossus supercluster, it features DeepSearch for multi-step research and a 'Think' mode for extended chain-of-thought reasoning, enabling it to tackle complex, real-world problems with access to real-time information.
DeepSeek-Coder-V2
by DeepSeek
DeepSeek-Coder-V2 is a powerful open-source Mixture-of-Experts (MoE) model specialized in code. It supports 338 programming languages and features advanced fill-in-the-middle capabilities, offering performance comparable to top-tier proprietary models like GPT-4 Turbo at a significantly lower inference cost.
Claude 3.5 Haiku
by Anthropic
Anthropic's fastest, most affordable model in the 3.5 generation, offering performance comparable to Claude 3 Opus. It excels at coding, complex workflows, and agentic tasks due to its advanced tool-use capabilities and speed, making it ideal for high-throughput applications and enterprise automation.
Runway Gen-3 Alpha
by Runway
Runway Gen-3 Alpha is a professional-grade video generation model for high-fidelity, temporally consistent clips. It offers fine-grained control over motion, style, and camera behavior via text and image inputs, making it a key tool in professional film and advertising workflows that must meet commercial standards.
Qwen 2 72B
by Alibaba Cloud
Qwen2-72B is a 72-billion parameter large language model from Alibaba's Qwen2 series. It offers state-of-the-art performance, particularly in multilingual understanding, reasoning, and coding tasks. As an open-weight model, it provides a powerful alternative to proprietary systems for a wide range of applications.
Claude 3 Sonnet
by Anthropic
The balanced mid-tier model in the Claude 3 family, offering a strong combination of speed and intelligence. Provides enterprise-grade performance for coding, analysis, and content generation at moderate cost.
AlphaGo
by Google DeepMind
AlphaGo is a landmark AI from DeepMind that mastered the game of Go. It combines deep neural networks with Monte Carlo Tree Search and reinforcement learning, famously defeating world champion Lee Sedol in 2016. Its success demonstrated AI's ability to tackle complex problems requiring strategic planning.
Qwen 2.5 Coder 32B
by Alibaba Cloud
Qwen 2.5 Coder 32B is an open-weight, code-specialized large language model from Alibaba Cloud. Trained on a massive corpus covering 92 programming languages, it excels at code generation, completion, and debugging tasks, performing on par with or better than proprietary models like GPT-4o on several benchmarks.
Claude 3 Haiku
by Anthropic
Claude 3 Haiku is Anthropic's fastest, most compact model, excelling at near-instant responsiveness. It handles a wide range of tasks, including multimodal vision, with strong performance at a low cost, making it ideal for high-throughput applications like content moderation and customer service.
MusicGen
by Meta AI
MusicGen is an open-source text-to-music model from Meta AI, trained on 20K hours of licensed music, that generates high-quality instrumental music from text descriptions. It can also be conditioned on a melody reference, providing a strong, controllable baseline for both research and commercial applications.
Mixtral 8x22B
by Mistral AI
Mixtral 8x22B is a large-scale, open-source Mixture-of-Experts (MoE) model from Mistral AI. It has 141 billion total parameters but activates only 39 billion per token, balancing capacity with efficiency. The model excels at reasoning, code generation, and multilingual tasks, and includes native function calling capabilities.
Mistral Large
by Mistral AI
Mistral Large is Mistral AI's flagship proprietary model, offering top-tier reasoning and multilingual capabilities. It is designed to compete with other frontier models like GPT-4, excelling in complex tasks that require deep understanding. Its native function calling and fluency in over 30 languages make it highly versatile for enterprise-grade applications.
Code Llama 34B
by Meta
Code Llama 34B is a large language model from Meta, fine-tuned from Llama 2 for code-specific tasks. It excels at generating, completing, and explaining code across various languages. With variants supporting a 100K token context window, it can analyze and work with extensive codebases for complex tasks like refactoring.
Multilingual-E5-Large
by Microsoft Research
Multilingual-E5-Large is a powerful text embedding model from Microsoft supporting 100 languages. Trained on billions of text pairs using contrastive learning, it excels at cross-lingual information retrieval and semantic similarity, establishing a strong open-source baseline for multilingual NLP tasks.
Med-PaLM 2
by Google
Med-PaLM 2 is Google's large language model specialized for the medical domain. It achieves expert-level performance on medical licensing exams (USMLE) by leveraging advanced clinical reasoning and question-answering capabilities. The model is designed to generate accurate and helpful responses for healthcare professionals.
Qwen2.5-VL-72B
by Alibaba Cloud (Qwen Team)
Qwen2.5-VL-72B is Alibaba's flagship open vision-language model at 72 billion parameters, achieving top-tier performance on visual understanding benchmarks including chart analysis, document parsing, and fine-grained image understanding. It supports dynamic resolution image inputs and video understanding with native high-resolution processing.
GPT-4.5
by OpenAI
GPT-4.5 is a large language model from OpenAI, released as a research preview ahead of GPT-5. It focuses on scaling unsupervised learning to reduce hallucinations and enhance factual accuracy. The model is also designed for improved creative writing and greater emotional intelligence in its responses.
Phi-3.5-mini
by Microsoft
Phi-3.5-mini is a 3.8B parameter instruction-tuned model from Microsoft, optimized for edge and mobile devices. Despite its compact size, it delivers performance comparable to much larger models on benchmarks for reasoning, coding, and language tasks, making it highly efficient for on-device AI applications.
o1-mini
by OpenAI
A smaller, faster, and more affordable reasoning model optimized for STEM tasks. Delivers 80% of o1's reasoning capability at roughly 80% lower cost, making it ideal for high-volume coding and math workloads.
PaLM
by Google
PaLM (Pathways Language Model) is Google's 540 billion parameter language model trained using the Pathways system across 6,144 TPU v4 chips, demonstrating breakthrough capabilities on chain-of-thought reasoning, code generation, and multilingual tasks. It introduced the concept of 'discontinuous' capability jumps at scale and set new benchmarks on hundreds of NLP tasks upon release in 2022.
Ideogram 2
by Ideogram AI
Ideogram 2 is a text-to-image model renowned for its superior ability to render legible and accurate text within generated images. It excels at creating high-quality photorealistic and artistic visuals with strong prompt adherence, making it a powerful tool for design, branding, and creative projects.
Amazon Polly Neural
by Amazon Web Services
Amazon Polly is a cloud-based text-to-speech (TTS) service from AWS that produces highly natural-sounding human speech using neural engine technology. It supports over 30 languages with both standard and neural voices, offering deep integration with the AWS ecosystem for scalable production applications.
Claude Opus 4.5
by Anthropic
Claude Opus 4.5 is Anthropic's frontier AI model, delivering state-of-the-art performance in complex reasoning, creative tasks, and nuanced understanding. It features advanced multimodal vision capabilities for analyzing images and documents, along with extended thinking for multi-step, agentic tasks.
TTS-1
by OpenAI
OpenAI's TTS-1 is a text-to-speech model designed for real-time audio generation. It provides six distinct, natural-sounding preset voices and supports low-latency streaming, making it ideal for interactive applications. A higher-quality variant, tts-1-hd, is available for tasks where audio fidelity is prioritized over speed.
Command R+
by Cohere
Cohere's most capable RAG-optimized model, offering significantly enhanced reasoning, multi-step tool use, and superior grounded generation over Command R. Designed for complex enterprise workflows requiring high accuracy and citations.
Imagen 3
by Google DeepMind
Google DeepMind's highest-quality text-to-image generation model producing photorealistic images with improved detail, lighting, and fewer artifacts. Features enhanced prompt understanding and safety filtering.
Qwen 2.5 Max
by Alibaba Cloud
Alibaba Cloud's most capable proprietary model in the Qwen 2.5 family, optimized for complex reasoning and enterprise applications. Available exclusively through Alibaba Cloud's Model Studio API with enhanced safety and alignment.
AudioCraft
by Meta AI
AudioCraft is an open-source generative audio framework from Meta AI. It integrates MusicGen for music, AudioGen for sound effects, and the EnCodec codec into a single platform. This unified, modular design allows for text-to-audio generation and has become a key reference for audio LLM research.
LegalBERT
by Ilias Chalkidis et al. (Academic)
LegalBERT is a family of BERT models pre-trained on a diverse corpus of English legal texts, including legislation, court cases, and contracts. This specialized training allows it to significantly outperform general-purpose BERT models on downstream legal NLP tasks, establishing it as a foundational baseline for legal AI research and applications.
Gemma 2 9B
by Google DeepMind
Gemma 2 9B is a lightweight, state-of-the-art open model from Google, part of the next generation of the Gemma family. It offers strong performance for its size class, making it ideal for environments with limited computational resources. Built on a new architecture, it is optimized for on-device applications, research, and fine-tuning.
QwQ-32B
by Alibaba / Qwen Team
QwQ-32B is a 32 billion parameter language model from Alibaba, specifically optimized for complex reasoning tasks. It utilizes a deep chain-of-thought methodology to excel at mathematical, scientific, and logical problems, achieving performance comparable to much larger models and showcasing high parameter efficiency.
BLOOM
by BigScience Workshop
BLOOM is a 176 billion parameter, open-access multilingual language model developed by the BigScience research workshop. Trained on 46 natural languages and 13 programming languages, it provides powerful text and code generation capabilities, making it a key resource for researchers and developers building multilingual AI applications.
StarCoder2 15B
by BigCode (ServiceNow + Hugging Face)
StarCoder2 15B is a powerful open-source code generation model from the BigCode project. Trained on The Stack v2 dataset spanning over 600 programming languages, it excels at code completion, generation, and fill-in-the-middle tasks, emphasizing data transparency and author opt-out.
Phi-3 Mini
by Microsoft
Microsoft's Phi-3 Mini is a 3.8 billion parameter small language model (SLM) designed for high performance on resource-constrained devices. Despite its compact size, it exhibits strong reasoning and language understanding capabilities, making it suitable for on-device and edge AI applications. It is optimized for efficient inference.
Cohere Rerank v3
by Cohere
Cohere Rerank v3 is a state-of-the-art neural model designed to significantly boost the relevance of search results for Retrieval-Augmented Generation (RAG) systems. It re-scores a list of candidate documents from any keyword or vector search system, identifying the most pertinent information. It supports over 100 languages and can process long documents, making it highly versatile.
DeepSeek Coder 33B
by DeepSeek
DeepSeek Coder 33B is a dense, open-source large language model specializing in code-related tasks. Trained from scratch on a massive 2 trillion token dataset of code and natural language, it understands project-level context and supports 87 different programming languages for advanced code generation and completion.
Llama 3.2 11B Vision
by Meta
Llama 3.2 11B Vision is Meta's first open-source multimodal model, integrating native image understanding with advanced text generation. At a compact 11B parameters, it's designed for efficiency, enabling visual question answering, image captioning, and complex reasoning across text and images in a single, deployable model.
DeepSeek-V2
by DeepSeek
DeepSeek's mixture-of-experts model introducing Multi-head Latent Attention (MLA) for dramatically reduced inference cost. Activates 21B of its 236B total parameters per token while matching larger dense models.
Codestral
by Mistral AI
Codestral is Mistral AI's open-weight generative model explicitly designed for code generation tasks. Trained on a diverse dataset of over 80 programming languages, it excels at code completion, generation, and its unique fill-in-the-middle capability. It is optimized for low-latency performance in real-world applications.
Gemma 2 27B
by Google DeepMind
Gemma 2 27B is a powerful, mid-sized open-weights model from Google DeepMind. It delivers significant performance gains in reasoning, coding, and instruction following over smaller variants. Designed for server-side deployment, it provides a strong foundation for advanced research and custom fine-tuning projects.
Claude 4.5 Haiku
by Anthropic
Claude 4.5 Haiku is Anthropic's fastest and most compact model, engineered for near-instant responsiveness and high-throughput workloads. It provides enterprise-grade performance at a fraction of the cost, making it ideal for real-time interactions, content moderation, and cost-effective agentic tasks.
XTTS-v2
by Coqui AI
XTTS-v2 is an open-source, cross-lingual text-to-speech model from Coqui AI. It excels at high-quality voice cloning from just a few seconds of audio and supports 17 languages. With real-time streaming inference, it's ideal for applications needing custom voices and low-latency output.
BloombergGPT
by Bloomberg
BloombergGPT is a 50-billion parameter large language model developed by Bloomberg. It is specifically trained on a massive, curated corpus of financial data accumulated over decades, combined with general-purpose datasets. This specialized training allows it to excel at financial natural language processing tasks, outperforming similarly sized general models.
Grok-2
by xAI
Grok-2 is xAI's second-generation large language model, notable for its real-time knowledge access through the X platform. It possesses strong reasoning and multimodal capabilities, including vision understanding. The model is designed for a more natural, conversational interaction style with a lower tendency to refuse prompts.
BioGPT
by Microsoft Research
BioGPT is a domain-specific language model from Microsoft, pre-trained on a massive corpus of biomedical literature from PubMed. It excels at tasks like generating biomedical text, extracting relationships between entities, and answering questions based on medical research, achieving state-of-the-art results on several benchmarks.
Command R
by Cohere
Command R is a retrieval-optimized language model from Cohere, specifically designed for enterprise-grade Retrieval-Augmented Generation (RAG) and tool use. It excels in multilingual applications, supporting over 10 languages, and features built-in capabilities for grounding responses and generating citations to ensure accuracy.
Gemma 2B
by Google DeepMind
Gemma 2B is Google DeepMind's open-weight 2 billion parameter language model from the Gemma family, designed for lightweight deployment on devices with limited resources. It delivers strong performance for its size on language understanding and generation tasks, and serves as a foundation for fine-tuning on domain-specific tasks.
Pika 1.5
by Pika Labs
Pika 1.5 is an accessible AI video generation model that transforms text prompts or images into high-quality videos. It is known for its expressive motion, diverse cinematic styles, and unique features like physics-based effects and automated lip-sync, making it popular among creators and consumers.