AI for Code
Code generation, review, debugging, and developer tools
40 entities indexed
GitHub Copilot + VS Code
by GitHub
GitHub Copilot integrates into VS Code as a first-party extension, delivering inline ghost-text completions, multi-line suggestions, and a dedicated Copilot Chat panel for conversational refactoring, test generation, and documentation. It leverages Codex and GPT-4 models under the hood, with workspace-aware context from open tabs and the current file.
HumanEval Dataset
by OpenAI
A curated set of 164 handwritten Python programming problems released by OpenAI, each consisting of a function signature, docstring, reference solution, and unit tests. The accompanying paper popularized the pass@k metric for functional correctness, together with an unbiased estimator for it, and HumanEval has become a de facto standard benchmark reported in most code generation model papers.
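The pass@k metric mentioned above can be computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the unit tests. The sketch below is a minimal illustration; the function name and argument checks are assumptions made for this example, not part of the dataset's tooling.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    size-k subset of the n samples contains at least one passing sample.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset has a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark score is then the mean of this value over all problems. With k = 1 the estimate reduces to the plain pass rate c / n.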
Code Review
by AaaS
Analyzes code for bugs, security vulnerabilities, performance issues, and style violations. Provides actionable feedback with severity levels and suggested fixes aligned to language-specific best practices and project conventions.
SWE-bench Verified
by Princeton NLP
Human-validated subset of SWE-bench containing 500 problems verified by software engineers for correctness, clarity, and solvability. Provides a more reliable signal than the full SWE-bench by filtering out ambiguous or under-specified issues.
The Stack v2
by BigCode
An expanded code pretraining dataset containing 3 trillion tokens of source code in 619 programming languages, curated by BigCode from public repositories archived by Software Heritage and filtered to permissive SPDX licenses. Version 2 is substantially larger than the original Stack and includes improved deduplication, opt-out mechanisms for authors, and structured data from GitHub issues and pull requests alongside raw source files.
Cursor + OpenAI
by Anysphere
Cursor is a VS Code fork that uses OpenAI's GPT-4 and o-series models as its reasoning engine for multi-file edits, semantic codebase search, and an agent mode that can autonomously implement features across the entire repository. It offers a Composer panel for multi-file diffs and a codebase-aware chat that indexes the project with embeddings for precise retrieval.
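Embedding-based codebase retrieval of the kind described above generally ranks indexed chunks by vector similarity to the query. The sketch below shows that ranking step with cosine similarity over a toy in-memory index. The file paths and three-dimensional vectors are made up for illustration, and nothing here reflects Cursor's actual implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k index keys whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda key: cosine(query_vec, index[key]), reverse=True)
    return ranked[:k]


# Hypothetical index: file path -> embedding (real embeddings have far more dimensions).
TOY_INDEX = {
    "auth/login.py": [0.9, 0.1, 0.0],
    "db/models.py": [0.1, 0.9, 0.1],
    "auth/tokens.py": [0.8, 0.2, 0.1],
}
```

Calling `top_k([1.0, 0.0, 0.0], TOY_INDEX)` ranks the two `auth/` files ahead of `db/models.py`. A production index would typically replace the linear scan with an approximate nearest-neighbor structure.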
MBPP (Mostly Basic Python Problems)
by Google
A dataset of 974 crowd-sourced Python programming problems suitable for entry-level programmers, each with a problem description, code solution, and three automated test cases. MBPP complements HumanEval by covering a broader variety of programming concepts and is widely used alongside it for comprehensive evaluation of code generation capabilities across model families.
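Scoring a model on a task of this shape amounts to running the generated solution against the task's assert-style test cases. The sketch below illustrates that check. The `add` task and its three tests are invented for this example and are not taken from the dataset, and `exec` on untrusted model output should only ever run inside a sandbox.

```python
def passes_all_tests(candidate_src: str, tests: list[str]) -> bool:
    """Run a candidate solution, then each assert-style test, in one namespace.

    Returns False if the candidate fails to compile, raises, or trips an assert.
    NOTE: exec() is unsafe for untrusted code; real harnesses isolate this step.
    """
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)
        for test in tests:
            exec(test, namespace)
    except Exception:
        return False
    return True


# Hypothetical task in the MBPP style: one solution and three test cases.
EXAMPLE_SOLUTION = "def add(a, b):\n    return a + b\n"
EXAMPLE_TESTS = [
    "assert add(1, 2) == 3",
    "assert add(-1, 1) == 0",
    "assert add(0, 0) == 0",
]
```

A real harness would also enforce a timeout per test, since generated code can loop forever.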
REST AI API Template
by Community
Production-ready FastAPI template for AI-powered REST APIs, with pre-wired OpenAI/Anthropic client, async streaming endpoints, JWT authentication, rate limiting, structured logging, and OpenAPI docs. Includes Docker Compose stack with Redis rate-limit store and Prometheus metrics.
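The listing does not say which rate-limiting algorithm the template uses. A token bucket is one common choice, sketched below with in-memory state standing in for the Redis store. The class name, parameters, and injectable clock are assumptions made for this illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock  # injectable so the bucket can be tested without sleeping
        self._tokens = float(capacity)
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

In a multi-process deployment the token count and timestamp would live in a shared store and be updated atomically, which is the role a Redis-backed limiter plays.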
MBPP
by Google Research
Mostly Basic Programming Problems: a collection of 974 crowd-sourced Python programming tasks, each with a natural language description and test cases. It tests foundational programming ability, including string manipulation, list processing, and basic algorithms.
Test Generation
by AaaS
Automates the creation of test suites by analyzing source code, function signatures, or specifications. It generates unit tests, integration tests, and edge case scenarios for popular frameworks, complete with necessary mocks and assertions. This accelerates development cycles and improves code reliability.
GitHub Copilot + JetBrains
by GitHub
The GitHub Copilot plugin for JetBrains IDEs integrates AI-powered code completion and a conversational chat panel directly into the editor. It provides inline, ghost-text suggestions and mirrors the functionality of the VS Code extension, adapting to JetBrains' native keymaps and user interface for a seamless experience across IDEs like IntelliJ IDEA and PyCharm.
StarCoderData
by BigCode
The code dataset used to pretrain the StarCoder family of models: roughly 783 GB of data (about 250 billion tokens) assembled by BigCode from The Stack v1 and spanning 86 programming languages with permissive licenses. It includes GitHub issues, Git commits, and Jupyter notebook data alongside source files, enabling models to learn from developer workflows and not just static code.
Qwen 2.5 Coder 32B
by Alibaba Cloud
Qwen 2.5 Coder 32B is an open-weight, code-specialized large language model from Alibaba Cloud. Trained on a large corpus covering 92 programming languages, it targets code generation, completion, and debugging tasks, with reported results on par with or exceeding proprietary models like GPT-4o on several code benchmarks.
DeepSeek-Coder-V2
by DeepSeek
DeepSeek-Coder-V2 is a powerful open-source Mixture-of-Experts (MoE) model specialized in code. It supports 338 programming languages and features advanced fill-in-the-middle capabilities, offering performance comparable to top-tier proprietary models like GPT-4 Turbo at a significantly lower inference cost.
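Fill-in-the-middle works by splitting the file at the cursor and asking the model to generate the missing span between a prefix and a suffix. The sketch below shows the prefix-suffix-middle prompt layout. The sentinel strings are placeholders invented for this example, since each model family defines its own special tokens.

```python
# Placeholder sentinels: real models use their own tokenizer-specific tokens.
FIM_PREFIX = "<PRE>"
FIM_SUFFIX = "<SUF>"
FIM_MIDDLE = "<MID>"


def build_fim_prompt(source: str, cursor: int) -> str:
    """Lay out text before and after the cursor in prefix-suffix-middle order."""
    prefix, suffix = source[:cursor], source[cursor:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def splice_completion(source: str, cursor: int, completion: str) -> str:
    """Insert the model's generated middle span back at the cursor position."""
    return source[:cursor] + completion + source[cursor:]
```

The model is expected to continue the prompt after the final sentinel, and the editor then splices that continuation into the buffer with `splice_completion`.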
HumanEval+
by EvalPlus
HumanEval+ is a benchmark for rigorously evaluating code generation models. It augments the original HumanEval dataset by expanding the test suite for each of its 164 problems by 80x. This extensive testing helps uncover subtle bugs and failures on edge cases that simpler benchmarks miss, providing a more accurate measure of a model's true coding ability.
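One way to expand a test suite in this spirit is differential testing: mutate a few seed inputs and compare the candidate's output with a trusted reference solution. The sketch below illustrates the idea on a toy problem. The functions, the mutation strategy, and the seed values are all assumptions for illustration and do not reproduce the benchmark's actual pipeline.

```python
import random
from typing import Callable, Optional


def reference_abs_max(xs: list[int]) -> int:
    """Trusted solution: the largest absolute value in a non-empty list."""
    return max(abs(x) for x in xs)


def buggy_abs_max(xs: list[int]) -> int:
    """Plausible but wrong: fails when the largest magnitude is negative."""
    return abs(max(xs))


def mutate(xs: list[int], rng: random.Random) -> list[int]:
    """Replace one element with a random value and sometimes append another."""
    ys = list(xs)
    ys[rng.randrange(len(ys))] = rng.randint(-100, 100)
    if rng.random() < 0.5:
        ys.append(rng.randint(-100, 100))
    return ys


def find_counterexample(candidate: Callable, reference: Callable,
                        seeds: list[list[int]], rounds: int = 200,
                        seed: int = 0) -> Optional[list[int]]:
    """Return the first mutated input where candidate and reference disagree."""
    rng = random.Random(seed)
    pool = [list(s) for s in seeds]
    for _ in range(rounds):
        case = mutate(rng.choice(pool), rng)
        pool.append(case)  # mutated inputs become seeds for later rounds
        if candidate(case) != reference(case):
            return case
    return None
```

Here `buggy_abs_max` agrees with the reference on an all-positive seed such as `[1, 2, 3]`, so a small handwritten suite can miss the bug that mutated inputs containing negative values expose.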
GitHub Code Dataset
by Hugging Face / BigCode
The GitHub Code Dataset is a massive, multilingual collection of source code from public GitHub repositories, spanning 32 programming languages. Distributed via Hugging Face under the BigCode project, it provides a foundational resource for pretraining large language models on diverse code-related tasks, from generation to analysis.
WebSocket Streaming API
by Community
WebSocket server that proxies token-by-token LLM streaming to multiple simultaneous clients, with connection lifecycle management, heartbeat keep-alives, and per-session context persistence. Supports fan-out broadcasting for collaborative AI sessions and reconnection with message replay.
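The fan-out and replay behavior described above can be sketched without any network code: each client gets its own queue, and a bounded history lets a reconnecting client catch up from the last message it saw. The class and method names below are hypothetical, and a real server would attach each queue to a WebSocket connection.

```python
import asyncio
from collections import deque


class Broadcaster:
    """Fan out tokens to per-client queues and keep a bounded replay buffer."""

    def __init__(self, replay_size: int = 100) -> None:
        self._subscribers: set = set()
        self._history: deque = deque(maxlen=replay_size)  # (msg_id, token) pairs
        self._next_id = 0

    def subscribe(self, last_seen_id: int = -1) -> asyncio.Queue:
        """Register a client; replay buffered messages newer than last_seen_id."""
        queue: asyncio.Queue = asyncio.Queue()
        for msg_id, token in self._history:
            if msg_id > last_seen_id:
                queue.put_nowait((msg_id, token))
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, token: str) -> int:
        """Record a token and deliver it to every connected client."""
        msg_id = self._next_id
        self._next_id += 1
        self._history.append((msg_id, token))
        for queue in self._subscribers:
            queue.put_nowait((msg_id, token))
        return msg_id


async def demo() -> tuple:
    hub = Broadcaster(replay_size=2)
    early = hub.subscribe()
    for token in ("Hel", "lo", "!"):
        hub.publish(token)
    late = hub.subscribe(last_seen_id=0)  # reconnecting client that saw message 0
    return ([early.get_nowait() for _ in range(3)],
            [late.get_nowait() for _ in range(2)])
```

Because the replay buffer is bounded, a client that has been away for longer than `replay_size` messages cannot fully catch up and would need a fresh session.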
Windsurf + Anthropic
by Codeium
Windsurf is an AI-native IDE from Codeium that integrates Anthropic's Claude models as the backbone of its Cascade agent, which autonomously plans and executes multi-step coding tasks with real-time file and terminal access. The Anthropic integration supports context awareness across large codebases and long-horizon agent tasks with coherent state tracking.
Tabnine + VS Code
by Tabnine
Tabnine's VS Code extension provides AI-powered code completions, including whole-line and full-function suggestions. It is designed for enterprises with strict privacy and data-residency needs, offering on-premise or private cloud deployment options. The AI can be trained on a team's specific codebase for highly relevant completions.
Sentiment Dashboard
by Community
Ingests social media feeds, reviews, and support tickets in near-real-time, scores sentiment at entity and aspect level using a fine-tuned RoBERTa model, and renders a live Streamlit dashboard with trend charts, topic clustering, and configurable alert thresholds for brand-crisis detection.
Tabnine
by Tabnine
Privacy-first AI code assistant that can run entirely on-premise or in air-gapped environments. Specializes in personalized code completions trained on your team's codebase with zero data retention and full IP protection.
SWE-agent
by Princeton NLP
Princeton NLP's research agent that lets LLMs act as autonomous software engineers. It provides an agent-computer interface optimized for code navigation and editing, and reported state-of-the-art results on SWE-bench at the time of its release.
OpenHands
by All Hands AI
OpenHands is an open-source platform for creating autonomous AI software agents. It offers a secure, sandboxed environment where agents can execute complex development tasks by writing code, running commands, browsing the web, and interacting with APIs. It supports multi-agent delegation for tackling intricate problems.
SQL Generation
by AaaS
Converts natural language questions into executable SQL queries against relational databases. Supports schema-aware generation, multi-table joins, aggregations, and query optimization with dialect-specific syntax for PostgreSQL, MySQL, SQLite, and others.
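Schema-aware generation is usually paired with a validation step that rejects queries referencing tables or columns that do not exist. The sketch below shows one such check using Python's built-in `sqlite3` module: the query is compiled with `EXPLAIN QUERY PLAN` against an empty in-memory copy of the schema, so nothing is executed against real data. The schema and function name are invented for this example.

```python
import sqlite3
from typing import Optional

# Hypothetical schema used only for this illustration.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
"""


def validate_sql(schema: str, query: str) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN QUERY PLAN compiles the statement without running it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()
```

The returned error text can be fed back to the model as a repair hint. This check covers only SQLite syntax, so other dialects would need their own validation against the target database.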
Documentation Generation
by AaaS
Generates technical documentation from source code, including API references, README files, inline comments, and architectural guides. Adapts tone and detail level for different audiences from developer guides to end-user documentation.
StarCoder2 15B
by BigCode (ServiceNow + Hugging Face)
StarCoder2 15B is a powerful open-source code generation model from the BigCode project. Trained on The Stack v2 dataset spanning over 600 programming languages, it excels at code completion, generation, and fill-in-the-middle tasks, emphasizing data transparency and author opt-out.
Windsurf
by Codeium
Windsurf is an AI-powered IDE from Codeium built around Cascade, a deep agentic workflow engine. It maintains context across complex, multi-step coding tasks, allowing it to function autonomously. The platform combines the features of a coding copilot with the power of a fully agentic system in a single editor.
DeepSeek Coder 33B
by DeepSeek
DeepSeek Coder 33B is a dense, open-source large language model specializing in code-related tasks. Trained from scratch on a massive 2 trillion token dataset of code and natural language, it understands project-level context and supports 87 different programming languages for advanced code generation and completion.
Temporal Feature Builder
by Community
Generates comprehensive temporal features from time-series data including rolling statistics, lag features, Fourier transforms, and calendar encodings using tsfresh and custom transformers. Handles irregular time series with forward-fill interpolation and produces a point-in-time-correct feature matrix to prevent leakage.
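Point-in-time correctness means each feature at time t is computed only from observations strictly before t. The sketch below shows forward-fill, lag, and rolling-mean features built under that rule. It uses plain Python lists as a stand-in for tsfresh and custom transformers, and the function names are assumptions for illustration.

```python
from typing import Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> list:
    """Carry the last observed value forward; leading gaps stay None."""
    out, last = [], None
    for value in values:
        if value is not None:
            last = value
        out.append(last)
    return out


def lag(values: Sequence[float], k: int) -> list:
    """Shift the series back by k steps so row t holds the value from t - k."""
    return ([None] * k + list(values))[:len(values)]


def rolling_mean_past(values: Sequence[float], window: int) -> list:
    """Mean of the `window` values strictly before each position (no leakage)."""
    out = []
    for i in range(len(values)):
        past = values[max(0, i - window):i]  # excludes the current value
        out.append(sum(past) / window if len(past) == window else None)
    return out
```

For the series `[10, 20, 30, 40]` with `window=2`, the rolling feature at the third position is 15.0, the mean of the two earlier values, and never includes the value being predicted.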
MLE-bench
by OpenAI
Benchmark evaluating AI agents on real Kaggle machine learning competitions. Tests the full ML engineering pipeline including data exploration, feature engineering, model selection, training, and submission formatting against actual competition leaderboards.
Codeforces Benchmark
by Codeforces / Community
Evaluates models on competitive programming problems from the Codeforces platform across difficulty ratings. Tests algorithmic thinking, data structure knowledge, and the ability to produce correct and efficient solutions under competitive constraints.
Sweep AI
by Sweep AI
AI-powered GitHub bot that transforms issues into pull requests by reading the codebase, planning changes, and writing code. Acts as a junior developer handling routine tasks like bug fixes and small features.
Code Translation
by AaaS
Translates source code between programming languages while preserving logic, idioms, and patterns. Handles framework-specific migrations, API mappings, and ecosystem-specific conventions for accurate cross-language porting.
Supply Chain Optimizer
by Community
Combines ML demand forecasting (Prophet + LightGBM) with constraint-based optimization (Google OR-Tools) to minimize inventory costs while meeting service-level targets across a multi-echelon supply chain. Outputs replenishment orders, safety stock recommendations, and a scenario simulation dashboard.
WizardCoder 33B
by WizardLM Team
WizardLM's code-focused model, fine-tuned with the Evol-Instruct methodology for complex code generation. The 33B release builds on DeepSeek Coder 33B, while the earlier 15B WizardCoder was fine-tuned from StarCoder. It achieves strong performance on HumanEval by evolving instruction complexity during training.
CausaLayer
by CausaLayer
An AI-driven platform for incident management and root cause analysis in complex systems. It uses machine learning to identify the true cause of incidents faster, reducing downtime.
Apidog
by Apidog, Inc.
An all-in-one platform for API development, integrating design, debugging, testing, and mocking functionalities with AI assistance. It streamlines the entire API lifecycle, from conceptualization to deployment and maintenance.
Symflower
by Symflower GmbH
An AI-powered tool that automatically generates unit tests for your Go, Java, and PHP code. It helps ensure code quality and reduces the manual effort of writing comprehensive tests.
Harness (AI features)
by Harness Inc.
Harness is a CI/CD and software delivery platform that incorporates AI features for intelligent deployments, anomaly detection, and cost optimization. Its AI capabilities help automate and improve the reliability and efficiency of software releases.
WhatTheDiff
by WhatTheDiff
WhatTheDiff uses AI to automatically generate clear and concise summaries for your pull requests. It helps teams understand changes faster, making code reviews more efficient and less time-consuming.