AI for Code

GitHub Copilot + VS Code

by GitHub

GitHub Copilot integrates into VS Code as a first-party extension, delivering inline ghost-text completions, multi-line suggestions, and a dedicated Copilot Chat panel for conversational refactoring, test generation, and documentation. It leverages Codex and GPT-4 models under the hood, with workspace-aware context from open tabs and the current file.

ide, vscode, code-completion

55C+

Code Review

by AaaS

Analyzes code for bugs, security vulnerabilities, performance issues, and style violations. Provides actionable feedback with severity levels and suggested fixes aligned to language-specific best practices and project conventions.

coding, review, quality

53C+

benchmark, evaluation, software-engineering

SWE-bench Verified

by Princeton NLP

Human-validated subset of SWE-bench containing 500 problems verified by software engineers for correctness, clarity, and solvability. Provides a more reliable signal than the full SWE-bench by filtering out ambiguous or under-specified issues.

52C+

code, pretraining, permissive-license

The Stack v2

by BigCode

An expanded code pretraining dataset containing 3 trillion tokens of source code in 619 programming languages, curated by BigCode from GitHub repositories with permissive SPDX licenses. Version 2 triples the size of the original Stack and includes improved deduplication, opt-out mechanisms for authors, and structured data from GitHub issues and pull requests alongside raw source files.

52C+

MBPP (Mostly Basic Python Problems)

by Google

A dataset of 974 crowd-sourced Python programming problems suitable for entry-level programmers, each with a problem description, code solution, and three automated test cases. MBPP complements HumanEval by covering a broader variety of programming concepts and is widely used alongside it for comprehensive evaluation of code generation capabilities across model families.

code, evaluation, python

51C+
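The task shape described for MBPP (a description, a reference solution, and three assert-style test cases) can be sketched as below. This is a minimal illustration, not the official evaluation harness: the `Task` class, the field names, and the sample problem are assumptions made up for this example rather than a record from the dataset.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical record: description, reference solution, assert-style tests."""
    text: str
    code: str
    test_list: list


def passes_all_tests(candidate_code: str, task: Task) -> bool:
    """Return True if the candidate defines code that satisfies every test.

    NOTE: exec() on model-generated code is unsafe outside a sandbox;
    a real harness would isolate this step and add a timeout.
    """
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)   # define the candidate function
        for test in task.test_list:
            exec(test, namespace)         # each test is a bare `assert ...`
    except Exception:
        return False
    return True


# Made-up sample problem in the same three-test format.
example = Task(
    text="Write a function to return the smaller of two numbers.",
    code="def min_of_two(a, b):\n    return a if a < b else b",
    test_list=[
        "assert min_of_two(1, 2) == 1",
        "assert min_of_two(5, 3) == 3",
        "assert min_of_two(-4, -4) == -4",
    ],
)
```

A model's output would be passed as `candidate_code`; counting the tasks for which `passes_all_tests` returns True gives a simple pass rate.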

Cursor + OpenAI

by Anysphere

Cursor is a VS Code fork that uses OpenAI's GPT-4 and o-series models as its reasoning engine for multi-file edits, semantic codebase search, and an agent mode that can autonomously implement features across the entire repository. It offers a Composer panel for multi-file diffs and a codebase-aware chat that indexes the project with embeddings for precise retrieval.

ide, ai-editor, openai

51C+

benchmark, evaluation, coding

MBPP

by Google Research

Mostly Basic Programming Problems — a collection of 974 crowd-sourced Python programming tasks with natural language descriptions and test cases. Tests foundational programming ability including string manipulation, list processing, and basic algorithms.

50C+

REST AI API Template

by Community

Production-ready FastAPI template for AI-powered REST APIs, with pre-wired OpenAI/Anthropic client, async streaming endpoints, JWT authentication, rate limiting, structured logging, and OpenAPI docs. Includes Docker Compose stack with Redis rate-limit store and Prometheus metrics.

rest-api, fastapi, openai

50C+
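The rate limiting mentioned for the REST AI API Template can be illustrated with a sliding-window limiter. This sketch keeps its state in process memory as a stand-in for the Redis store the template describes; the class name, parameters, and injectable clock are assumptions for illustration and are not taken from the template itself.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock                    # injectable so tests can fake time
        self._hits = defaultdict(deque)       # client id -> request timestamps

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False                      # caller would answer HTTP 429
        hits.append(now)
        return True
```

In a multi-process deployment the per-client deque would have to live in a shared store, which is the role Redis plays in the stack described above.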

test-generation, automated-testing, unit-testing

Test Generation

by AaaS

Automates the creation of test suites by analyzing source code, function signatures, or specifications. It generates unit tests, integration tests, and edge case scenarios for popular frameworks, complete with necessary mocks and assertions. This accelerates development cycles and improves code reliability.

49C

StarCoderData

by BigCode

The roughly 783 GB (approximately 250 billion tokens) code dataset used to pretrain the StarCoder family of models, assembled by BigCode from The Stack v1 and spanning 86 programming languages with permissive licenses. It includes GitHub issues, Git commits, and Jupyter notebook data alongside source files, enabling models to learn from developer workflows and not just static code.

code, pretraining, github

49C

ai-code-assistant, code-completion, copilot

GitHub Copilot + JetBrains

by GitHub

The GitHub Copilot plugin for JetBrains IDEs integrates AI-powered code completion and a conversational chat panel directly into the editor. It provides inline, ghost-text suggestions and mirrors the functionality of the VS Code extension, adapting to JetBrains' native keymaps and user interface for a seamless experience across IDEs like IntelliJ IDEA and PyCharm.

49C

code-llm, open-weight, code-generation

Qwen 2.5 Coder 32B

by Alibaba Cloud

Qwen 2.5 Coder 32B is an open-weight, code-specialized large language model from Alibaba Cloud. Trained on a large corpus covering 92 programming languages, it targets code generation, completion, and debugging tasks, with reported performance on par with or exceeding proprietary models like GPT-4o on several benchmarks.

48C

benchmark, evaluation, coding

HumanEval+

by BigCode

HumanEval+ is a benchmark for rigorously evaluating code generation models. It augments the original HumanEval dataset by expanding the test suite for each of its 164 problems by 80x. This extensive testing helps uncover subtle bugs and failures on edge cases that simpler benchmarks miss, providing a more accurate measure of a model's true coding ability.

47C

code-generation, open-source, moe

DeepSeek-Coder-V2

by DeepSeek

DeepSeek-Coder-V2 is a powerful open-source Mixture-of-Experts (MoE) model specialized in code. It supports 338 programming languages and features advanced fill-in-the-middle capabilities, offering performance comparable to top-tier proprietary models like GPT-4 Turbo at a significantly lower inference cost.

47C

code, multilingual-code, github

GitHub Code Dataset

by Hugging Face / BigCode

The GitHub Code Dataset is a massive, multilingual collection of source code from public GitHub repositories, spanning 32 programming languages. Distributed via Hugging Face under the BigCode project, it provides a foundational resource for pretraining large language models on diverse code-related tasks, from generation to analysis.

46C

websocket, streaming, real-time

WebSocket Streaming API

by Community

WebSocket server that proxies token-by-token LLM streaming to multiple simultaneous clients, with connection lifecycle management, heartbeat keep-alives, and per-session context persistence. Supports fan-out broadcasting for collaborative AI sessions and reconnection with message replay.

45C
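The fan-out broadcasting and reconnection-with-replay behavior described for the WebSocket Streaming API can be sketched with per-subscriber queues and a bounded history. This is only an outline of the pattern under stated assumptions: the `Broadcaster` class and its method names are hypothetical, and the actual WebSocket transport, heartbeats, and session persistence are left out.

```python
import asyncio
from collections import deque


class Broadcaster:
    """Fan out streamed tokens to every subscriber; replay recent ones on reconnect."""

    def __init__(self, replay_size: int = 100):
        self._history = deque(maxlen=replay_size)   # (sequence number, token)
        self._subscribers = set()
        self._next_seq = 0

    def subscribe(self, last_seen: int = -1) -> asyncio.Queue:
        """Register a client; pre-fill its queue with tokens it has not seen."""
        queue = asyncio.Queue()
        for seq, token in self._history:
            if seq > last_seen:
                queue.put_nowait((seq, token))
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, token: str) -> int:
        """Record one token and deliver it to every connected subscriber."""
        seq = self._next_seq
        self._next_seq += 1
        self._history.append((seq, token))
        for queue in self._subscribers:
            queue.put_nowait((seq, token))
        return seq
```

In a server, each connection would own one queue and a task that awaits `queue.get()` and writes to the socket; a reconnecting client would pass the last sequence number it received as `last_seen`.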

Windsurf + Anthropic

by Codeium

Windsurf is an AI-native IDE that integrates Anthropic's Claude models as the backbone of its Cascade agent, which autonomously plans and executes multi-step coding tasks with real-time file and terminal access. The Anthropic integration powers deep context awareness across large codebases and supports long-horizon agent tasks with coherent state tracking.

ide, ai-editor, anthropic

45C

Tabnine + VS Code

by Tabnine

Tabnine's VS Code extension provides AI-powered code completions, including whole-line and full-function suggestions. It is designed for enterprises with strict privacy and data-residency needs, offering on-premise or private cloud deployment options. The AI can be trained on a team's specific codebase for highly relevant completions.

ide, vscode, code-completion

45C

sql, database, query-generation

SQL Generation

by AaaS

Converts natural language questions into executable SQL queries against relational databases. Supports schema-aware generation, multi-table joins, aggregations, and query optimization with dialect-specific syntax for PostgreSQL, MySQL, SQLite, and others.
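One way to picture the schema-aware part of SQL generation is shown below, using SQLite from the Python standard library. It is a rough sketch and not how this service is implemented: the helper names are hypothetical, the schema text would be handed to whatever model writes the query, and `EXPLAIN` is used only to check that a candidate query compiles against the schema without running it.

```python
import sqlite3


def schema_prompt(conn: sqlite3.Connection) -> str:
    """Collect CREATE TABLE statements to give a model as schema context."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def is_valid_query(conn: sqlite3.Connection, sql: str) -> bool:
    """Check that a generated query compiles against the schema (not executed)."""
    try:
        conn.execute(f"EXPLAIN {sql}")
    except (sqlite3.Error, sqlite3.Warning):
        return False
    return True


# Toy schema, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
```

A query that references a missing column or table fails the check, which gives a generation loop a cheap signal to retry before anything touches real data.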

documentation, docs, api-docs

Documentation Generation

by AaaS

Generates technical documentation from source code, including API references, README files, inline comments, and architectural guides. Adapts tone and detail level for different audiences from developer guides to end-user documentation.

coding-agent, code-completion, privacy-focused

Tabnine

by Tabnine

Privacy-first AI code assistant that can run entirely on-premise or in air-gapped environments. Specializes in personalized code completions trained on your team's codebase with zero data retention and full IP protection.

coding-agent, research, open-source

SWE-agent

by Princeton NLP

Princeton NLP's research agent that turns LLMs into autonomous software engineers. Achieves state-of-the-art results on SWE-bench by providing an agent-computer interface optimized for code navigation and editing.

coding-agent, open-source, sandboxed

OpenHands

by All Hands AI

OpenHands is an open-source platform for creating autonomous AI software agents. It offers a secure, sandboxed environment where agents can execute complex development tasks by writing code, running commands, browsing the web, and interacting with APIs. It supports multi-agent delegation for tackling intricate problems.

sentiment-analysis, dashboard, brand-monitoring

Sentiment Dashboard

by Community

Ingests social media feeds, reviews, and support tickets in near-real-time, scores sentiment at entity and aspect level using a fine-tuned RoBERTa model, and renders a live Streamlit dashboard with trend charts, topic clustering, and configurable alert thresholds for brand-crisis detection.

code-llm, open-source, code-generation

StarCoder2 15B

by BigCode (ServiceNow + Hugging Face)

StarCoder2 15B is a powerful open-source code generation model from the BigCode project. Trained on The Stack v2 dataset spanning over 600 programming languages, it excels at code completion, generation, and fill-in-the-middle tasks, emphasizing data transparency and author opt-out.

coding-agent, ai-ide, agentic-workflow

Windsurf

by Codeium

Windsurf is an AI-powered IDE from Codeium built around Cascade, a deep agentic workflow engine. It maintains context across complex, multi-step coding tasks, allowing it to function autonomously. The platform combines the features of a coding copilot with the power of a fully agentic system in a single editor.

43C

code-generation, open-source, dense-model

DeepSeek Coder 33B

by DeepSeek

DeepSeek Coder 33B is a dense, open-source large language model specializing in code-related tasks. Trained from scratch on a massive 2 trillion token dataset of code and natural language, it understands project-level context and supports 87 different programming languages for advanced code generation and completion.

43C

temporal-features, time-series, rolling-windows

Temporal Feature Builder

by Community

Generates comprehensive temporal features from time-series data including rolling statistics, lag features, Fourier transforms, and calendar encodings using tsfresh and custom transformers. Handles irregular time series with forward-fill interpolation and produces a point-in-time-correct feature matrix to prevent leakage.

42C
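The lag features, rolling statistics, and point-in-time correctness described for the Temporal Feature Builder can be illustrated in plain Python. This sketch does not use tsfresh or the project's own transformers; the function name, column names, and defaults are assumptions chosen for the example. The key property is that the row for step `t` reads only observations strictly before `t`, so the current value never leaks into its own features.

```python
from statistics import mean


def build_features(values, lags=(1, 2), window=3):
    """Return one feature row per time step, using only earlier observations."""
    rows = []
    for t in range(len(values)):
        past = values[:t]                     # strictly earlier observations
        row = {"t": t}
        for lag in lags:
            # Value observed `lag` steps ago, or None if not yet available.
            row[f"lag_{lag}"] = values[t - lag] if t - lag >= 0 else None
        recent = past[-window:]
        # Rolling mean over the last `window` past values; None until full.
        row[f"rolling_mean_{window}"] = (
            mean(recent) if len(recent) == window else None
        )
        rows.append(row)
    return rows


features = build_features([10, 20, 30, 40, 50])
```

Early rows carry `None` where not enough history exists; whether to drop or impute those rows is a modeling choice this sketch leaves open.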

benchmark, evaluation, machine-learning

MLE-bench

by OpenAI

Benchmark evaluating AI agents on real Kaggle machine learning competitions. Tests the full ML engineering pipeline including data exploration, feature engineering, model selection, training, and submission formatting against actual competition leaderboards.

41C

benchmark, evaluation, competitive-programming

Codeforces Benchmark

by Codeforces / Community

Evaluates models on competitive programming problems from the Codeforces platform across difficulty ratings. Tests algorithmic thinking, data structure knowledge, and the ability to produce correct and efficient solutions under competitive constraints.

41C

coding-agent, github-bot, automated-pr

Sweep AI

by Sweep AI

AI-powered GitHub bot that transforms issues into pull requests by reading the codebase, planning changes, and writing code. Acts as a junior developer handling routine tasks like bug fixes and small features.

39D

translation, migration, cross-language

Code Translation

by AaaS

Translates source code between programming languages while preserving logic, idioms, and patterns. Handles framework-specific migrations, API mappings, and ecosystem-specific conventions for accurate cross-language porting.

38D

supply-chain, optimization, or-tools

Supply Chain Optimizer

by Community

Combines ML demand forecasting (Prophet + LightGBM) with constraint-based optimization (Google OR-Tools) to minimize inventory costs while meeting service-level targets across a multi-echelon supply chain. Outputs replenishment orders, safety stock recommendations, and a scenario simulation dashboard.

37D

code-generation, open-source, instruction-tuned

WizardCoder 33B

by WizardLM Team

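WizardLM's code-focused model, fine-tuned from DeepSeek Coder 33B using the Evol-Instruct methodology for complex code generation (the earlier 15B WizardCoder was the one built on StarCoder). It performs strongly on HumanEval; Evol-Instruct works by iteratively rewriting training instructions into more complex variants.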

30D

incident management, AIOps, root cause analysis

CausaLayer

by CausaLayer

An AI-driven platform for incident management and root cause analysis in complex systems. It uses machine learning to identify the true cause of incidents faster, reducing downtime.

18F

api-development, api-testing, api-design

Apidog

by Apidog, Inc.

An all-in-one platform for API development, integrating design, debugging, testing, and mocking functionalities with AI assistance. It streamlines the entire API lifecycle, from conceptualization to deployment and maintenance.

18F

unit testing, test generation, code quality

Symflower

by Symflower GmbH

An AI-powered tool that automatically generates unit tests for your Go, Java, and PHP code. It helps ensure code quality and reduces the manual effort of writing comprehensive tests.

16F

Harness (AI features)

by Harness Inc.

Harness is a CI/CD and software delivery platform that incorporates AI features for intelligent deployments, anomaly detection, and cost optimization. Its AI capabilities help automate and improve the reliability and efficiency of software releases.

ci-cd, devops, aiops