Architecture // Storage Engine
Distributed SQL, shared-nothing, peer-to-peer architecture. All nodes are symmetrical; any node can handle reads and writes. The cluster uses distributed consensus, so every node can access data anywhere in the cluster regardless of where it lives // Built on Pebble, a highly tuned, Go-based, LSM-tree key-value storage engine inspired by RocksDB and developed by Cockroach Labs specifically for distributed SQL
Disaggregated multi-component: TiDB (stateless SQL compute), TiKV (distributed row storage via Raft/RocksDB), PD (Placement Driver for metadata and global timestamp allocation), and TiFlash (columnar engine for analytics). Not peer-to-peer: each component has a distinct role. PD is a centralized timestamp source in the critical path of all distributed transactions // Uses RocksDB for transactional data with Raft replication. TiFlash uses proprietary Delta-tree columnar engine for analytical reads. Two separate storage engines for HTAP: TiFlash data is asynchronously replicated from TiKV as a Raft Learner, meaning analytical reads may lag behind transactional writes
Ideal Workloads
SYSTEM OF RECORD. Optimized for transactional workloads that require strong consistency and global distribution, such as AI innovators, cybersecurity, eCommerce & retail, financial services, fintech/payments, gaming, quant/trading & research, and online travel
MYSQL MIGRATIONS/SCALE-OUT. MySQL-compatible HTAP workloads combining OLTP and real-time analytics for fintech, e-commerce, gaming, and SaaS applications. Best for teams migrating from MySQL that need distributed scale-out with an optional analytics path via TiFlash, and can accept Snapshot Isolation as the default consistency model
Auto-Sharding (Dynamic Re-Sharding Online)
NATIVE & AUTOMATIC. Automatically shards data into ranges and dynamically splits, merges, and rebalances online across nodes based on load and size. Zero downtime, fully transparent
NATIVE & AUTOMATIC. Range-based region splitting and rebalancing via PD. PD schedules region splits, merges, and load balancing across TiKV nodes transparently. Requires PD health and availability for scheduling decisions; PD is in the critical path of all rebalancing operations
Automatic Geo-Partitioning (Multi-Region Data Affinity)
NATIVE AND AUTOMATIC. Declarative SQL schema adjustments automatically route, partition, and anchor data close to the user's location, automatically moving data to the region where it is most frequently accessed. Supports geo-partitioning with zone configurations for data locality, compliance, and low latency
NATIVE. Placement Rules allow table-level and partition-level data placement to specific regions or availability zones. Row-level placement control requires table partitioning first, then Placement Rules on partitions. Less declarative than CockroachDB's REGIONAL BY ROW; no automatic affinity routing based on user proximity; requires explicit rule definitions and partition schema design
Availability including Multi-Cloud and Hybrid
Available on all public clouds (e.g., AWS, Google Cloud, Azure); can run a single logical cluster spanning multiple clouds. Also runs on-premises and in hybrid cloud + on-premises deployments
Available on AWS, GCP, and Azure via TiDB Cloud (managed service). TiDB Cloud is single-cloud per cluster; multi-cloud configurations require self-managed deployments. Self-hosted deployment on any cloud, on-premises, or Kubernetes via the open-source release. Hybrid configurations supported
Change Data Capture (CDC)
NATIVE. CHANGEFEED command enables scalable, resilient streaming of data changes to Kafka, cloud storage, and webhooks; no third-party CDC tool needed. CDC Queries enable SQL-based filtering and transformation of streams
NATIVE. TiCDC is TiDB's native CDC component, streaming row-level change events to Kafka, MySQL-compatible targets, and cloud storage. Deployed as a separate TiCDC service alongside the cluster, which adds an additional component to monitor, keep version-compatible, and manage independently from the main TiDB cluster upgrade lifecycle
Cluster Sizes // Scale
SINGLE BINARY, ALL ROLES ON EVERY NODE. No separate compute, storage, or metadata layers. Minimum 3 nodes for HA (one per availability zone), scales linearly to hundreds or thousands by simply adding nodes. Development and testing run on a single node. Minimum production configuration on CockroachDB Advanced: 3 nodes × 4 vCPUs = 12 vCPUs // Virtually unlimited horizontal scale-out simply by adding more nodes
Minimum HA production cluster requires separate TiDB server nodes, TiKV storage nodes (minimum 3 across 3 AZs), and PD metadata nodes (minimum 3); typically 8+ machines before adding TiFlash for HTAP // TiDB compute and TiKV storage scale independently by adding nodes; TiFlash nodes scale the analytics layer separately. Automatic region splitting and rebalancing via PD—but PD as centralized timestamp allocator can become a bottleneck at extreme global scale
Data Anomalies
ZERO under Serializable isolation: all standard SQL anomalies (dirty reads, non-repeatable reads, phantom reads, lost updates, and write skew) are prevented by default with no additional developer configuration. Read Committed is also available for workloads where some consistency relaxation is an acceptable tradeoff for reduced latency
SUSCEPTIBLE to phantom reads and write skew anomalies. Snapshot Isolation by default allows write skew: two concurrent transactions can each read a consistent snapshot, make non-conflicting writes, and both commit, leaving the database in a logically inconsistent state. Applications requiring strict serializability must use explicit SELECT FOR UPDATE locking; write skew is not prevented by the default isolation level
Data Integrity & Foreign Keys Support
Provides strict ACID enforcement at the storage layer and full referential integrity. NATIVELY VALIDATES FOREIGN KEYS, explicit CHECKs, and transactional constraints to keep data correct across all nodes in the cluster
Foreign key constraints enforced since v6.6.0, production-supported in v8.5.0 (2025). Standard ON DELETE/UPDATE cascades are supported. Foreign key enforcement uses exclusive locks on parent rows, which can create lock contention when many concurrent transactions reference high-traffic parent records. Triggers are not supported; CHECK constraints are parsed but not enforced by default
Data Model Complexity
LOW. Relational model with strict schemas, normalized tables, joins, and referential integrity. Ideal for managing complex relationships and transactional systems of record; adapts easily to microservices and enterprise legacy systems
MODERATE. Relational model with MySQL-compatible schemas and native JSON support. TiFlash provides a columnar analytical replica of the same data without ETL, but enabling it requires explicit per-table configuration and is not automatic. Two separate storage engines (TiKV for OLTP, TiFlash for OLAP) add operational complexity not present in purely relational systems
Data Residency
STRONG, INTUITIVE, AND LOCALITY-AWARE. Helps fulfill compliance requirements (e.g., GDPR, CCPA) with Row-Level Control: specific rows can be pinned to specific geographic regions using the REGIONAL BY ROW table locality, while preserving a single logical data platform. Business and compliance teams can use simple SQL commands to ensure customer data never leaves specific geographic borders
SUPPORTED; LOCALITY-AWARE. Placement Rules provide region-level data placement; tables or partitions can be pinned to specific geographic zones. Row-level residency requires partitioning the table by a relevant key first, then applying Placement Rules per partition. Less seamless than CockroachDB's native REGIONAL BY ROW; compliance use cases require careful upfront schema and partition design
Developer Tools // Developer Experience // Ease of Use
Rich ecosystem: local CLI, web console UI, ORMs, BI tools, SQL clients, native DB migration toolkits, language-specific drivers, and compatibility with standard PostgreSQL developer tools like psql // PostgreSQL wire protocol-compatible; feels like developing on standard PostgreSQL. Fits into existing ORMs, drivers, and frameworks // Can be spun up in any environment (AWS, GCP, on-prem) with the same management interface. The cluster manages its own data balancing, scaling, and hardware survival automatically; DBAs do not need to be distributed systems experts to keep it running smoothly
TiUP, TiDB Dashboard, TiCDC, DM, BR, Dumpling, and standard MySQL-compatible clients. Open-source toolchain; each tool is a separate component requiring version management // MySQL-compatible. Teams migrating from MySQL have a low-friction path; PostgreSQL teams require driver and syntax adaptation // TiUP simplifies cluster deployment and upgrades. The TiDB-TiKV-PD-TiFlash architecture requires understanding each layer independently in self-hosted environments. TiDB Cloud reduces operational overhead. Enabling TiFlash for HTAP requires explicit per-table replication configuration
Distributed ACID Transactions
Fully distributed, multi-row, multi-table ACID transactions out-of-the-box. Fully supported with serializable isolation using distributed consensus (Raft Protocol) across tables, ranges, and regions; strong ACID guarantees
Distributed ACID transactions across TiKV nodes via Percolator-based two-phase commit. Snapshot Isolation is the default isolation level. Cross-shard transactions are supported but carry higher latency than single-region operations. PD's global timestamp service is in the critical path of all distributed transactions, creating a centralized dependency that constrains global scale-out
Enterprise Support
Dedicated 24/7/365 enterprise support directly from Cockroach Labs with strict SLAs and custom engineering channels. Offers global follow-the-sun support (TSE+SRE) with proven reliability and global partnerships with industry leaders. Single Global Incident Management integrates Engineering + Support + Customer Success in one channel for consistency/immediacy
Commercial support available from PingCAP for TiDB Enterprise users; TiDB Cloud includes managed support tiers. Open-source community support for self-managed deployments
FinOps Support
HIGH. Straightforward pricing based on predictable node usage or consumption metrics. Avoids hidden, fluctuating network traps when moving data across different infrastructure regions. Supports financial governance/FinOps
MODERATE TO HIGH. Core TiDB is open source (Apache 2.0). TiDB Cloud pricing bills TiDB compute, TiKV storage, and TiFlash nodes separately. Adding TiFlash for HTAP increases cost above a pure OLTP cluster. Enterprise edition required for LDAP and advanced security
Follower Reads
SUPPORTED. Supports follower/replica reads with Bounded (controlled) Staleness, allowing low‑latency local reads from nearby replicas while keeping strong global ordering
CONFIGURED AT APP LAYER. Stale Read (follower reads) allows TiKV replica nodes to serve read requests at a configurable staleness level (AS OF TIMESTAMP or tidb_read_staleness). Reduces cross-region read latency by serving from the nearest replica. Reads are not guaranteed current; staleness bound must be configured by the application; not automatic like CockroachDB's bounded staleness
FREEDOM
ZERO VENDOR LOCK-IN. Runs on any public or private cloud, across multiple clouds, via CockroachDB's Bring Your Own Cloud (BYOC) offering, on-premises, bare metal, Kubernetes, self-hosted, or in a hybrid deployment encompassing some or all of these. Source-available under the Business Source License (BSL). Full commercial-grade support directly from Cockroach Labs
ZERO VENDOR LOCK-IN. Open source core (TiDB and TiKV under Apache 2.0; TiKV is a CNCF project). Self-managed on any cloud, on-premises, or bare metal. MySQL wire protocol compatibility eases migration from MySQL. TiDB Cloud is cloud-specific per cluster; self-managed provides full deployment flexibility. Strong open-source community, originally anchored in Asia-Pacific
Joins
Executes fully distributed hash joins, merge joins, and lookup joins across arbitrary nodes with CockroachDB's advanced Cost-based Optimizer. Full standard SQL support for complex INNER, OUTER, LEFT, RIGHT joins across distributed tables
Full SQL join support with a distributed query optimizer. Hash joins, merge joins, and index joins across TiKV regions. The optimizer can push analytical joins down to TiFlash for columnar execution. Cross-region join performance depends on data co-location; the optimizer routes work to reduce data movement but cannot eliminate cross-region round-trips when data is not co-located
LDAP Support
NATIVE. Direct native support for external authentication systems like LDAP, Active Directory, GSSAPI, and OIDC
NATIVE. LDAP authentication supported in TiDB Enterprise edition. Active Directory and external identity provider integration available in the commercial edition. The open-source edition does not include enterprise authentication features natively; LDAP requires a commercial Enterprise license
Migrations
Uses the MOLT (Migration Off Legacy Technology) Toolkit and change data capture (CDC): MOLT handles schema conversion and verification, and CDC moves data out. PostgreSQL wire protocol compatibility enables lift-and-shift migrations; shadow mode testing is supported
No direct equivalent to CockroachDB's MOLT toolkit for heterogeneous schema verification and shadow mode testing. DM (Data Migration) handles MySQL-to-TiDB migration with schema conversion and continuous replication; well-suited for MySQL-origin workloads. Dumpling exports data; BR handles backup and restore. Migrations from non-MySQL databases require SQL dialect conversion and schema adaptation
Multi-Active
YES: FULLY MULTI-ACTIVE/MULTI-REGION; read/write and handle connection requests from any node in the cluster. All nodes are equal and active; any node can accept read and write traffic simultaneously.
YES—WITHIN CLUSTER. All TiDB server nodes accept reads and writes simultaneously. TiKV region leaders handle writes for their respective data ranges; followers handle reads in Stale Read mode. Cross-region active-active is constrained by Raft consensus latency and PD timestamp centralization, limiting locality-first global write performance
Multi-Data-Center Support
FULL. Connects geographically isolated, heterogeneous data centers (AWS, GCP, Azure, on-prem) into a single logical cluster, supported by features such as Physical Cluster Replication (PCR) and Logical Data Replication (LDR)
FULL BUT CAN IMPACT PERFORMANCE. Supported via multi-AZ and multi-datacenter TiKV replica placement configured through Placement Rules. TiDB Cloud supports regional HA within a single cloud. Self-hosted deployments can span multiple datacenters. Minimum 3 TiKV replicas required across AZs for zone failure survival; cross-datacenter Raft consensus latency affects transaction performance
Multi-region Functionality // Multi-region Writes
ACTIVE-ACTIVE: Read/write from any node in any region; built-in low-latency local access patterns, plus Survival Goals (e.g., ALTER DATABASE...SURVIVE REGION FAILURE) that declare the intended level of fault tolerance // True multi-region, multi-active writes: any node in any region can serve reads and writes while preserving serializable consistency guarantees
Multi-region deployments supported via Placement Rules for data distribution across regions. Any TiDB node can accept writes. Cross-region write latency reflects Raft consensus round-trips to quorum. PD-based centralized timestamp allocation adds overhead for global transactions, making TiDB less optimized for pure multi-region global write patterns than CockroachDB's peer-to-peer model // Any TiDB node can accept writes across regions. Placement Rules control where region leaders reside. Multi-region write latency is driven by Raft consensus across replicas plus PD timestamp allocation round-trips. Multi-region configuration is less declarative than CockroachDB's Survival Goals and ALTER DATABASE commands
Replication
NATIVE AND SIMPLIFIED. Built-in, automatic consensus replication using the Raft protocol; data is divided into ranges and replicated across nodes
NATIVE AND SPLIT ACROSS THREE REPLICATION SYSTEMS: Raft consensus within TiKV for transactional HA (3 replicas by default), asynchronous Raft Learner replication from TiKV to TiFlash for analytics, and TiCDC for logical replication to downstream systems. Each replication path requires independent monitoring, version management, and operational attention
Required Downtime
ZERO. Online schema changes, rolling upgrades, and cluster expansion occur without taking the data platform offline
ZERO BUT WITH PERFORMANCE IMPACTS. Online DDL via TiDB's distributed DDL framework; most schema changes (add/drop column, add index) run without blocking reads or writes. Multi-stage DDL propagation across TiKV regions takes longer on large tables than single-node DDL. Rolling upgrades of multi-component clusters (TiDB, TiKV, PD, TiFlash) require component-by-component updates and take longer than single-binary rolling upgrades
Resilience
Five 9s availability: Survives node/disk/rack/region failures automatically via Raft consensus, with zero data loss (RPO=0). Naturally resilient to outages with granular row-level control
Raft consensus replication in TiKV provides automatic failover with 3 replicas by default. Supports zone and datacenter failure survival when replicas span 3+ locations. TiFlash uses asynchronous Raft Learner replication, meaning analytical replicas may lag during recovery. Multi-component failures (PD, TiKV, TiDB) each require independent management
Schema Changes
FULLY ONLINE & NON-BLOCKING. Online transactional schema changes (add/alter columns, indexes, constraints) run in the background without locking tables with zero downtime. Designed for always‑on services
FULLY ONLINE WITH SOME LOCKING CONFLICTS. Online DDL supported for most common operations. Multi-stage DDL execution propagates changes across all TiKV regions, which takes time on large tables. Schema lock conflicts can arise during concurrent DDL operations
Security-Privacy-Compliance
RBAC, Encryption at Rest with Customer Managed Encryption Keys (CMEK), TLS encryption in transit, IAM integrations, column-level encryption, and robust data-masking natively. Fine-grained encryption at cluster, database, table, or partition levels. Certified SOC 2 Type II and SOC 3, PCI-DSS, HIPAA, and ISO 27001-27017-27018 compliant, with ISO 42001 (Responsible, Ethical, and Safe AI Governance) pending. CockroachDB CIS Benchmarks to deploy hardened CockroachDB configurations. Comprehensive support for GDPR and DORA compliance
RBAC, encryption at rest (TDE), TLS in transit, and audit logging. LDAP and Active Directory integration available in TiDB Enterprise edition only. TiDB Cloud supports CMEK. Placement Rules provide region-level data placement for residency compliance but require table partitioning for row-level control. SOC 2 Type II certified
SQL Compatibility
HIGH. PostgreSQL Wire Compatible: Uses PG wire protocol; strong ANSI SQL with complex queries, joins, window functions, triggers, stored procedures, and UDFs. Supports spatial data, extensions, syntax; most apps connect with minimal or no changes
HIGH. MySQL 8.0 compatible; MySQL wire protocol, MySQL syntax, and most MySQL features work with minimal or no code changes. Not PostgreSQL-compatible. Triggers and stored procedures are not supported, which is a significant gap for MySQL estates relying on procedural database logic. Some distributed execution behaviors differ from single-node MySQL semantics
Stored Procedures
SUPPORTED AND MATURE. PL/pgSQL and other languages such as Python and Perl support deep procedural logic, autonomous transactions, and complex business rule enforcement. Supports user-defined stored procedures
NOT SUPPORTED. TiDB explicitly does not support stored procedures or user-defined functions. Applications relying on MySQL stored procedures must move all procedural logic to the application layer before migrating to TiDB. This is a documented MySQL compatibility gap with meaningful migration impact for procedure-heavy schemas
Transaction Performance // Isolation Levels
Optimized for OLTP with strong consistency; cross-region transactions maintain data correctness. Optimizations like Parallel Commits drop distributed execution overhead to a single network round-trip for most transactions // Enforces Serializable isolation, the strongest isolation level, by default to prevent data anomalies under heavy parallel traffic; Read Committed is also available
OLTP performance is strong for single-region operations. Pessimistic locking (default) reduces conflict aborts vs. optimistic mode but adds per-row lock overhead. Cross-region transactions incur Raft consensus latency. PD-based timestamp allocation adds a network round-trip to the start of each distributed transaction // Snapshot Isolation (default) and Read Committed. Serializable isolation is not the default and requires explicit pessimistic locking with SELECT FOR UPDATE. Unlike CockroachDB, which enforces Serializable by default, TiDB's default leaves write skew anomalies possible without additional developer awareness and explicit application-level locking
Triggers & Deferrable Constraints
FULLY SUPPORTED. Supports triggers and deferrable constraints across all deployment models
NOT SUPPORTED. TiDB explicitly does not support triggers. Deferrable constraints are also not supported. Applications relying on database-level triggers or deferred constraint enforcement must move that logic to the application tier. This is a significant compatibility gap for MySQL estates with trigger-heavy schemas
Vector Search
BUILT-IN NATIVE VECTOR SEARCH, scalable distributed HNSW/IVF indexing, and pgvector (the industry standard for vector similarity search). CockroachDB's C-SPANN provides distributed vector indexing (ANN) at scale; available across all tiers. Suited for AI/ML inference and RAG applications where vectors and transactional data coexist in one engine without a separate vector database
BUILT-IN BUT EXPERIMENTAL. Vector search added in TiDB v8.4.0 but still marked experimental, not recommended for production. Requires TiFlash replicas for HNSW approximate nearest neighbor indexing. Supports up to 16,383 dimensions; single-precision floats only. Vector indexes cannot serve as primary keys or unique indexes
Writes and Query Routing
Every node is a gateway to the entirety of the database for unlimited reads and writes in any region. Any node can accept SQL queries; a Distributed Optimizer routes work to the right ranges/replicas based on locality and cost
Any stateless TiDB server node accepts SQL queries. Reads and writes are routed to the appropriate TiKV region leaders via PD metadata. Reads can be served from TiKV (strongly consistent) or TiFlash (columnar, asynchronously replicated). PD Placement Driver mediates all data location decisions and is in the routing path for distributed queries
PRICING
SIMPLE. Commercial Enterprise: straightforward pricing, plus the ability to tie data to a location to avoid egress costs. Free for single-node/dev. Free Community Tier
SIMPLE TO COMPLEX. Core TiDB is open source (Apache 2.0) and free to self-manage. TiDB Cloud bills TiDB compute, TiKV storage, and TiFlash nodes separately with consumption-based pricing. A free Starter tier is available. TiDB Enterprise commercial licensing is required for LDAP, advanced security, and enterprise support. Adding TiFlash increases cost above a pure OLTP deployment
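Illustration for the Data Anomalies row: the write-skew behavior described there can be sketched with a toy model of Snapshot Isolation. This is a minimal sketch, not how TiDB or CockroachDB is implemented. The `SnapshotStore` and `Txn` classes, the `go_off_call` helper, and the two-doctor on-call scenario are hypothetical names introduced only for this example. The model assumes each transaction reads from a private snapshot and that a commit aborts only when a key it wrote was committed by another transaction after that snapshot was taken.

```python
class SnapshotStore:
    """Toy key-value store with snapshot reads (hypothetical, for illustration)."""

    def __init__(self, initial):
        self.data = dict(initial)
        self.commit_ts = {key: 0 for key in initial}  # last commit time per key
        self.clock = 0                                # logical commit counter

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.clock       # snapshot timestamp
        self.snapshot = dict(store.data)  # private consistent snapshot
        self.writes = {}                  # buffered writes

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Snapshot Isolation checks only write-write conflicts:
        # abort if a key we wrote was committed after our snapshot.
        for key in self.writes:
            if self.store.commit_ts[key] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.commit_ts[key] = self.store.clock
        return True


def go_off_call(txn, doctor):
    """Leave the rota only if the snapshot shows two doctors on call."""
    on_call = sum(1 for d in ("alice", "bob") if txn.read(d))
    if on_call >= 2:
        txn.write(doctor, False)


def run_write_skew():
    # Invariant the application wants: at least one doctor stays on call.
    store = SnapshotStore({"alice": True, "bob": True})
    t1, t2 = store.begin(), store.begin()
    go_off_call(t1, "alice")  # t1 writes only "alice"
    go_off_call(t2, "bob")    # t2 writes only "bob"
    return t1.commit(), t2.commit(), store.data


def run_write_write_conflict():
    # Both transactions write the same key, so the second commit aborts.
    store = SnapshotStore({"alice": True, "bob": True})
    t1, t2 = store.begin(), store.begin()
    t1.write("alice", False)
    t2.write("alice", False)
    return t1.commit(), t2.commit()
```

In `run_write_skew` the two transactions write disjoint keys, so the write-write check lets both commit and the invariant breaks. In `run_write_write_conflict` the overlapping write is caught. Under Serializable isolation, one of the write-skew transactions would be aborted or retried; under Snapshot Isolation, the application has to lock the rows it reads, for example with SELECT FOR UPDATE.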