Architecture // Storage Engine
Distributed SQL with a shared-nothing, peer-to-peer architecture. All nodes are symmetrical; any node can handle reads and writes. The cluster uses distributed consensus, so every node can access data anywhere in the cluster regardless of where it lives // Built on Pebble, a highly tuned, Go-based, LSM-tree key-value storage engine developed by Cockroach Labs, inspired by RocksDB and designed specifically for distributed SQL
Distributed HTAP with a two-tier aggregator/leaf node design. Aggregator nodes handle query coordination; leaf nodes store and process partitioned data. Not peer-to-peer: aggregators and leaves have distinct roles, making the aggregator layer a coordination dependency. Requires upfront choice of storage type per table and extensive memory capacity planning // Proprietary, home-grown Universal Storage (hybrid rowstore + columnstore)
Ideal Workloads
SYSTEM OF RECORD. Optimized for transactional workloads that require strong consistency and global distribution, such as AI innovators, cybersecurity, eCommerce & retail, financial services, fintech/payments, gaming, quant/trading & research, and online travel
REAL-TIME ANALYTICS. Optimized for real-time analytics running alongside transactional operations: real-time dashboards, fraud detection, personalization, IoT event processing, and AI/ML inference. Best suited for workloads where sub-second analytical queries on live data matter more than data correctness and transactional guarantees
Auto-Sharding (Dynamic Re-Sharding Online)
NATIVE & AUTOMATIC. Automatically shards data into ranges and dynamically splits, merges, and rebalances online across nodes based on load and size. Zero downtime, fully transparent
PARTITION-BASED. Requires shard keys be defined at table creation. Automatic rebalancing of partitions across nodes is supported when nodes are added. Changing shard keys after initial table creation typically requires data migration
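The size-based range splitting described above can be illustrated with a minimal sketch. This is not CockroachDB's implementation: the `Range` class, the `split_if_needed` function, and the median-key split point are hypothetical names and choices made for illustration, and load-based splitting, merging, and rebalancing are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Range:
    """A contiguous span of sorted keys (hypothetical, for illustration)."""
    start: str                      # inclusive lower bound ("" = unbounded)
    end: str                        # exclusive upper bound ("" = unbounded)
    rows: dict = field(default_factory=dict)

    def size(self) -> int:
        # Approximate size as the total length of keys and values.
        return sum(len(k) + len(v) for k, v in self.rows.items())


def split_if_needed(r: Range, max_bytes: int) -> list[Range]:
    """Split a range at its median key until every piece fits max_bytes."""
    if r.size() <= max_bytes or len(r.rows) < 2:
        return [r]
    keys = sorted(r.rows)
    mid = keys[len(keys) // 2]      # assumed split point: the median key
    left = Range(r.start, mid, {k: r.rows[k] for k in keys if k < mid})
    right = Range(mid, r.end, {k: r.rows[k] for k in keys if k >= mid})
    return split_if_needed(left, max_bytes) + split_if_needed(right, max_bytes)


if __name__ == "__main__":
    big = Range("", "", {f"k{i:02d}": "x" * 10 for i in range(8)})
    for part in split_if_needed(big, max_bytes=60):
        print(repr(part.start), repr(part.end), len(part.rows))
```

The point of the sketch is that split boundaries are derived from the data itself, so no key has to be chosen at table creation; a shard-key design fixes the partitioning function up front instead.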
Automatic Geo-Partitioning (Multi-Region Data Affinity)
NATIVE & AUTOMATIC. Declarative SQL schema changes route, partition, and anchor data close to the user's location, moving data to the region where it is most frequently accessed. Supports geo-partitioning with zone configurations for data locality, compliance, and low latency
NOT SUPPORTED. Not available as a native declarative feature. Data placement is controlled by shard key design and manual deployment topology decisions. No equivalent to CockroachDB's REGIONAL BY ROW, Super Regions, or automatic data affinity routing based on user location or regulatory compliance requirements
Availability including Multi-Cloud and Hybrid
Available on all public clouds, e.g., AWS, Google Cloud, and Azure; can run a single logical cluster spanning multiple clouds. Can run on prem/local and in cloud + prem hybrid deployments
Available on all public clouds, e.g., AWS, Google Cloud, and Azure. Can run on prem/local and in cloud + prem hybrid deployments. Multi-cloud is possible via self-managed deployments but without a unified cross-cloud cluster
Change Data Capture (CDC)
NATIVE. The CREATE CHANGEFEED statement enables scalable, resilient streaming of data changes to Kafka, cloud storage, and webhooks; no third-party CDC tool needed. CDC Queries enable SQL-based filtering and transformation of streams
NATIVE/PARTIAL. SingleStore Pipelines provide native ingest from Kafka and S3. CDC egress (streaming changes out) is available via Debezium (MySQL-compatible) or SingleStore Flow (additional cost). Less integrated than CockroachDB's native CHANGEFEED; CDC egress is not a first-class built-in feature of the core engine
Data Integrity & Foreign Keys Support
Provides strict ACID enforcement at the storage layer and full referential integrity. NATIVELY VALIDATES FOREIGN KEYS, explicit CHECKs, and transactional constraints to ensure correctness across all nodes and regions
Referential integrity NOT enforced. FOREIGN KEYS NOT VALIDATED; accepted in DDL syntax but referential integrity checking is left to application layer. Unique constraints on distributed tables must be a superset of the shard key. Without application-level safeguards, race conditions under concurrent inserts can produce logically corrupt data
Data Model Complexity
LOW. Relational model with strict schemas, normalized tables, joins, and referential integrity. Ideal for managing complex relationships and transactional systems of record; adapts easily to microservices and enterprise legacy systems
MODERATE TO HIGH. Relational, JSON/BSON (via Kai MongoDB-compatible API), vector, full-text, time-series, and geospatial in one engine. Choosing between rowstore and columnstore storage type per table adds design complexity not present in pure OLTP databases. Tables without a shard key must be reference tables replicated to all nodes
Data Residency
STRONG, INTUITIVE, AND LOCALITY-AWARE. Helps fulfill compliance (e.g., GDPR, CCPA) with Row-Level Control: can pin specific rows to specific geographic regions using the REGIONAL BY ROW table locality, while preserving a single logical data platform. Business and compliance teams can use simple SQL commands to ensure customer data never leaves specific geographic borders
WORKSPACE-DRIVEN. Data can be pinned to a specific cloud region by deploying a Workspace in that region. No row-level data residency enforcement; no equivalent to REGIONAL BY ROW. GDPR and similar compliance requirements must be addressed through deployment topology and application-level data routing rather than database-native residency controls
Developer Tools // Developer Experience // Ease of Use
Rich ecosystem: local CLI, web console UI, ORMs, BI tools, SQL clients, native DB migration toolkits, language-specific drivers, and compatibility with standard PostgreSQL developer tools like psql // PostgreSQL wire protocol-compatible; feels like developing on standard PostgreSQL. Fits into existing ORMs, drivers, and frameworks // Can be spun up in any environment (AWS, GCP, on-prem) with the exact same management interface. The cluster manages its own data balancing, scaling, and hardware survival automatically; DBAs do not need to be distributed systems experts to keep it running smoothly
Web-based console, SQL Notebooks for development and collaboration, MySQL-compatible clients, SingleStore Pipelines for Kafka/S3 ingest, BI tool connectors, and Wasm-based UDF support. Tooling ecosystem is growing but far less mature than PostgreSQL-native ecosystems // MySQL wire protocol compatible. Requires upfront shard key design decisions that have no equivalent in traditional SQL databases // Helios Cloud simplifies deployment on major clouds, but shard key selection requires careful upfront design; choices are difficult to change post-deployment, and a poor choice degrades cross-partition query performance
Distributed ACID Transactions
Fully distributed, multi-row, multi-table ACID transactions out of the box, with serializable isolation backed by distributed consensus (Raft protocol) across tables, ranges, and regions; strong ACID guarantees
Supported within a partition via MVCC and 2-phase locking. Cross-partition distributed transactions use 2-phase commit without serializable isolation. Multi-partition write transactions carry higher latency and consistency tradeoffs
Enterprise Support
Dedicated 24/7/365 enterprise support directly from Cockroach Labs with strict SLAs and custom engineering channels. Offers global follow-the-sun support (TSE+SRE) with proven reliability and global partnerships with industry leaders. Single Global Incident Management integrates Engineering + Support + Customer Success in one channel for consistency/immediacy
Commercial support tiers via SingleStore direct support. Acquired by Vector Capital (private equity) in 2025, introducing questions about long-term product investment and support continuity. Enterprise SLA-backed support requires an enterprise contract. Support quality has varied across tiers in user reviews
FinOps Support
HIGH. Straightforward pricing based on predictable node usage or consumption metrics. Avoids hidden, fluctuating network traps when moving data across different infrastructure regions. Supports financial governance/FinOps
LOW TO MODERATE. Consumption-based pricing on Helios (compute workspace + storage). HTAP workloads sharing compute can create unpredictable cost spikes when analytical queries consume resources intended for transactional workloads. Self-managed deployments eliminate consumption-based predictability. No native FinOps dashboard integration
Follower Reads
SUPPORTED. Supports follower/replica reads with Bounded (controlled) Staleness, allowing low‑latency local reads from nearby replicas while keeping strong global ordering
Leaf nodes within a cluster handle read traffic for partition-aligned queries. SmartDR secondary can serve read-only queries in the DR region. No equivalent to CockroachDB's bounded-staleness follower reads with explicit staleness control across the full logical cluster. Secondary read replicas may serve stale data depending on asynchronous replication lag
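The bounded-staleness idea above (read from a nearby replica only if it is fresh enough) can be sketched as follows. This is an illustrative model, not either product's routing logic: `Replica`, `pick_replica`, and the latency and timestamp fields are hypothetical, and the sketch assumes exactly one leader that is always fully up to date.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    latency_ms: float        # network distance from the client
    applied_ts: float        # newest timestamp applied by this replica (seconds)
    is_leader: bool = False  # assumption: exactly one leader, always current


def pick_replica(replicas, now: float, max_staleness_s: float) -> Replica:
    """Return the closest replica whose lag is within the staleness bound.

    The leader is always eligible, so the read falls back to it when no
    follower is fresh enough.
    """
    eligible = [
        r for r in replicas
        if r.is_leader or now - r.applied_ts <= max_staleness_s
    ]
    return min(eligible, key=lambda r: r.latency_ms)


if __name__ == "__main__":
    cluster = [
        Replica("leader-us", latency_ms=80, applied_ts=100.0, is_leader=True),
        Replica("follower-eu", latency_ms=5, applied_ts=97.0),
        Replica("follower-ap", latency_ms=20, applied_ts=99.5),
    ]
    for bound in (5.0, 1.0, 0.1):
        print(bound, pick_replica(cluster, now=100.0, max_staleness_s=bound).name)
```

Tightening the bound trades latency for freshness: a loose bound lets the nearest follower answer, while a tight one pushes the read back to the leader. A secondary fed by asynchronous replication offers no such bound unless the application tracks lag itself.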
FREEDOM
ZERO VENDOR LOCK-IN. Runs on any public or private cloud, across multiple clouds, via CockroachDB's Bring Your Own Cloud (BYOC) offering, on-premises, bare metal, Kubernetes, self-hosted, or in a hybrid deployment encompassing some or all of these. Source-available under the Business Source License (BSL). Full commercial-grade support directly from Cockroach Labs
MySQL wire compatibility reduces some migration friction, but SingleStore's distributed shard key model and proprietary Universal Storage engine create operational lock-in. Changing shard keys post-deployment requires data migration. No open-core or BSL model; commercial license only
Joins
Executes fully distributed hash joins, merge joins, and lookup joins across arbitrary nodes with CockroachDB's advanced Cost-based Optimizer. Full standard SQL support for complex INNER, OUTER, LEFT, RIGHT joins across distributed tables
SQL join support via vectorized, distributed execution; query plans compiled to machine code for repeated query performance. Single-partition joins are fast; cross-partition joins require scatter-gather coordination across leaf nodes. Complex joins that cannot be co-located by shard key degrade in performance; table families help but require upfront design
LDAP Support
NATIVE. Direct native support for external authentication systems like LDAP, Active Directory, GSSAPI, and OIDC
VIA INTEGRATIONS. Supported through integrations with external identity providers including LDAP, Okta, Ping Identity, and Azure Active Directory. Enterprise authentication features including SAML and SSO available on Enterprise Edition. Identity and authentication are delegated to external providers rather than natively managed within the database engine
Migrations
Uses the MOLT (Migrate Off Legacy Technology) Toolkit and change data capture (CDC): MOLT handles schema conversion and verification, and CDC moves the data. PostgreSQL wire protocol compatibility enables lift-and-shift migrations; shadow mode testing is supported
BryteFlow (acquired) assists with data onboarding from common databases. MySQL wire compatibility eases migration from MySQL. PostgreSQL and other RDBMS migrations require SQL dialect adaptation and upfront shard key design decisions. No equivalent to CockroachDB's MOLT toolkit
Multi-Data-Center Support
FULL. Connects geographically isolated, heterogeneous data centers (AWS, GCP, Azure, on-prem) into a single logical cluster, supported by features such as Physical Cluster Replication (PCR) and Logical Data Replication (LDR)
PARTIAL. SmartDR supports asynchronous replication between datacenters for DR. Within a single datacenter, Multi-AZ provides zone failure resilience on Helios. Cross-datacenter active-active operation is not natively supported; it requires separate cluster deployments and application-level data routing between them
Multi-region Functionality // Multi-region Writes
ACTIVE-ACTIVE: Read/write from any node in any region; built-in low-latency local access patterns, with Survival Goals (e.g., ALTER DATABASE...SURVIVE REGION FAILURE) that declare the intended level of fault tolerance // True multi-region, multi-active writes: any node in any region can serve reads and writes while preserving serializable consistency guarantees
NOT NATIVELY ACTIVE-ACTIVE across geographic regions. Multi-region is primarily a disaster recovery architecture via SmartDR (asynchronous replication to a secondary region) // Multi-region distributed writes with strong consistency across regions not natively supported. SmartDR provides asynchronous replication to a secondary region for DR only, not active-active multi-region writes. Applications requiring low-latency writes from multiple geographic regions need separate cluster deployments without a native unified consistency model
Replication
Built-in, automatic consensus replication using the Raft protocol; data is divided into ranges and replicated across nodes
Within-cluster partition replication provides fault tolerance. SmartDR (Enterprise Edition) provides asynchronous replication to a secondary region for disaster recovery. Because replication between regions is asynchronous, the secondary may lag the primary, meaning RPO is greater than zero in a regional failover scenario
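The difference between consensus replication and asynchronous replication comes down to simple arithmetic, sketched below. The majority-quorum formulas are standard for Raft-style consensus; the function names and the log-index model of asynchronous lag are assumptions made for illustration.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can fail while a majority remains available."""
    return replicas - quorum(replicas)


def is_committed(acks: int, replicas: int) -> bool:
    """A write is durable once a majority has acknowledged it."""
    return acks >= quorum(replicas)


def writes_at_risk(primary_index: int, secondary_index: int) -> int:
    """Writes lost if an async primary fails before the secondary catches up.

    Hypothetical model: each side is described by the index of the last
    log entry it has applied.
    """
    return max(0, primary_index - secondary_index)


if __name__ == "__main__":
    for n in (3, 5):
        print(n, "replicas: quorum", quorum(n), "tolerates", tolerated_failures(n))
    print("async lag:", writes_at_risk(1000, 990), "writes at risk")
```

With a majority quorum, an acknowledged write already exists on more than half of the replicas, which is why losing a minority loses no committed data. With asynchronous replication, whatever the secondary has not yet applied is the data-loss window.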
Required Downtime
ZERO. Online schema changes, rolling upgrades, and cluster expansion occur without taking the data platform offline
SOME REQUIRED. Adding nodes and rebalancing partitions can occur online, but shard key changes require data migration with associated downtime, and storage type changes (rowstore to columnstore) are also operationally intensive. Rolling upgrades on Helios are managed by SingleStore. Most common DDL operations do not require full downtime but may affect concurrency
Resilience
Five 9s availability: Survives node/disk/rack/region failures automatically via Raft consensus, with zero data loss (RPO=0). Naturally resilient to outages with granular row-level control
High availability via partition replication within a cluster and multi-AZ failover for zone-level resilience. Asynchronous cross-region replication for disaster recovery, which means the secondary may lag behind the primary, introducing the risk of potential data loss during a regional failover event
Scale
Virtually unlimited horizontal scale-out. Automatic, seamless handling of growing datasets; increase storage and throughput capacity linearly simply by adding more nodes
Horizontal scale-out via adding leaf nodes; compute and storage scale independently. Shard key design must be planned upfront: cross-partition queries degrade as the cluster grows if workloads do not align with the shard key. Works best when query patterns are designed around the partition topology
Schema Changes
FULLY ONLINE & NON-BLOCKING. Online transactional schema changes (add/alter columns, indexes, constraints) run in the background without locking tables with zero downtime. Designed for always‑on services
PARTIALLY ONLINE. Supports Alter Table operations for common changes (add/drop column, add index). Shard key changes require table recreation with data migration. Storage type changes (rowstore to columnstore) are operationally intensive. Most common DDL operations do not require full downtime but may affect query concurrency during execution
Security-Privacy-Compliance
RBAC, Encryption at Rest with Customer Managed Encryption Keys (CMEK), TLS encryption in transit, IAM integrations, column-level encryption, and native data masking. Fine-grained encryption at cluster, database, table, or partition levels. Certified SOC 2 Type II and SOC 3, PCI-DSS, HIPAA, and ISO 27001, 27017, and 27018 compliant, with ISO 42001 (Responsible, Ethical, and Safe AI Governance) pending. CockroachDB CIS Benchmarks help deploy hardened CockroachDB configurations. Supports GDPR and DORA compliance
RBAC, encryption at rest and in transit, CMEK (Enterprise Edition), audit logging, and integrations with Okta, Ping, and Azure AD. Certified SOC 2 Type II, ISO/IEC 27001, and GDPR-compliant. Data residency controls require deployment topology decisions rather than declarative row-level SQL commands as in CockroachDB
SQL Compatibility
HIGH. PostgreSQL wire compatible: uses the PG wire protocol; strong ANSI SQL with complex queries, joins, window functions, triggers, stored procedures, and UDFs. Supports PostgreSQL spatial data, extensions, and syntax; most apps connect with minimal or no changes
HIGH. MySQL wire protocol compatible; most MySQL drivers and clients connect without changes. ANSI SQL supported with window functions, CTEs, and analytic functions. Not PostgreSQL wire-compatible. Enforced foreign keys, deferrable constraints, and full trigger semantics are not supported, requiring application-layer workarounds for common RDBMS patterns
Stored Procedures
SUPPORTED AND MATURE. PL/pgSQL and other languages such as Python and Perl support deep procedural logic, autonomous transactions, and complex business rule enforcement. Supports user-defined stored procedures
SUPPORTED. Supported via stored procedures written in MySQL-compatible SQL/PSM syntax. Python UDFs and Wasm-based user-defined functions are available. Less procedural depth than PostgreSQL's PL/pgSQL or Oracle's PL/SQL. Complex logic relying on triggers or deferred constraints must be adapted or moved to application code
Transaction Performance // Isolation Levels
Optimized for OLTP with strong consistency; cross-region transactions maintain data correctness. Optimizations like Parallel Commits reduce distributed commit overhead to a single network round-trip for most transactions // Enforces strict Serializable isolation, the strongest isolation level, by default to ensure zero data anomalies under heavy parallel traffic; Read Committed is also supported
Optimized for high-throughput ingest via lock-free structures and fast analytics via vectorized, compiled query execution. Single-partition OLTP transactions are fast; multi-partition transactions incur coordination overhead through the aggregator. Performance under concurrent write contention is less predictable than dedicated OLTP engines // Read Committed is the default ceiling for distributed transactions. Serializable isolation is not supported (an architectural limitation), so applications requiring strict serializability must implement application-level concurrency controls or accept the risk of data anomalies under concurrent load
Transactional Consistency
Distributed ACID with serializable isolation by default guarantees strict consistency across all nodes and regions using distributed consensus
Read Committed by default. Serializable isolation is not natively supported; documented guidance is that referential integrity and strong consistency should be enforced at the application layer. Snapshot isolation is the practical ceiling, leaving some concurrent write scenarios exposed to read anomalies
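The kind of anomaly that weaker isolation levels allow can be shown with the classic "write skew" example: two doctors each go off call after checking that the other is still on call. The sketch below is a toy model, not either engine's concurrency control; `run_on_call_swap` and its commit-time validation rules are hypothetical simplifications in which the snapshot mode checks only write-write conflicts and the serializable mode also rejects a transaction whose reads were overwritten.

```python
def run_on_call_swap(isolation: str) -> dict:
    """Run two concurrent 'go off call' transactions; return the final state.

    isolation: "snapshot" or "serializable" (toy validation rules).
    Invariant the application wants: at least one doctor stays on call.
    """
    db = {"alice": True, "bob": True}   # both doctors start on call

    def plan(snapshot: dict, me: str):
        # Go off call only if the snapshot shows another doctor on call.
        reads = set(snapshot)
        writes = {me: False} if sum(snapshot.values()) >= 2 else {}
        return reads, writes

    # Both transactions start concurrently and see the same snapshot.
    snapshot = dict(db)
    _, t1_writes = plan(snapshot, "alice")
    t2_reads, t2_writes = plan(snapshot, "bob")

    db.update(t1_writes)                # T1 commits first

    # Validate T2 at commit time against what T1 wrote.
    ww_conflict = set(t1_writes) & set(t2_writes)
    rw_conflict = set(t1_writes) & t2_reads
    if ww_conflict or (isolation == "serializable" and rw_conflict):
        return db                       # T2 aborts and would be retried
    db.update(t2_writes)                # T2 commits on a stale premise
    return db


if __name__ == "__main__":
    print("snapshot:    ", run_on_call_swap("snapshot"))
    print("serializable:", run_on_call_swap("serializable"))
```

In the snapshot run the two transactions write different rows, so neither is rejected and nobody is left on call. Guarding against this without serializable isolation is the application-level concurrency control the text refers to.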
Triggers & Deferrable Constraints
FULLY SUPPORTED. Supports triggers and deferrable constraints across all deployment models
LIMITED. Trigger support is limited in SingleStore's distributed, lock-free architecture. Full trigger semantics conflict with the lock-free ingest model; comprehensive trigger support as found in traditional RDBMSs is not available. Deferrable constraints are not supported. Applications relying on database-level triggers must move that logic to the application tier
Vector Search
BUILT-IN NATIVE VECTOR SEARCH with scalable distributed HNSW/IVF indexing and compatibility with pgvector (the industry standard for vector similarity search). CockroachDB's C-SPANN provides distributed vector indexing (ANN) at scale; available across all tiers. Suited for AI/ML inference and RAG applications where vectors and transactional data coexist in one engine without a separate vector database
BUILT-IN NATIVE VECTOR SEARCH with IVF, HNSW, and PQ indexing; supports K-NN and ANN queries alongside SQL, full-text search, and JSON. Suited for AI/ML inference and RAG applications where vectors and transactional data coexist in one engine without a separate vector database
Writes and Query Routing
Every node is a gateway to the entirety of the database for unlimited reads and writes in any region. Any node can accept SQL queries; a Distributed Optimizer routes work to the right ranges/replicas based on locality and cost
Aggregator nodes route queries to the appropriate leaf partitions based on shard keys. Single-partition writes are fast and local; multi-partition writes require aggregator coordination. Read traffic distributes across leaf nodes for partition-aligned queries. Cross-partition queries incur scatter-gather overhead across all relevant leaves
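The routing difference described above can be sketched as a small function that decides which partitions a query must touch. This is an illustration only: `partition_for` and `partitions_to_query` are hypothetical names, and CRC32 stands in for whatever hash function a real engine uses.

```python
import zlib


def partition_for(shard_key_value: str, num_partitions: int) -> int:
    """Map a shard-key value to a partition (CRC32 is an assumed stand-in)."""
    return zlib.crc32(shard_key_value.encode("utf-8")) % num_partitions


def partitions_to_query(shard_key_values, num_partitions: int) -> list[int]:
    """Return the partitions a coordinator must contact.

    shard_key_values: shard-key values taken from an equality filter, or
    None when the query does not filter on the shard key at all.
    """
    if shard_key_values is None:
        # No shard-key filter: every partition may hold matching rows.
        return list(range(num_partitions))
    return sorted({partition_for(v, num_partitions) for v in shard_key_values})


if __name__ == "__main__":
    print("by shard key:", partitions_to_query(["user-42"], 8))
    print("no shard key:", partitions_to_query(None, 8))
```

A query that filters on the shard key is sent to a single partition, while any other query fans out to all of them and the coordinator merges the results. That fan-out is the scatter-gather overhead, and it grows with the partition count.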
PRICING
SIMPLE. Commercial Enterprise: straightforward pricing, plus the ability to tie data to a location to avoid egress costs. Free for single-node/dev. Free Community Tier
CONSUMPTION-BASED pricing on Helios (compute workspace + storage). Free shared tier available; Standard and Enterprise editions for production. CMEK, SmartDR, and audit logging require Enterprise Edition, adding to base cost. HTAP workloads sharing compute can produce unpredictable bills when analytical query spikes consume transactional resources