This feature is in preview and subject to change. To share feedback and/or issues, contact Support.
Logical data replication is only supported in CockroachDB self-hosted clusters.
In this tutorial, you will set up logical data replication (LDR) streaming data from a source table to a destination table between two CockroachDB clusters. Both clusters are active and can serve traffic. You can apply the outlined steps to set up one of the following:
- Unidirectional LDR from a source table to a destination table (cluster A to cluster B) in one LDR job.
- Bidirectional LDR for the same table from cluster A to cluster B and from cluster B to cluster A. In a bidirectional setup, each cluster operates as both a source and a destination in separate LDR jobs.
New in v25.1: Create the new table on the destination cluster automatically and conduct a fast, offline initial scan with the CREATE LOGICALLY REPLICATED syntax. CREATE LOGICALLY REPLICATED accepts unidirectional or bidirectional on as an option in order to create one of the setups automatically. Step 3 outlines when to use the CREATE LOGICALLY REPLICATED or the CREATE LOGICAL REPLICATION STREAM syntax to start LDR.
In the following diagram, LDR stream 1 creates a unidirectional LDR setup; introducing LDR stream 2 extends the setup to bidirectional.
For more details on use cases, refer to the Logical Data Replication Overview.
Tutorial overview
If you're setting up bidirectional LDR, both clusters will act as a source and a destination in the respective LDR jobs. The high-level steps for setting up bidirectional or unidirectional LDR are:
- Prepare the clusters with the required settings, users, and privileges according to the LDR setup.
- Set up external connection(s) on the destination to hold the connection URI for the source.
- Start LDR from the destination cluster with your required modes and syntax.
- Check the status of the LDR job in the DB Console.
Before you begin
You'll need:
- Two separate v25.1 CockroachDB self-hosted clusters with connectivity between every node in both clusters. That is, all nodes in cluster A must be able to contact each node in cluster B and vice versa. The SQL advertised address should be the cluster node advertise address so that the LDR job can plan node-to-node connections between clusters for maximum performance.
- To set up each cluster, you can follow Deploy CockroachDB on Premises.
- The Deploy CockroachDB on Premises tutorial creates a self-signed certificate for each self-hosted cluster. To create certificates signed by an external certificate authority, refer to Create Security Certificates using OpenSSL.
- All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to Step 2. Connect from the destination to the source.
- LDR replicates at the table level, which means clusters can contain other tables that are not part of the LDR job.
- (For CREATE LOGICAL REPLICATION STREAM only) If both clusters are empty, create the tables that you need to replicate with identical schema definitions (excluding indexes) on both clusters. If one cluster already has an existing table that you'll replicate, ensure the other cluster's table definition matches. For more details on the supported schemas, refer to Schema Validation.
Schema validation
Before you start LDR, you must ensure that all column names, types, constraints, and unique indexes on the destination table match the source table.
You cannot use LDR on a table with a schema that contains:
- Column families
- Partial indexes and hash-sharded indexes
- Indexes with a virtual computed column
- Composite types in the primary key
Additionally, for the CREATE LOGICALLY REPLICATED syntax, you cannot use LDR on a table with a schema that contains:
- User-defined types
- Foreign key dependencies
For more details, refer to the LDR Known limitations.
When you run LDR in immediate mode, you cannot replicate a table with foreign key constraints. In validated mode, foreign key constraints must match. All constraints are enforced at the time of SQL/application write.
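For illustration, a minimal sketch of a matching table definition, assuming a hypothetical database db with an orders table (the names and columns here are illustrative, not part of the tutorial), that you could run on both clusters before starting LDR with CREATE LOGICAL REPLICATION STREAM:
CREATE TABLE db.public.orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    customer_id UUID NOT NULL,
    amount DECIMAL NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
You can then run SHOW CREATE TABLE db.public.orders; on each cluster and compare the output to confirm the definitions match.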
Step 1. Prepare the cluster
In this step you'll prepare the required settings and privileges for LDR.
If you are setting up bidirectional LDR, you must run this step on both clusters.
Enter the SQL shell for both clusters in separate terminal windows:
cockroach sql --url "postgresql://root@{node IP or hostname}:26257?sslmode=verify-full" --certs-dir=certs
Enable the kv.rangefeed.enabled cluster setting on the source cluster:
SET CLUSTER SETTING kv.rangefeed.enabled = true;
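(Optional) To confirm the setting took effect, check its current value:
SHOW CLUSTER SETTING kv.rangefeed.enabled;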
On the destination, create a user with the REPLICATION system privilege who will start the LDR job:
CREATE USER {your username} WITH PASSWORD '{your password}';
GRANT SYSTEM REPLICATION TO {your username};
To change the password later, refer to ALTER USER.
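For example, a sketch of a later password change for the replication user:
ALTER USER {your username} WITH PASSWORD '{new password}';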
Step 2. Connect from the destination to the source
In this step, you'll set up external connection(s) to store the connection string for one or both clusters. Depending on how you manage certificates, you must ensure that all nodes between the clusters have access to the certificate of the other cluster.
You can use the cockroach encode-uri command to generate a connection string containing a cluster's certificate.
On the source cluster in a new terminal window, generate a connection string by passing the replication user, node IP, and port, along with the path to the source cluster's CA certificate:
cockroach encode-uri {user}:{password}@{node IP}:26257 --ca-cert {path to CA certificate} --inline
The connection string output contains the source cluster's certificate:
{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A
In the SQL shell on the destination cluster, create an external connection using the source cluster's connection string. Prefix the postgresql:// scheme to the connection string and replace {source} with your external connection name:
CREATE EXTERNAL CONNECTION {source} AS 'postgresql://{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A';
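(Optional) To verify that the connection was stored, one option is to query the system.external_connections table (a sketch, assuming your user has privileges to read it):
SELECT connection_name FROM system.external_connections;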
(Optional) Bidirectional: Create the connection for LDR stream 2
(Optional) For bidirectional LDR, repeat generating the connection string and creating the external connection in the opposite direction. Both clusters will act as both source and destination. At this point, you've created an external connection for LDR stream 1, cluster A (source) to cluster B (destination). Now, create the same for LDR stream 2, cluster B (source) to cluster A (destination).
On cluster B, run:
cockroach encode-uri {user}:{password}@{node IP}:26257 --ca-cert {path to CA certificate} --inline
The connection string output contains cluster B's certificate:
{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A
On cluster A, create an external connection using cluster B's connection string (source in LDR stream 2). Prefix the postgresql:// scheme to the connection string and replace {source} with your external connection name:
CREATE EXTERNAL CONNECTION {source} AS 'postgresql://{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A';
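To keep the two directions straight, consider naming each external connection after the cluster it points to. For example, a sketch assuming hypothetical connection names cluster_a and cluster_b (each connection string is the encode-uri output from the named cluster):
-- On cluster B, pointing at cluster A (source for LDR stream 1):
CREATE EXTERNAL CONNECTION cluster_a AS 'postgresql://{cluster A connection string}';
-- On cluster A, pointing at cluster B (source for LDR stream 2):
CREATE EXTERNAL CONNECTION cluster_b AS 'postgresql://{cluster B connection string}';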
Step 3. Start LDR
In this step, you'll start the LDR stream(s) from the destination cluster. You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as row-level TTL and user permissions on the destination cluster.
Modes determine how LDR replicates the data to the destination cluster. There are two modes:
- immediate (default): Attempts to replicate the changed row directly into the destination table, without re-running constraint validations. It does not support writing into tables with foreign key constraints.
- validated: Attempts to apply the write in a similar way to a user-run query, which re-runs all constraint validations relevant to the destination table(s). If the change violates foreign key dependencies, unique constraints, or other constraints, the row is put in the dead letter queue (DLQ) instead. Like the SQL layer, validated mode does not recognize deletion tombstones. As a result, an update to the same key from cluster A will successfully apply on cluster B, even if that key was deleted on cluster B before the LDR job streamed the cluster A update to the key.
Syntax
LDR streams can be started using one of the following SQL statements, depending on your requirements:
- New in v25.1: CREATE LOGICALLY REPLICATED: Creates the new table on the destination cluster automatically, and conducts a fast, offline initial scan. CREATE LOGICALLY REPLICATED accepts unidirectional or bidirectional on as an option in order to create either one of the setups automatically. The table cannot contain user-defined types or foreign key dependencies. Follow these steps for setup instructions.
- CREATE LOGICAL REPLICATION STREAM: Starts the LDR stream after you've created the matching table on the destination cluster. If the table contains user-defined types or foreign key dependencies, you must use this syntax. Allows for manual creation of unidirectional or bidirectional LDR. Follow these steps for setup instructions.
Also, for both SQL statements, note:
- You must use the fully qualified table name for the source table and destination table in the statement.
- There are some tradeoffs between enabling one table per LDR job versus multiple tables in one LDR job. Multiple tables in one LDR job can be easier to operate. For example, if you pause and resume the single job, LDR will stop and resume for all the tables. However, the most granular level of observability will be at the job level. One table in one LDR job allows for table-level observability.
CREATE LOGICALLY REPLICATED
Use CREATE LOGICALLY REPLICATED to create either a unidirectional or bidirectional LDR stream automatically:
Unidirectional LDR: Run the following from the destination cluster:
CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional;
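For example, a filled-in sketch assuming a hypothetical movr database with a rides table and an external connection named source:
CREATE LOGICALLY REPLICATED TABLE movr.public.rides FROM TABLE movr.public.rides ON 'external://source' WITH unidirectional;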
Bidirectional LDR: This statement will first create the LDR jobs for the first stream. You must run it from the destination cluster that does not contain the table. Once the offline initial scan completes, the reverse stream will be initialized so that the original destination cluster can send changes to the original source.
Run the following from the destination cluster (i.e., the cluster that does not currently have the table):
CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH bidirectional ON 'external://destination';
You can include multiple tables in the LDR stream for unidirectional or bidirectional setups. Ensure that the table names in the source table list and destination table list are in the same order so that the tables correctly map between the source and destination for replication:
CREATE LOGICALLY REPLICATED TABLE ({database.public.destination_table_name_1}, {database.public.destination_table_name_2}) FROM TABLE ({database.public.source_table_name_1}, {database.public.source_table_name_2}) ON 'external://source' WITH bidirectional ON 'external://destination', label=track_job;
With the LDR streams created, move to Step 4 to manage and monitor the jobs.
CREATE LOGICAL REPLICATION STREAM
Ensure you've created the table on the destination cluster with a matching schema definition to the source cluster table. From the destination cluster, start LDR. Use the fully qualified table name for the source and destination tables:
CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name};
You can change the default mode using the WITH mode = validated syntax.
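For example:
CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name} WITH mode = validated;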
If you would like to add multiple tables to the LDR job, ensure that the table names in the source table list and destination table list are in the same order:
CREATE LOGICAL REPLICATION STREAM FROM TABLES ({database.public.source_table_name_1},{database.public.source_table_name_2},...) ON 'external://{source_external_connection}' INTO TABLES ({database.public.destination_table_name_1},{database.public.destination_table_name_2},...);
(Optional) At this point, you've set up one LDR stream from cluster A as the source to cluster B as the destination. To set up LDR streaming in the opposite direction using CREATE LOGICAL REPLICATION STREAM, run the statement again, but with cluster B as the source and cluster A as the destination.
Step 4. Manage and monitor the LDR jobs
Once LDR has started, an LDR job will run on the destination cluster. You can pause, resume, or cancel the LDR job with its job ID. Use SHOW LOGICAL REPLICATION JOBS to display the LDR job IDs:
SHOW LOGICAL REPLICATION JOBS;
job_id | status | tables | replicated_time
----------------------+---------+---------------------------+------------------
1012877040439033857 | running | {database.public.table} | NULL
(1 row)
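For example, using the job ID from the output above, you can control the job with the standard job statements:
PAUSE JOB 1012877040439033857;
RESUME JOB 1012877040439033857;
CANCEL JOB 1012877040439033857;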
If you're setting up bidirectional LDR, both clusters will have a history retention job and an LDR job running.
DB Console
You'll access the DB Console and monitor the status and metrics for the created LDR jobs. Depending on which cluster you would like to view, follow the instructions for either the source or destination.
You can use the DB Console, the SQL shell, Metrics Export with Prometheus and Datadog, and labels with some LDR metrics to monitor the job.
For a full reference on monitoring LDR, refer to Logical Data Replication Monitoring.
- Access the DB Console at http://{node IP or hostname}:8080 and enter your user's credentials.
- On the source cluster, navigate to the Jobs page to view a list of all jobs. Use the job Type dropdown and select Replication Producer. This will display the history retention job, which runs while the LDR job is active to protect changes to the table from garbage collection until they have been replicated to the destination cluster.
- On the destination cluster, use the job Type dropdown and select Logical Replication Ingestion. This page will display the logical replication stream job. There will be a progress bar in the Status column when LDR is replicating a table with existing data. This progress bar shows the status of the initial scan, which backfills the destination table with the existing data.
- On the destination cluster, click on Metrics in the left-hand navigation menu. Use the Dashboard dropdown to select Logical Data Replication. This page shows graphs for monitoring LDR.
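Outside the DB Console, a sketch of checking LDR metrics on the destination cluster's Prometheus endpoint (assuming the default DB Console HTTP port 8080; the logical_replication metric prefix is an assumption about the exported metric names):
curl http://{node IP or hostname}:8080/_status/vars | grep logical_replication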
What's next
- Manage Logical Data Replication: Manage the DLQ and schema changes for the replicating tables.
- CREATE LOGICAL REPLICATION STREAM
- Data Resilience