Set Up Logical Data Replication

Note:

This feature is in preview and subject to change. To share feedback and/or issues, contact Support.

Logical data replication is only supported in CockroachDB self-hosted clusters.

In this tutorial, you will set up logical data replication (LDR) to stream data from a source table to a destination table between two CockroachDB clusters. Both clusters are active and can serve traffic. You can apply the outlined steps to set up one of the following:

  • Unidirectional LDR from a source table to a destination table (cluster A to cluster B) in one LDR job.
  • Bidirectional LDR for the same table from cluster A to cluster B and from cluster B to cluster A. In a bidirectional setup, each cluster operates as both a source and a destination in separate LDR jobs.

New in v25.1: Create the new table on the destination cluster automatically and conduct a fast, offline initial scan with the CREATE LOGICALLY REPLICATED syntax. CREATE LOGICALLY REPLICATED accepts unidirectional or bidirectional on as an option to create either setup automatically. Step 3 outlines when to use the CREATE LOGICALLY REPLICATED or the CREATE LOGICAL REPLICATION STREAM syntax to start LDR.

In the following diagram, LDR stream 1 creates a unidirectional LDR setup; introducing LDR stream 2 extends the setup to bidirectional LDR.

Diagram showing bidirectional LDR from cluster A to B and back again from cluster B to A.

For more details on use cases, refer to the Logical Data Replication Overview.

Tutorial overview

If you're setting up bidirectional LDR, both clusters will act as a source and a destination in the respective LDR jobs. The high-level steps for setting up bidirectional or unidirectional LDR are:

  1. Prepare the clusters with the required settings, users, and privileges according to the LDR setup.
  2. Set up external connection(s) on the destination to hold the connection URI for the source.
  3. Start LDR from the destination cluster with your required modes and syntax.
  4. Check the status of the LDR job in the DB Console.

Before you begin

You'll need:

  • Two separate v25.1 CockroachDB self-hosted clusters with connectivity between every node in both clusters. That is, all nodes in cluster A must be able to contact each node in cluster B and vice versa. The SQL advertised address should be the cluster node advertise address so that the LDR job can plan node-to-node connections between clusters for maximum performance.
  • LDR replicates at the table level, which means clusters can contain other tables that are not part of the LDR job. (For CREATE LOGICAL REPLICATION STREAM only): If both clusters are empty, create the tables that you need to replicate with identical schema definitions (excluding indexes) on both clusters, as in the example after this list. If one cluster already has an existing table that you'll replicate, ensure the other cluster's table definition matches. For more details on the supported schemas, refer to Schema Validation.
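
For example, a minimal sketch of a table created identically on both clusters before starting LDR with CREATE LOGICAL REPLICATION STREAM (the table and column names here are hypothetical):

CREATE TABLE {database}.public.orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    status STRING NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);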

Schema validation

Before you start LDR, you must ensure that all column names, types, constraints, and unique indexes on the destination table match the source table.

You cannot use LDR on a table with a schema that contains certain features, and the CREATE LOGICALLY REPLICATED syntax has additional schema restrictions. For the complete lists, refer to the LDR Known limitations.

When you run LDR in immediate mode, you cannot replicate a table with foreign key constraints. In validated mode, foreign key constraints must match. All constraints are enforced at the time of SQL/application write.
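
For example, a minimal sketch for validated mode: run the same DDL on both clusters so the foreign key constraints match before starting LDR (the table and column names here are hypothetical):

CREATE TABLE {database}.public.users (
    id UUID PRIMARY KEY,
    name STRING
);

CREATE TABLE {database}.public.rides (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL REFERENCES {database}.public.users (id)
);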

Step 1. Prepare the cluster

In this step, you'll configure the required settings, users, and privileges for LDR.

Note:

If you are setting up bidirectional LDR, you must run this step on both clusters.

  1. Enter the SQL shell for both clusters in separate terminal windows:

    cockroach sql --url "postgresql://root@{node IP or hostname}:26257?sslmode=verify-full" --certs-dir=certs
    
  2. Enable the kv.rangefeed.enabled cluster setting on the source cluster:

    SET CLUSTER SETTING kv.rangefeed.enabled = true;
    
  3. On the destination, create a user with the REPLICATION system privilege who will start the LDR job:

    CREATE USER {your username} WITH PASSWORD '{your password}';
    
    GRANT SYSTEM REPLICATION TO {your username};
    

    To change the password later, refer to ALTER USER.
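
    For example, a one-line sketch ({new password} is a placeholder):

    ALTER USER {your username} WITH PASSWORD '{new password}';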

Step 2. Connect from the destination to the source

In this step, you'll set up external connection(s) to store the connection string for one or both clusters. Depending on how you manage certificates, you must ensure that all nodes in each cluster have access to the other cluster's certificate.

You can use the cockroach encode-uri command to generate a connection string containing a cluster's certificate.

  1. On the source cluster in a new terminal window, generate a connection string by passing the replication user, node IP, and port, along with the path to the source cluster's CA certificate:

    cockroach encode-uri {user}:{password}@{node IP}:26257 --ca-cert {path to CA certificate} --inline
    

    The connection string output contains the source cluster's certificate:

    {user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A
    
  2. In the SQL shell on the destination cluster, create an external connection using the source cluster's connection string. Prefix the postgresql:// scheme to the connection string and replace {source} with your external connection name:

    CREATE EXTERNAL CONNECTION {source} AS 'postgresql://{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A';
    

(Optional) Bidirectional: Create the connection for LDR stream 2

(Optional) For bidirectional LDR, repeat generating the connection string and creating the external connection for the opposite cluster, because both clusters will act as a source and a destination. At this point, you've created an external connection for LDR stream 1, from cluster A (source) to cluster B (destination). Now, create the same for LDR stream 2, from cluster B (source) to cluster A (destination).

  1. On cluster B, run:

    cockroach encode-uri {user}:{password}@{node IP}:26257 --ca-cert {path to CA certificate} --inline
    

    The connection string output contains the source cluster's certificate:

    {user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A
    
  2. On cluster A, create an external connection using cluster B's connection string (source in LDR stream 2). Prefix the postgresql:// scheme to the connection string and replace {source} with your external connection name:

    CREATE EXTERNAL CONNECTION {source} AS 'postgresql://{user}:{password}@{node IP}:26257?options=-ccluster%3Dsystem&sslinline=true&sslmode=verify-full&sslrootcert=-----BEGIN+CERTIFICATE-----{encoded certificate}-----END+CERTIFICATE-----%0A';
    

Step 3. Start LDR

In this step, you'll start the LDR stream(s) from the destination cluster. You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as row-level TTL and user permissions on the destination cluster.
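
For example, a sketch of manually applying such configurations on the destination cluster (the TTL value here is a placeholder):

ALTER TABLE {database.public.destination_table_name} SET (ttl_expire_after = '30 days');
GRANT SELECT ON TABLE {database.public.destination_table_name} TO {your username};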

Modes determine how LDR replicates the data to the destination cluster. There are two modes:

  • immediate (default): Attempts to replicate the changed row directly into the destination table, without re-running constraint validations. It does not support writing into tables with foreign key constraints.
  • validated: Attempts to apply the write in a similar way to a user-run query, which would re-run all constraint validations relevant to the destination table(s). If the change violates foreign key dependencies, unique constraints, or other constraints, the row will be put in the dead letter queue (DLQ) instead. Like the SQL layer, validated mode does not recognize deletion tombstones. As a result, an update to the same key from cluster A will successfully apply on cluster B, even if that key was deleted on cluster B before the LDR job streamed the cluster A update to the key.

Syntax

LDR streams can be started using one of the following SQL statements, depending on your requirements:

  • New in v25.1: CREATE LOGICALLY REPLICATED: Creates the new table on the destination cluster automatically, and conducts a fast, offline initial scan. CREATE LOGICALLY REPLICATED accepts unidirectional or bidirectional on as an option to create either setup automatically. The table cannot contain user-defined types or foreign key dependencies. Follow these steps for setup instructions.
  • CREATE LOGICAL REPLICATION STREAM: Starts the LDR stream after you've created the matching table on the destination cluster. If the table contains user-defined types or foreign key dependencies, you must use this syntax. Allows for manual creation of unidirectional or bidirectional LDR. Follow these steps for setup instructions.

Also, for both SQL statements, note:

  • You must use the fully qualified table names for the source and destination tables in the statement.
  • There are some tradeoffs between replicating one table per LDR job versus multiple tables in one LDR job. Multiple tables in one LDR job can be easier to operate. For example, if you pause and resume the single job, LDR will stop and resume for all the tables. However, the most granular level of observability will be at the job level. One table per LDR job allows for table-level observability.

CREATE LOGICALLY REPLICATED

Use CREATE LOGICALLY REPLICATED to create either a unidirectional or bidirectional LDR stream automatically:

  • Unidirectional LDR: Run the following from the destination cluster:

    CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional;
    
  • Bidirectional LDR: This statement first creates the LDR job for the first stream. You must run it from the destination cluster, that is, the cluster that does not yet contain the table. Once the offline initial scan completes, the reverse stream will be initialized so that the original destination cluster can send changes to the original source.

    Run the following from the destination cluster (the cluster that does not currently have the table):

    CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH bidirectional ON 'external://destination';
    

You can include multiple tables in the LDR stream for unidirectional or bidirectional setups. Ensure that the table names in the source table list and destination table list are in the same order so that the tables correctly map between the source and destination for replication:

CREATE LOGICALLY REPLICATED TABLE ({database.public.destination_table_name_1}, {database.public.destination_table_name_2}) FROM TABLE ({database.public.source_table_name_1}, {database.public.source_table_name_2}) ON 'external://source' WITH bidirectional ON 'external://destination', label=track_job;

With the LDR streams created, move to Step 4 to manage and monitor the jobs.

CREATE LOGICAL REPLICATION STREAM

Ensure you've created the table on the destination cluster with a schema definition that matches the source cluster's table. From the destination cluster, start LDR. Use the fully qualified table names for the source and destination tables:

CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name};

You can change the default mode using the WITH mode = validated syntax.
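
For example:

CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name} WITH mode = validated;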

If you would like to add multiple tables to the LDR job, ensure that the table names in the source table list and destination table list are in the same order:

CREATE LOGICAL REPLICATION STREAM FROM TABLES ({database.public.source_table_name_1},{database.public.source_table_name_2},...) ON 'external://{source_external_connection}' INTO TABLES ({database.public.destination_table_name_1},{database.public.destination_table_name_2},...);

(Optional) At this point, you've set up one LDR stream with cluster A as the source and cluster B as the destination. To set up LDR streaming in the opposite direction using CREATE LOGICAL REPLICATION STREAM, run the statement again, this time with cluster B as the source and cluster A as the destination.
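
For example, run the following from cluster A, using the external connection you created for LDR stream 2 (the connection name here is a placeholder):

CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{cluster_b_external_connection}' INTO TABLE {database.public.destination_table_name};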

Step 4. Manage and monitor the LDR jobs

Once LDR has started, an LDR job will run on the destination cluster. You can pause, resume, or cancel the LDR job with the job ID. Use SHOW LOGICAL REPLICATION JOBS to display the LDR job IDs:

SHOW LOGICAL REPLICATION JOBS;
        job_id        | status  |          tables           | replicated_time
----------------------+---------+---------------------------+------------------
1012877040439033857   | running | {database.public.table}   | NULL
(1 row)
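
For example, to pause, resume, or cancel the job using the job ID from the output above:

PAUSE JOB 1012877040439033857;
RESUME JOB 1012877040439033857;
CANCEL JOB 1012877040439033857;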

If you're setting up bidirectional LDR, both clusters will have a history retention job and an LDR job running.

DB Console

Access the DB Console to monitor the status and metrics of the LDR jobs you created. Depending on which cluster you would like to view, follow the instructions for either the source or the destination.

Tip:

You can use the DB Console, the SQL shell, Metrics Export with Prometheus and Datadog, and labels with some LDR metrics to monitor the job.

For a full reference on monitoring LDR, refer to Logical Data Replication Monitoring.
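
For example, if you started the LDR job with the label option, a sketch of enabling child metrics so that labeled LDR metrics are exported (refer to the monitoring page for the full configuration):

SET CLUSTER SETTING server.child_metrics.enabled = true;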

  1. Access the DB Console at http://{node IP or hostname}:8080 and enter your user's credentials.
  2. On the source cluster, navigate to the Jobs page to view a list of all jobs. Use the job Type dropdown and select Replication Producer. This displays the history retention job, which runs while the LDR job is active to protect changes to the table from garbage collection until they have been replicated to the destination cluster.
  3. On the destination cluster, use the job Type dropdown and select Logical Replication Ingestion. This page will display the logical replication stream job. There will be a progress bar in the Status column when LDR is replicating a table with existing data. This progress bar shows the status of the initial scan, which backfills the destination table with the existing data.
  4. On the destination cluster, click on Metrics in the left-hand navigation menu. Use the Dashboard dropdown to select Logical Data Replication. This page shows graphs for monitoring LDR.

What's next

