Changefeed Best Practices

This page describes best practices to consider when starting changefeeds on a CockroachDB cluster. We recommend referring to this information while planning your cluster's changefeeds and following the links in each of the sections for more details on a topic.

Plan the number of watched tables for a single changefeed

When creating a changefeed, it's important to consider the number of changefeeds versus the number of tables to include in a single changefeed:

  • Changefeeds each have their own memory overhead, so every running changefeed will increase total memory usage.
  • Creating a single changefeed to watch hundreds of tables can affect its performance by introducing coupling, where the performance of any one target table affects the performance of the entire changefeed. For example, a schema change on any of the watched tables will affect the whole changefeed's performance.

To watch multiple tables, we recommend creating a changefeed with a comma-separated list of tables. However, we do not recommend creating a single changefeed for watching hundreds of tables.
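
For example, a single changefeed watching a small set of related tables might be created as follows (the table names and sink URI below are placeholders for your own):

CREATE CHANGEFEED FOR TABLE orders, order_items, customers INTO 'kafka://localhost:9092' WITH resolved;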

Cockroach Labs recommends monitoring your changefeeds to track retryable errors and protected timestamp usage. Refer to the Monitor and Debug Changefeeds page for more information.

Maintain system resources and running changefeeds

When you are running more than 10 changefeeds on a cluster, it is important to monitor CPU usage. A larger cluster can run more changefeeds concurrently than a smaller cluster with more limited resources.

Note:

We recommend limiting the number of changefeeds per cluster to 80.

To maintain a high number of changefeeds in your cluster:

  • Connect to different nodes to create each changefeed. The node on which you start the changefeed becomes the coordinator node for the changefeed job. The coordinator node acts as an administrator, keeping track of all other nodes during job execution and of the changefeed work as it completes. As a result, this node will use more resources for the changefeed job. For more detail, refer to How does an Enterprise changefeed work?.
  • Consider logically grouping the target tables into one changefeed. When a changefeed pauses, it stops emitting messages for all of its target tables, so grouping tables of related data into a single changefeed may make sense for your workload. However, we do not recommend watching hundreds of tables in a single changefeed. For more detail on protecting data from garbage collection when a changefeed is paused, refer to Garbage collection and changefeeds.
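
To review how many changefeeds are currently running on the cluster and their status, you can list the changefeed jobs:

SHOW CHANGEFEED JOBS;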

Monitor changefeeds

We recommend starting the changefeed with the metrics_label option, which allows you to measure metrics per changefeed. Metrics label information is sent with time-series metrics to the Prometheus endpoint.
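
For example, the following creates a changefeed with a metrics label (a minimal sketch; the label, table name, and sink URI are placeholders, and the server.child_metrics.enabled cluster setting is what enables per-changefeed metrics):

SET CLUSTER SETTING server.child_metrics.enabled = true;
CREATE CHANGEFEED FOR TABLE orders INTO 'kafka://localhost:9092' WITH metrics_label = 'orders_cf';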

For the key areas to monitor when running changefeeds, refer to the Monitor and Debug Changefeeds page.

Manage changefeeds and schema changes

When a schema change is issued that causes a column backfill, it can result in a changefeed emitting duplicate messages for an event. We recommend issuing schema changes outside of explicit transactions to make use of the declarative schema changer, which does not perform column backfill for the schema changes it supports. For more details on schema changes and column backfill generally, refer to the Online Schema Changes page.
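
For example, issuing the schema change as its own statement, outside of an explicit BEGIN ... COMMIT block, allows the declarative schema changer to handle it where supported (the table and column names below are hypothetical):

ALTER TABLE orders ADD COLUMN delivery_note STRING;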

You can also use the schema_change_events and schema_change_policy options to define a schema change type and an associated policy that will modify how the changefeed behaves under the schema change.
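
For example, the following sketch creates a changefeed that stops when a column change occurs on the watched table (the table name and sink URI are placeholders):

CREATE CHANGEFEED FOR TABLE orders INTO 'kafka://localhost:9092' WITH schema_change_events = 'column_changes', schema_change_policy = 'stop';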

Lock the schema on changefeed watched tables

Use the schema_locked storage parameter to disallow schema changes on a watched table, which allows the changefeed to take a fast path that avoids checking for schema changes that could require synchronization between changefeed aggregators. This helps to decrease the latency between a write committing to a table and that change being emitted to the changefeed's sink.

Enabling schema_locked

Enable schema_locked on the watched table with the ALTER TABLE statement:

ALTER TABLE watched_table SET (schema_locked = true);

While schema_locked is enabled on a table, attempted schema changes on the table will be rejected and an error returned. If you need to run a schema change on the locked table, unlock the table with schema_locked = false, complete the schema change, and then lock the table again with schema_locked = true. The changefeed will run as normal while schema_locked = false, but it will not benefit from the performance optimization.

ALTER TABLE watched_table SET (schema_locked = false);
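
After the table is unlocked, run the required schema change and then relock the table. For example (the ADD COLUMN statement below is a hypothetical schema change):

ALTER TABLE watched_table ADD COLUMN delivery_note STRING;
ALTER TABLE watched_table SET (schema_locked = true);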

See also

For details on tuning changefeeds for throughput, durability, and improving latency, refer to the Advanced Changefeed Configuration page.

