On top of CockroachDB's built-in automation, you can use a third-party orchestration system to simplify and automate even more of your operations, from deployment to scaling to overall cluster management.
This page walks you through a simple demonstration, using the open-source Kubernetes orchestration system. Using either the CockroachDB Helm chart or a few configuration files, you'll quickly create a 3-node local cluster. You'll run some SQL commands against the cluster and then simulate node failure, watching how Kubernetes automatically restarts the failed node without any manual intervention. You'll then scale the cluster with a single command before shutting the cluster down, again with a single command.
To orchestrate a physically distributed cluster in production, see Orchestrated Deployments. To deploy a free CockroachDB Cloud cluster instead of running CockroachDB yourself, see the Quickstart.
Before you begin
Before getting started, it's helpful to review some Kubernetes-specific terminology:
Feature | Description |
---|---|
minikube | This is the tool you'll use to run a Kubernetes cluster inside a VM on your local workstation. |
pod | A pod is a group of one or more Docker containers. In this tutorial, all pods will run on your local workstation, each containing one Docker container running a single CockroachDB node. You'll start with 3 pods and grow to 4. |
StatefulSet | A StatefulSet is a group of pods treated as stateful units, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart. StatefulSets are considered stable as of Kubernetes version 1.9 after reaching beta in version 1.5. |
persistent volume | A persistent volume is a piece of storage mounted into a pod. The lifetime of a persistent volume is decoupled from the lifetime of the pod that's using it, ensuring that each CockroachDB node binds back to the same storage on restart. When using minikube, persistent volumes are external temporary directories that endure until they are manually deleted or until the entire Kubernetes cluster is deleted. |
persistent volume claim | When pods are created (one per CockroachDB node), each pod will request a persistent volume claim to “claim” durable storage for its node. |
Step 1. Start Kubernetes
Follow Kubernetes' documentation to install minikube, the tool used to run Kubernetes locally, for your OS. This includes installing a hypervisor and kubectl, the command-line tool used to manage Kubernetes from your local workstation.
Note: Make sure you install minikube version 0.21.0 or later. Earlier versions do not include a Kubernetes server that supports the maxUnavailable field and PodDisruptionBudget resource type used in the CockroachDB StatefulSet configuration.
Start a local Kubernetes cluster:
$ minikube start
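The version requirement in the note above can be checked with a small helper before starting the cluster. This is a minimal sketch in POSIX shell: version_ge is a hypothetical helper name (not part of minikube or kubectl), and the installed value is a placeholder for whatever minikube version prints on your machine.

```shell
# version_ge A B: succeed when dotted numeric version A is >= version B.
# Hypothetical helper for illustration; not part of minikube or kubectl.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}
    y=${b%%.*}
    x=${x:-0}
    y=${y:-0}
    if [ "$x" -gt "$y" ]; then return 0; fi
    if [ "$x" -lt "$y" ]; then return 1; fi
    # Drop the field just compared; clear the string when no dot remains.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Placeholder: replace with the version reported by "minikube version",
# without the leading "v".
installed=0.21.0
if version_ge "$installed" 0.21.0; then
  echo "minikube $installed is new enough"
else
  echo "minikube $installed is too old; upgrade to 0.21.0 or later"
fi
```

The comparison is field by field and numeric, so 0.9.0 correctly sorts below 0.21.0, which a plain string comparison would get wrong.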
Step 2. Start CockroachDB
Choose a way to deploy and maintain the CockroachDB cluster:
- CockroachDB Kubernetes Operator (recommended)
- Helm package manager
- Manually apply our StatefulSet configuration and related files
The Operator is currently supported for GKE only.
Install the Operator
Apply the CustomResourceDefinition (CRD) for the Operator:
$ kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.16.0/install/crds.yaml
customresourcedefinition.apiextensions.k8s.io/crdbclusters.crdb.cockroachlabs.com created
Apply the Operator manifest:
$ kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.16.0/install/operator.yaml
clusterrole.rbac.authorization.k8s.io/cockroach-operator-role created
serviceaccount/cockroach-operator-default created
clusterrolebinding.rbac.authorization.k8s.io/cockroach-operator-default created
deployment.apps/cockroach-operator created
Validate that the Operator is running:
$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cockroach-operator-6f7b86ffc4-9ppkv   1/1     Running   0          54s
Configure the cluster
On a production cluster, you will need to modify the StatefulSet configuration with values that are appropriate for your workload.
Download and edit example.yaml, which tells the Operator how to configure the Kubernetes cluster.
$ curl -O https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.16.0/examples/example.yaml
$ vi example.yaml
Allocate CPU and memory resources to CockroachDB on each pod. Enable the commented-out lines in example.yaml and substitute values that are appropriate for your workload. For more context on provisioning CPU and memory, see the Production Checklist.
Tip: Resource requests and limits should have identical values.
resources:
  requests:
    cpu: "2"
    memory: "8Gi"
  limits:
    cpu: "2"
    memory: "8Gi"
Note: If no resource limits are specified, the pods will be able to consume the maximum available CPUs and memory. However, to avoid overallocating resources when another memory-intensive workload is on the same instance, always set resource requests and limits explicitly.
Modify resources.requests.storage to allocate the appropriate amount of disk storage for your workload. This configuration defaults to 60Gi of disk space per pod. For more context on provisioning storage, see the Production Checklist.
resources:
  requests:
    storage: "60Gi"
Initialize the cluster
By default, the Operator will generate and sign 1 client and 1 node certificate to secure the cluster. To authenticate using your own CA, see Operate CockroachDB on Kubernetes.
Apply example.yaml:
$ kubectl apply -f example.yaml
The Operator will create a StatefulSet and initialize the nodes as a cluster.
crdbcluster.crdb.cockroachlabs.com/cockroachdb created
Check that the pods were created:
$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cockroach-operator-6f7b86ffc4-9t9zb   1/1     Running   0          3m22s
cockroachdb-0                         1/1     Running   0          2m31s
cockroachdb-1                         1/1     Running   1          102s
cockroachdb-2                         1/1     Running   0          46s
Each pod should have READY status soon after being created.
Set up configuration file
Download and modify our StatefulSet configuration:
$ curl -O https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/bring-your-own-certs/cockroachdb-statefulset.yaml
Allocate CPU and memory resources to CockroachDB on each pod. These settings should be appropriate for your workload. For more context on provisioning CPU and memory, see the Production Checklist.
Tip: Resource requests and limits should have identical values.
resources:
  requests:
    cpu: "2"
    memory: "8Gi"
  limits:
    cpu: "2"
    memory: "8Gi"
Note: If no resource limits are specified, the pods will be able to consume the maximum available CPUs and memory. However, to avoid overallocating resources when another memory-intensive workload is on the same instance, always set resource requests and limits explicitly.
In the volumeClaimTemplates specification, you may want to modify resources.requests.storage for your use case. This configuration defaults to 100Gi of disk space per pod. For more details on customizing disks for performance, see these instructions.
resources:
  requests:
    storage: "100Gi"
Initialize the cluster
The steps below use cockroach cert commands to quickly generate and sign the CockroachDB node and client certificates. If you use a different method of generating certificates, make sure to update secret.secretName in the StatefulSet configuration with the name of your node secret.
Create two directories:
$ mkdir certs my-safe-directory
Directory | Description |
---|---|
certs | You'll generate your CA certificate and all node and client certificates and keys in this directory. |
my-safe-directory | You'll generate your CA key in this directory and then reference the key when generating node and client certificates. |
Create the CA certificate and key pair:
$ cockroach cert create-ca --certs-dir=certs --ca-key=my-safe-directory/ca.key
Create a client certificate and key pair for the root user:
$ cockroach cert create-client root --certs-dir=certs --ca-key=my-safe-directory/ca.key
Upload the client certificate and key to the Kubernetes cluster as a secret:
$ kubectl create secret generic cockroachdb.client.root --from-file=certs
secret/cockroachdb.client.root created
Create the certificate and key pair for your CockroachDB nodes:
$ cockroach cert create-node \
  localhost 127.0.0.1 \
  cockroachdb-public \
  cockroachdb-public.default \
  cockroachdb-public.default.svc.cluster.local \
  *.cockroachdb \
  *.cockroachdb.default \
  *.cockroachdb.default.svc.cluster.local \
  --certs-dir=certs \
  --ca-key=my-safe-directory/ca.key
Upload the node certificate and key to the Kubernetes cluster as a secret:
$ kubectl create secret generic cockroachdb.node --from-file=certs
secret/cockroachdb.node created
Check that the secrets were created on the cluster:
$ kubectl get secrets
NAME                      TYPE                                  DATA   AGE
cockroachdb.client.root   Opaque                                3      41m
cockroachdb.node          Opaque                                5      14s
default-token-6qjdb       kubernetes.io/service-account-token   3      4m
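The names passed to cockroach cert create-node above follow a pattern built from the service name (cockroachdb) and the namespace (default). The sketch below prints that list for a given pair; node_cert_hosts is a hypothetical helper, and the idea that the same pattern carries over to other service names or namespaces is an assumption, not something this tutorial states.

```shell
# node_cert_hosts SERVICE NAMESPACE: print, one per line, the names and
# addresses used with "cockroach cert create-node" in this tutorial.
# Hypothetical helper for illustration only.
node_cert_hosts() {
  svc=$1
  ns=$2
  printf '%s\n' \
    localhost \
    127.0.0.1 \
    "$svc-public" \
    "$svc-public.$ns" \
    "$svc-public.$ns.svc.cluster.local" \
    "*.$svc" \
    "*.$svc.$ns" \
    "*.$svc.$ns.svc.cluster.local"
}

# Reproduces the list used in this tutorial.
node_cert_hosts cockroachdb default
```

The wildcard entries are quoted inside the helper so the shell does not try to expand them against files in the current directory.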
Use the config file you downloaded to create the StatefulSet that automatically creates 3 pods, each running a CockroachDB node:
$ kubectl create -f cockroachdb-statefulset.yaml
serviceaccount/cockroachdb created
role.rbac.authorization.k8s.io/cockroachdb created
rolebinding.rbac.authorization.k8s.io/cockroachdb created
service/cockroachdb-public created
service/cockroachdb created
poddisruptionbudget.policy/cockroachdb-budget created
statefulset.apps/cockroachdb created
Initialize the CockroachDB cluster:
Confirm that three pods are Running successfully. Note that they will not be considered Ready until after the cluster has been initialized:
$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-0   0/1     Running   0          2m
cockroachdb-1   0/1     Running   0          2m
cockroachdb-2   0/1     Running   0          2m
Confirm that the persistent volumes and corresponding claims were created successfully for all three pods:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                           STORAGECLASS   REASON   AGE
pvc-9e435563-fb2e-11e9-a65c-42010a8e0fca   100Gi      RWO            Delete           Bound    default/datadir-cockroachdb-0   standard                51m
pvc-9e47d820-fb2e-11e9-a65c-42010a8e0fca   100Gi      RWO            Delete           Bound    default/datadir-cockroachdb-1   standard                51m
pvc-9e4f57f0-fb2e-11e9-a65c-42010a8e0fca   100Gi      RWO            Delete           Bound    default/datadir-cockroachdb-2   standard                51m
Run cockroach init on one of the pods to complete the node startup process and have them join together as a cluster:
$ kubectl exec -it cockroachdb-0 -- /cockroach/cockroach init --certs-dir=/cockroach/cockroach-certs
Cluster successfully initialized
Confirm that cluster initialization has completed successfully. The job should be considered successful and the Kubernetes pods should soon be considered Ready:
$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-0   1/1     Running   0          3m
cockroachdb-1   1/1     Running   0          3m
cockroachdb-2   1/1     Running   0          3m
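Rather than reading the READY column by eye, the check can be scripted. The following is a minimal sketch: all_ready is a hypothetical helper that reads rows in the same layout as the output above (without the header line) and succeeds only when every row shows matching numbers on both sides of the slash.

```shell
# all_ready: read pod rows (NAME READY STATUS RESTARTS AGE, no header) on
# stdin and succeed only if every READY value has the form n/n.
# Hypothetical helper for illustration only.
all_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) bad = 1 } END { exit bad + 0 }'
}

# Sample rows copied from the output above.
printf '%s\n' \
  'cockroachdb-0 1/1 Running 0 3m' \
  'cockroachdb-1 1/1 Running 0 3m' \
  'cockroachdb-2 1/1 Running 0 3m' | all_ready && echo "all pods ready"
```

Against a live cluster, the rows could come from kubectl get pods --no-headers piped into all_ready. Note that empty input also counts as success in this sketch, so it does not by itself prove that three pods exist.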
The CockroachDB Helm chart is undergoing maintenance for compatibility with Kubernetes versions 1.17 through 1.21 (the latest version as of this writing). No new feature development is currently planned. For new production and local deployments, we currently recommend using a manual configuration (Configs option). If you are experiencing issues with a Helm deployment on production, contact our Support team.
Secure CockroachDB deployments on Amazon EKS via Helm are not yet supported.
Install the Helm client (version 3.0 or higher) and add the cockroachdb chart repository:
$ helm repo add cockroachdb https://charts.cockroachdb.com/
"cockroachdb" has been added to your repositories
Update your Helm chart repositories to ensure that you're using the latest CockroachDB chart:
$ helm repo update
On a production cluster, you will need to modify the StatefulSet configuration with values that are appropriate for your workload. Modify our Helm chart's values.yaml parameters:
Create a my-values.yaml file to override the defaults in values.yaml, substituting your own values in this example based on the guidelines below.
Tip: Resource requests and limits should have identical values.
statefulset:
  resources:
    limits:
      cpu: "16"
      memory: "8Gi"
    requests:
      cpu: "16"
      memory: "8Gi"
conf:
  cache: "2Gi"
  max-sql-memory: "2Gi"
tls:
  enabled: true
To avoid running out of memory when CockroachDB is not the only pod on a Kubernetes node, you must set memory limits explicitly. This is because CockroachDB does not detect the amount of memory allocated to its pod when run in Kubernetes. We recommend setting conf.cache and conf.max-sql-memory each to 1/4 of the memory allocation specified in statefulset.resources.requests and statefulset.resources.limits.
Tip: For example, if you are allocating 8Gi of memory to each CockroachDB node, allocate 2Gi to cache and 2Gi to max-sql-memory.
You may want to modify storage.persistentVolume.size and storage.persistentVolume.storageClass for your use case. This chart defaults to 100Gi of disk space per pod. For more details on customizing disks for performance, see these instructions.
For a secure deployment, set tls.enabled to true.
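The 1/4 rule for conf.cache and conf.max-sql-memory is simple arithmetic, sketched below. quarter_of_gi is a hypothetical helper; it assumes the memory allocation is a whole number of Gi that divides evenly by 4, and it refuses other inputs rather than guess at rounding.

```shell
# quarter_of_gi TOTAL_GI: print one quarter of a whole-Gi memory allocation,
# for use as conf.cache and conf.max-sql-memory. Hypothetical helper.
quarter_of_gi() {
  total=$1
  if [ $(( total % 4 )) -ne 0 ]; then
    echo "this sketch only handles allocations that are a multiple of 4Gi" >&2
    return 1
  fi
  echo "$(( total / 4 ))Gi"
}

# With the 8Gi allocation used in the example above:
quarter_of_gi 8   # prints 2Gi
```

The printed value goes into both conf.cache and conf.max-sql-memory in my-values.yaml, leaving the remaining half of the pod's memory for everything else the node needs.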
Install the CockroachDB Helm chart.
Provide a "release" name to identify and track this particular deployment of the chart, and override the default values with those in my-values.yaml.
Note: This tutorial uses my-release as the release name. If you use a different value, be sure to adjust the release name in subsequent commands.
$ helm install my-release --values my-values.yaml cockroachdb/cockroachdb
Behind the scenes, this command uses our cockroachdb-statefulset.yaml file to create the StatefulSet that automatically creates 3 pods, each with a CockroachDB node running inside it, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart.
As each pod is created, it issues a Certificate Signing Request, or CSR, to have the CockroachDB node's certificate signed by the Kubernetes CA. You must manually check and approve each node's certificate, at which point the CockroachDB node is started in the pod.
Get the names of the Pending CSRs:
$ kubectl get csr
NAME                                    AGE   REQUESTOR                                              CONDITION
default.client.root                     21s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-0   15s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-1   16s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-2   15s   system:serviceaccount:default:my-release-cockroachdb   Pending
...
If you do not see a Pending CSR, wait a minute and try again.
Examine the CSR for the first pod:
$ kubectl describe csr default.node.my-release-cockroachdb-0
Name:               default.node.my-release-cockroachdb-0
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Mon, 10 Dec 2018 05:36:35 -0500
Requesting User:    system:serviceaccount:default:my-release-cockroachdb
Status:             Pending
Subject:
  Common Name:    node
  Serial Number:
  Organization:   Cockroach
Subject Alternative Names:
  DNS Names:     localhost
                 my-release-cockroachdb-0.my-release-cockroachdb.default.svc.cluster.local
                 my-release-cockroachdb-0.my-release-cockroachdb
                 my-release-cockroachdb-public
                 my-release-cockroachdb-public.default.svc.cluster.local
  IP Addresses:  127.0.0.1
Events:  <none>
If everything looks correct, approve the CSR for the first pod:
$ kubectl certificate approve default.node.my-release-cockroachdb-0
certificatesigningrequest.certificates.k8s.io/default.node.my-release-cockroachdb-0 approved
Repeat the examine and approve steps for the other 2 pods.
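The examine-then-approve sequence for the remaining pods can be wrapped in a loop. This is a sketch only: node_csr_name and review_and_approve are hypothetical helpers, the CSR naming pattern is inferred from the kubectl get csr output above (default namespace, my-release release name), and the prompt keeps the manual review that this tutorial asks for.

```shell
# node_csr_name RELEASE INDEX: CSR name for a node pod, following the pattern
# seen in the "kubectl get csr" output above. Hypothetical helper.
node_csr_name() {
  printf 'default.node.%s-cockroachdb-%s\n' "$1" "$2"
}

# review_and_approve: describe each remaining node CSR (pods 1 and 2) and
# approve it only after you confirm. Not called automatically.
review_and_approve() {
  for i in 1 2; do
    csr=$(node_csr_name my-release "$i")
    kubectl describe csr "$csr"
    printf 'Approve %s? [y/N] ' "$csr"
    read -r answer
    if [ "$answer" = "y" ]; then
      kubectl certificate approve "$csr"
    fi
  done
}

# Example of the name produced for the second pod:
node_csr_name my-release 1
```

Call review_and_approve from a shell that has kubectl configured for the cluster. Answering anything other than y leaves that CSR pending.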
Confirm that three pods are Running successfully:
$ kubectl get pods
NAME                                READY   STATUS     RESTARTS   AGE
my-release-cockroachdb-0            0/1     Running    0          6m
my-release-cockroachdb-1            0/1     Running    0          6m
my-release-cockroachdb-2            0/1     Running    0          6m
my-release-cockroachdb-init-hxzsc   0/1     Init:0/1   0          6m
Approve the CSR for the one-off pod from which cluster initialization happens:
$ kubectl certificate approve default.client.root
certificatesigningrequest.certificates.k8s.io/default.client.root approved
Confirm that CockroachDB cluster initialization has completed successfully, with the pods for CockroachDB showing 1/1 under READY and the pod for initialization showing Completed under STATUS:
$ kubectl get pods
NAME                                READY   STATUS      RESTARTS   AGE
my-release-cockroachdb-0            1/1     Running     0          8m
my-release-cockroachdb-1            1/1     Running     0          8m
my-release-cockroachdb-2            1/1     Running     0          8m
my-release-cockroachdb-init-hxzsc   0/1     Completed   0          1h
Confirm that the persistent volumes and corresponding claims were created successfully for all three pods:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS   REASON   AGE
pvc-71019b3a-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-0   standard                11m
pvc-7108e172-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-1   standard                11m
pvc-710dcb66-fc67-11e8-a606-080027ba45e5   100Gi      RWO            Delete           Bound    default/datadir-my-release-cockroachdb-2   standard                11m
The StatefulSet configuration sets all CockroachDB nodes to log to stderr, so if you ever need access to a pod/node's logs to troubleshoot, use kubectl logs <podname> rather than checking the log on the persistent volume.
Step 3. Use the built-in SQL client
Get a shell into one of the pods and start the CockroachDB built-in SQL client:
$ kubectl exec -it cockroachdb-2 -- ./cockroach sql --certs-dir cockroach-certs
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v20.2.0 (x86_64-unknown-linux-gnu, built 2020/07/29 22:56:36, go1.13.9) (same version as client)
# Cluster ID: f82abd88-5d44-4493-9558-d6c75a3b80cc
#
# Enter \? for a brief introduction.
#
root@:26257/defaultdb>
Run some basic CockroachDB SQL statements:
> CREATE DATABASE bank;
> CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
> INSERT INTO bank.accounts VALUES (1, 1000.50);
> SELECT * FROM bank.accounts;
  id | balance
+----+---------+
   1 | 1000.50
(1 row)
Create a user with a password:
> CREATE USER roach WITH PASSWORD 'Q7gc8rEdS';
You will need this username and password to access the DB Console later.
Exit the SQL shell and pod:
> \q
To use the built-in SQL client, you need to launch a pod that runs indefinitely with the cockroach binary inside it, get a shell into the pod, and then start the built-in SQL client.
$ kubectl create -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/bring-your-own-certs/client.yaml
pod/cockroachdb-client-secure created
Get a shell into the pod and start the CockroachDB built-in SQL client:
$ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v20.2.0 (x86_64-unknown-linux-gnu, built 2020/07/29 22:56:36, go1.13.9) (same version as client)
# Cluster ID: f82abd88-5d44-4493-9558-d6c75a3b80cc
#
# Enter \? for a brief introduction.
#
root@:26257/defaultdb>
Tip: This pod will continue running indefinitely, so any time you need to reopen the built-in SQL client or run any other cockroach client commands (e.g., cockroach node), repeat the previous step using the appropriate cockroach command.
If you'd prefer to delete the pod and recreate it when needed, run kubectl delete pod cockroachdb-client-secure.
Run some basic CockroachDB SQL statements:
> CREATE DATABASE bank;
> CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
> INSERT INTO bank.accounts VALUES (1, 1000.50);
> SELECT * FROM bank.accounts;
  id | balance
+----+---------+
   1 | 1000.50
(1 row)
Create a user with a password:
> CREATE USER roach WITH PASSWORD 'Q7gc8rEdS';
You will need this username and password to access the DB Console later.
Exit the SQL shell and pod:
> \q
To use the built-in SQL client, you need to launch a pod that runs indefinitely with the cockroach binary inside it, get a shell into the pod, and then start the built-in SQL client.
From your local workstation, use our client-secure.yaml file to launch a pod and keep it running indefinitely.
Download the file:
$ curl -O https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/client-secure.yaml
In the file, change serviceAccountName: cockroachdb to serviceAccountName: my-release-cockroachdb.
Use the file to launch a pod and keep it running indefinitely:
$ kubectl create -f client-secure.yaml
pod "cockroachdb-client-secure" created
Get a shell into the pod and start the CockroachDB built-in SQL client:
$ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=my-release-cockroachdb-public
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v20.2.0 (x86_64-unknown-linux-gnu, built 2020/07/29 22:56:36, go1.13.9) (same version as client)
# Cluster ID: f82abd88-5d44-4493-9558-d6c75a3b80cc
#
# Enter \? for a brief introduction.
#
root@:26257/defaultdb>
Tip: This pod will continue running indefinitely, so any time you need to reopen the built-in SQL client or run any other cockroach client commands (e.g., cockroach node), repeat the previous step using the appropriate cockroach command.
If you'd prefer to delete the pod and recreate it when needed, run kubectl delete pod cockroachdb-client-secure.
Run some basic CockroachDB SQL statements:
> CREATE DATABASE bank;
> CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
> INSERT INTO bank.accounts VALUES (1, 1000.50);
> SELECT * FROM bank.accounts;
  id | balance
+----+---------+
   1 | 1000.50
(1 row)
Create a user with a password:
> CREATE USER roach WITH PASSWORD 'Q7gc8rEdS';
You will need this username and password to access the DB Console later.
Exit the SQL shell and pod:
> \q
Step 4. Access the DB Console
To access the cluster's DB Console:
On secure clusters, certain pages of the DB Console can only be accessed by admin users.
Get a shell into the pod and start the CockroachDB built-in SQL client, using the command that matches how you deployed the cluster.
If you used the Operator:
$ kubectl exec -it cockroachdb-2 -- ./cockroach sql --certs-dir cockroach-certs
If you applied the StatefulSet configuration manually:
$ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
If you used Helm:
$ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=my-release-cockroachdb-public
Assign roach to the admin role (you only need to do this once):
> GRANT admin TO roach;
Exit the SQL shell and pod:
> \q
In a new terminal window, port-forward from your local machine to the cockroachdb-public service.
If you used the Operator or applied the StatefulSet configuration manually:
$ kubectl port-forward service/cockroachdb-public 8080
If you used Helm:
$ kubectl port-forward service/my-release-cockroachdb-public 8080
Forwarding from 127.0.0.1:8080 -> 8080
Note: The port-forward command must be run on the same machine as the web browser in which you want to view the DB Console. If you have been running these commands from a cloud instance or other non-local shell, you will not be able to view the UI without configuring kubectl locally and running the above port-forward command on your local machine.
Go to https://localhost:8080 and log in with the username and password you created earlier.
Note: If you are using Google Chrome, and you are getting an error about not being able to reach localhost because its certificate has been revoked, go to chrome://flags/#allow-insecure-localhost, enable "Allow invalid certificates for resources loaded from localhost", and then restart the browser. Enabling this Chrome feature degrades security for all sites running on localhost, not just CockroachDB's DB Console, so be sure to enable the feature only temporarily.
In the UI, verify that the cluster is running as expected:
- View the Node List to ensure that all nodes successfully joined the cluster.
- Click the Databases tab on the left to verify that bank is listed.
Step 5. Simulate node failure
Based on the replicas: 3 line in the StatefulSet configuration, Kubernetes ensures that three pods/nodes are running at all times. When a pod/node fails, Kubernetes automatically creates another pod/node with the same network identity and persistent storage.
To see this in action:
Terminate one of the CockroachDB nodes:
If you used the Operator or applied the StatefulSet configuration manually:
$ kubectl delete pod cockroachdb-2
pod "cockroachdb-2" deleted
If you used Helm:
$ kubectl delete pod my-release-cockroachdb-2
pod "my-release-cockroachdb-2" deleted
In the DB Console, the Cluster Overview will soon show one node as Suspect. As Kubernetes auto-restarts the node, watch how the node once again becomes healthy.
Back in the terminal, verify that the pod was automatically restarted:
If you used the Operator or applied the StatefulSet configuration manually:
$ kubectl get pod cockroachdb-2
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-2   1/1     Running   0          12s
If you used Helm:
$ kubectl get pod my-release-cockroachdb-2
NAME                       READY   STATUS    RESTARTS   AGE
my-release-cockroachdb-2   1/1     Running   0          44s
Step 6. Add nodes
Your Kubernetes cluster includes 3 worker nodes, or instances, that can run pods. A CockroachDB node runs in each pod. As recommended in our production best practices, you should ensure that two pods are not placed on the same worker node.
Open and edit example.yaml.
$ vi example.yaml
In example.yaml, update the number of nodes:
nodes: 4
Note: You must scale by updating the nodes value in the Operator configuration. Using kubectl scale statefulset <cluster-name> --replicas=4 will result in new pods immediately being terminated.
Apply example.yaml with the new configuration:
$ kubectl apply -f example.yaml
Verify that the new pod started successfully:
$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-0   1/1     Running   0          51m
cockroachdb-1   1/1     Running   0          47m
cockroachdb-2   1/1     Running   0          3m
cockroachdb-3   1/1     Running   0          1m
...
Back in the DB Console, view the Node List to ensure that the fourth node successfully joined the cluster.
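If you script the Operator scaling steps above instead of editing in vi, the nodes value can be rewritten with sed. This is a minimal sketch: set_nodes is a hypothetical helper, and it assumes the nodes: key sits on its own line and appears only once in example.yaml, since it rewrites every line that matches.

```shell
# set_nodes COUNT: read YAML on stdin and rewrite the value on any line whose
# key is "nodes:", keeping the original indentation. Hypothetical helper.
set_nodes() {
  sed "s/^\([[:space:]]*nodes:\).*/\1 $1/"
}

# Demonstration on a two-line fragment; prints "spec:" then "  nodes: 4".
printf 'spec:\n  nodes: 3\n' | set_nodes 4
```

A cautious way to use it is to write the result to a new file (for example, set_nodes 4 < example.yaml > example.new.yaml), compare the two files, and only then apply the new one with kubectl apply -f.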
On a production deployment, first add a worker node, bringing the total from 3 to 4:
- On GKE, resize your cluster.
- On EKS, resize your Worker Node Group.
- On GCE, resize your Managed Instance Group.
- On AWS, resize your Auto Scaling Group.
Edit your StatefulSet configuration to add another pod for the new CockroachDB node:
$ kubectl scale statefulset cockroachdb --replicas=4
statefulset.apps/cockroachdb scaled
Verify that the new pod started successfully:
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
cockroachdb-0               1/1     Running   0          51m
cockroachdb-1               1/1     Running   0          47m
cockroachdb-2               1/1     Running   0          3m
cockroachdb-3               1/1     Running   0          1m
cockroachdb-client-secure   1/1     Running   0          15m
...
Back in the DB Console, view the Node List to ensure that the fourth node successfully joined the cluster.
Edit your StatefulSet configuration to add another pod for the new CockroachDB node:
$ helm upgrade my-release cockroachdb/cockroachdb --set statefulset.replicas=4 --reuse-values
Release "my-release" has been upgraded. Happy Helming! LAST DEPLOYED: Tue May 14 14:06:43 2019 NAMESPACE: default STATUS: DEPLOYED RESOURCES: ==> v1beta1/PodDisruptionBudget NAME AGE my-release-cockroachdb-budget 51m ==> v1/Pod(related) NAME READY STATUS RESTARTS AGE my-release-cockroachdb-0 1/1 Running 0 38m my-release-cockroachdb-1 1/1 Running 0 39m my-release-cockroachdb-2 1/1 Running 0 39m my-release-cockroachdb-3 0/1 Pending 0 0s my-release-cockroachdb-init-nwjkh 0/1 Completed 0 39m ...
Get the name of the Pending CSR for the new pod:
$ kubectl get csr
NAME AGE REQUESTOR CONDITION default.client.root 1h system:serviceaccount:default:default Approved,Issued default.node.my-release-cockroachdb-0 1h system:serviceaccount:default:default Approved,Issued default.node.my-release-cockroachdb-1 1h system:serviceaccount:default:default Approved,Issued default.node.my-release-cockroachdb-2 1h system:serviceaccount:default:default Approved,Issued default.node.my-release-cockroachdb-3 2m system:serviceaccount:default:default Pending node-csr-0Xmb4UTVAWMEnUeGbW4KX1oL4XV_LADpkwjrPtQjlZ4 1h kubelet Approved,Issued node-csr-NiN8oDsLhxn0uwLTWa0RWpMUgJYnwcFxB984mwjjYsY 1h kubelet Approved,Issued node-csr-aU78SxyU69pDK57aj6txnevr7X-8M3XgX9mTK0Hso6o 1h kubelet Approved,Issued ...
If you do not see a Pending CSR, wait a minute and try again.
Examine the CSR for the new pod:
$ kubectl describe csr default.node.my-release-cockroachdb-3
Name: default.node.my-release-cockroachdb-3 Labels: <none> Annotations: <none> CreationTimestamp: Thu, 09 Nov 2017 13:39:37 -0500 Requesting User: system:serviceaccount:default:default Status: Pending Subject: Common Name: node Serial Number: Organization: Cockroach Subject Alternative Names: DNS Names: localhost my-release-cockroachdb-1.my-release-cockroachdb.default.svc.cluster.local my-release-cockroachdb-1.my-release-cockroachdb my-release-cockroachdb-public my-release-cockroachdb-public.default.svc.cluster.local IP Addresses: 127.0.0.1 10.48.1.6 Events: <none>
If everything looks correct, approve the CSR for the new pod:
$ kubectl certificate approve default.node.my-release-cockroachdb-3
certificatesigningrequest.certificates.k8s.io/default.node.my-release-cockroachdb-3 approved
Verify that the new pod started successfully:
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
my-release-cockroachdb-0    1/1     Running   0          51m
my-release-cockroachdb-1    1/1     Running   0          47m
my-release-cockroachdb-2    1/1     Running   0          3m
my-release-cockroachdb-3    1/1     Running   0          1m
cockroachdb-client-secure   1/1     Running   0          15m
...
Back in the DB Console, view the Node List to ensure that the fourth node successfully joined the cluster.
Step 7. Remove nodes
Before removing a node from your cluster, you must first decommission the node. This lets a node finish in-flight requests, rejects any new requests, and transfers all range replicas and range leases off the node.
If you remove nodes without first telling CockroachDB to decommission them, you may cause data or even cluster unavailability. For more details about how this works and what to consider before removing nodes, see Decommission Nodes.
Do not scale down to fewer than 3 nodes. This is considered an anti-pattern on CockroachDB and will cause errors.
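A scripted scale-down can enforce the 3-node minimum before anything is changed. The sketch below is illustrative only: check_scale_target and MIN_NODES are hypothetical names, and the check is plain arithmetic that does not talk to the cluster.

```shell
# Minimum node count from the warning above.
MIN_NODES=3

# check_scale_target COUNT: succeed only if COUNT is at least MIN_NODES.
# Hypothetical helper for illustration only.
check_scale_target() {
  if [ "$1" -lt "$MIN_NODES" ]; then
    echo "refusing to scale to $1 nodes; minimum is $MIN_NODES" >&2
    return 1
  fi
  echo "ok to scale to $1 nodes"
}

check_scale_target 3   # prints: ok to scale to 3 nodes
```

Putting a guard like this ahead of the scaling command makes an accidental target of 2 fail fast instead of reaching the cluster.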
Get a shell into one of the pods and use the cockroach node status command to get the internal IDs of nodes:
$ kubectl exec -it cockroachdb-2 -- ./cockroach node status --certs-dir cockroach-certs
id | address | sql_address | build | started_at | updated_at | locality | is_available | is_live -----+-----------------------------------------+-----------------------------------------+---------+----------------------------------+----------------------------------+----------+--------------+---------- 1 | cockroachdb-0.cockroachdb.default:26257 | cockroachdb-0.cockroachdb.default:26257 | v20.1.4 | 2020-10-22 23:02:10.084425+00:00 | 2020-10-27 20:18:22.117115+00:00 | | true | true 2 | cockroachdb-1.cockroachdb.default:26257 | cockroachdb-1.cockroachdb.default:26257 | v20.1.4 | 2020-10-22 23:02:46.533911+00:00 | 2020-10-27 20:18:22.558333+00:00 | | true | true 3 | cockroachdb-2.cockroachdb.default:26257 | cockroachdb-2.cockroachdb.default:26257 | v20.1.4 | 2020-10-26 21:46:38.90803+00:00 | 2020-10-27 20:18:22.601021+00:00 | | true | true 4 | cockroachdb-3.cockroachdb.default:26257 | cockroachdb-3.cockroachdb.default:26257 | v20.1.4 | 2020-10-27 19:54:04.714241+00:00 | 2020-10-27 20:18:22.74559+00:00 | | true | true (4 rows)
Use the cockroach node decommission command to decommission the node with the highest number in its address (in this case, the address including cockroachdb-3):
Note: It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node.
$ kubectl exec -it cockroachdb-3 \
  -- ./cockroach node decommission \
  --self \
  --certs-dir cockroach-certs \
  --host=<address of node to decommission>
You'll then see the decommissioning status print to stderr as it changes:
  id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
   4 |  true   |       73 |        true        |    false
(1 row)
Once the node has been fully decommissioned and stopped, you'll see a confirmation:
id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
4 | true | 0 | true | false
(1 row)
No more data reported on target nodes. Please verify cluster health before removing the nodes.
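If you want to script the wait rather than watch the output, one option is to read the replica count out of the status table. The following is a minimal sketch, assuming the table keeps the column order shown above (id, is_live, replicas, ...); replicas_for_node is a hypothetical helper name, not part of the cockroach CLI.

```shell
# Hypothetical helper: print the most recent replica count reported for a node.
#   $1    : node ID (for example, 4)
#   stdin : decommission status output, possibly containing several tables
# Assumes the third column of each data row is the replica count.
replicas_for_node() {
  awk -F'|' -v want="$1" '
    # Data rows start with a numeric ID in the first column.
    $1 ~ /^ *[0-9]+ *$/ {
      id = $1; gsub(/ /, "", id)
      if (id == want) { r = $3; gsub(/ /, "", r); last = r }
    }
    # Report only the last value seen, since the status is printed repeatedly.
    END { if (last != "") print last }
  '
}

# Usage (the status goes to stderr, so redirect it into the pipe):
#   <decommission command> 2>&1 | replicas_for_node 4
```

A result of 0 corresponds to the confirmation shown above. You should still verify cluster health before reducing the node count.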
Once the node has been decommissioned, open and edit example.yaml.
$ vi example.yaml
In example.yaml, update the number of nodes:
nodes: 3
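If you would rather not open an editor, the same change can be scripted. The following is a minimal sketch, assuming the manifest contains exactly one nodes: key on its own line; set_node_count is a hypothetical helper name, not part of kubectl or the Operator.

```shell
# Hypothetical helper: rewrite the value of the "nodes:" key in a manifest.
#   $1 : path to the manifest (for example, example.yaml)
#   $2 : new node count
# Assumes exactly one line of the form "<indent>nodes: <number>".
set_node_count() {
  manifest="$1"
  count="$2"
  tmp="${manifest}.tmp"
  # Keep the original indentation and replace only the value.
  sed "s/^\([[:space:]]*nodes:\).*/\1 ${count}/" "$manifest" > "$tmp" &&
    mv "$tmp" "$manifest"
}

# Usage: set_node_count example.yaml 3
```

Writing to a temporary file and moving it into place avoids relying on sed -i, whose syntax differs between GNU and BSD sed.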
Apply example.yaml with the new configuration:
$ kubectl apply -f example.yaml
The Operator will remove the node with the highest number in its address (in this case, the address including cockroachdb-3) from the cluster. It will also remove the persistent volume that was mounted to the pod.
Verify that the pod was successfully removed:
$ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
cockroachdb-0   1/1     Running   0          51m
cockroachdb-1   1/1     Running   0          47m
cockroachdb-2   1/1     Running   0          3m
...
Get a shell into the cockroachdb-client-secure pod you created earlier and use the cockroach node status command to get the internal IDs of nodes:
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach node status \
--certs-dir=/cockroach-certs \
--host=cockroachdb-public
id | address | build | started_at | updated_at | is_available | is_live
+----+---------------------------------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+
1 | cockroachdb-0.cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true | true
2 | cockroachdb-2.cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true | true
3 | cockroachdb-1.cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true | true
4 | cockroachdb-3.cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true | true
(4 rows)
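Node IDs do not necessarily follow pod ordinals (in the output above, node 2 runs in pod cockroachdb-2 and node 3 in cockroachdb-1), so it is worth reading the ID from the table rather than guessing it. The following is a minimal sketch, assuming the first column is the node ID and the second column is the address; highest_ordinal_node_id is a hypothetical helper name, not part of the cockroach CLI.

```shell
# Hypothetical helper: print the ID of the node whose pod name has the
# highest ordinal suffix (for example, cockroachdb-3).
#   stdin : output of "cockroach node status"
# Assumes column 1 is the node ID and column 2 is the node address.
highest_ordinal_node_id() {
  awk -F'|' '
    # Data rows start with a numeric ID in the first column.
    $1 ~ /^ *[0-9]+ *$/ {
      id = $1; gsub(/ /, "", id)
      addr = $2; gsub(/ /, "", addr)
      # Keep only the pod name, i.e. everything before the first dot.
      sub(/\..*$/, "", addr)
      # The ordinal is the part after the last hyphen.
      n = split(addr, parts, "-")
      ordinal = parts[n] + 0
      if (best == "" || ordinal > max) { max = ordinal; best = id }
    }
    END { if (best != "") print best }
  '
}

# Usage: <node status command> | highest_ordinal_node_id
```

For the table above, this prints 4, the ID of the node running in cockroachdb-3.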
The pod uses the root client certificate created earlier to initialize the cluster, so there's no CSR approval required.
Note the ID of the node with the highest number in its address (in this case, the address including cockroachdb-3) and use the cockroach node decommission command to decommission it:
Note: It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node.
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach node decommission <node ID> \
--certs-dir=/cockroach-certs \
--host=cockroachdb-public
You'll then see the decommissioning status print to stderr as it changes:
id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
4 | true | 73 | true | false
(1 row)
Once the node has been fully decommissioned and stopped, you'll see a confirmation:
id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
4 | true | 0 | true | false
(1 row)
No more data reported on target nodes. Please verify cluster health before removing the nodes.
Once the node has been decommissioned, scale down your StatefulSet:
$ kubectl scale statefulset cockroachdb --replicas=3
statefulset.apps/cockroachdb scaled
Verify that the pod was successfully removed:
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
cockroachdb-0               1/1     Running   0          51m
cockroachdb-1               1/1     Running   0          47m
cockroachdb-2               1/1     Running   0          3m
cockroachdb-client-secure   1/1     Running   0          15m
...
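The same check can be made in a script instead of by eye. The following is a minimal sketch, assuming the pod name is the first column of the kubectl get pods output, as shown above; pod_listed is a hypothetical helper name, not a kubectl subcommand.

```shell
# Hypothetical helper: succeed if a pod name appears in a pod listing.
#   $1    : pod name to look for (for example, cockroachdb-3)
#   stdin : output of "kubectl get pods"
# Exit status is 0 when the pod is listed and 1 when it is not.
pod_listed() {
  awk -v name="$1" '
    $1 == name { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage:
#   if kubectl get pods | pod_listed cockroachdb-3; then
#     echo "cockroachdb-3 is still present"
#   fi
```

The match is on the exact name, so looking for cockroachdb-3 does not match a pod such as cockroachdb-30.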
Get a shell into the cockroachdb-client-secure pod you created earlier and use the cockroach node status command to get the internal IDs of nodes:
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach node status \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public
id | address | build | started_at | updated_at | is_available | is_live
+----+---------------------------------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+
1 | my-release-cockroachdb-0.my-release-cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true | true
2 | my-release-cockroachdb-2.my-release-cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true | true
3 | my-release-cockroachdb-1.my-release-cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true | true
4 | my-release-cockroachdb-3.my-release-cockroachdb.default.svc.cluster.local:26257 | v20.2.19 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true | true
(4 rows)
The pod uses the root client certificate created earlier to initialize the cluster, so there's no CSR approval required.
Note the ID of the node with the highest number in its address (in this case, the address including cockroachdb-3) and use the cockroach node decommission command to decommission it:
Note: It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node.
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach node decommission <node ID> \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public
You'll then see the decommissioning status print to stderr as it changes:
id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
4 | true | 73 | true | false
(1 row)
Once the node has been fully decommissioned and stopped, you'll see a confirmation:
id | is_live | replicas | is_decommissioning | is_draining
+---+---------+----------+--------------------+-------------+
4 | true | 0 | true | false
(1 row)
No more data reported on target nodes. Please verify cluster health before removing the nodes.
Once the node has been decommissioned, scale down your StatefulSet:
$ helm upgrade \
my-release \
cockroachdb/cockroachdb \
--set statefulset.replicas=3 \
--reuse-values
Verify that the pod was successfully removed:
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
my-release-cockroachdb-0    1/1     Running   0          51m
my-release-cockroachdb-1    1/1     Running   0          47m
my-release-cockroachdb-2    1/1     Running   0          3m
cockroachdb-client-secure   1/1     Running   0          15m
...
You should also remove the persistent volume that was mounted to the pod. Get the persistent volume claims for the volumes:
$ kubectl get pvc
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-my-release-cockroachdb-0   Bound    pvc-75dadd4c-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
datadir-my-release-cockroachdb-1   Bound    pvc-75e143ca-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
datadir-my-release-cockroachdb-2   Bound    pvc-75ef409a-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
datadir-my-release-cockroachdb-3   Bound    pvc-75e561ba-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
Verify that the PVC with the highest number in its name is no longer mounted to a pod:
$ kubectl describe pvc datadir-my-release-cockroachdb-3
Name:          datadir-my-release-cockroachdb-3
...
Mounted By:    <none>
Remove the persistent volume by deleting the PVC:
$ kubectl delete pvc datadir-my-release-cockroachdb-3
persistentvolumeclaim "datadir-my-release-cockroachdb-3" deleted
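If you scale down by more than one node, it can help to list every leftover claim at once. The following is a minimal sketch, assuming claims are named datadir-<StatefulSet name>-<ordinal> as in the output above; orphaned_pvcs is a hypothetical helper name, not a kubectl subcommand. It only prints names and deletes nothing.

```shell
# Hypothetical helper: list PVCs whose ordinal is at or above the replica count.
#   $1    : StatefulSet name (for example, my-release-cockroachdb)
#   $2    : current replica count (for example, 3)
#   stdin : output of "kubectl get pvc"
# Assumes claim names follow the pattern datadir-<StatefulSet name>-<ordinal>.
orphaned_pvcs() {
  awk -v prefix="datadir-$1-" -v replicas="$2" '
    # Consider only claims whose name starts with the expected prefix.
    index($1, prefix) == 1 {
      ordinal = substr($1, length(prefix) + 1)
      # Pods are numbered from 0, so ordinals >= replicas have no pod.
      if ((ordinal ~ /^[0-9]+$/) && (ordinal + 0 >= replicas + 0)) print $1
    }
  '
}

# Usage: kubectl get pvc | orphaned_pvcs my-release-cockroachdb 3
```

Before deleting any claim it reports, confirm with kubectl describe pvc that it is no longer mounted to a pod, as in the steps above.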
Step 8. Stop the cluster
If you plan to restart the cluster, use the minikube stop command. This shuts down the minikube virtual machine but preserves all the resources you created:
$ minikube stop
Stopping local Kubernetes cluster...
Machine stopped.
You can restore the cluster to its previous state with minikube start.
If you do not plan to restart the cluster, use the minikube delete command. This shuts down and deletes the minikube virtual machine and all the resources you created, including persistent volumes:
$ minikube delete
Deleting local Kubernetes cluster...
Machine deleted.
Tip: To retain logs, copy them from each pod's stderr before deleting the cluster and all its resources. To access a pod's standard error stream, run kubectl logs <podname>.
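To copy the logs from several pods in one pass, you could wrap kubectl logs in a small loop. The following is a minimal sketch; save_pod_logs is a hypothetical helper name, and the KUBECTL variable is an assumption added here only so that the command can be swapped out (it is not a kubectl convention).

```shell
# Hypothetical helper: save the logs of each named pod to <dir>/<pod>.log.
#   $1   : output directory (created if missing)
#   $2.. : pod names
# KUBECTL is an assumed override for the kubectl binary; it defaults to kubectl.
save_pod_logs() {
  dir="$1"
  shift
  mkdir -p "$dir" || return 1
  for pod in "$@"; do
    # Stop at the first failure so a missing log is not silently skipped.
    "${KUBECTL:-kubectl}" logs "$pod" > "$dir/$pod.log" || return 1
  done
}

# Usage: save_pod_logs ./cockroach-logs cockroachdb-0 cockroachdb-1 cockroachdb-2
```

Run this before minikube delete, since the pods and their logs are removed along with the virtual machine.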
See also
Explore other core CockroachDB benefits and features:
- Replication & Rebalancing
- Fault Tolerance & Recovery
- Low Latency Multi-Region Deployment
- Serializable Transactions
- Cross-Cloud Migration
- Orchestration
- JSON Support
You might also want to learn how to orchestrate a production deployment of CockroachDB with Kubernetes.