Advanced Apigee Hybrid Cassandra Operations - Part 2

Overview

This article is co-authored by @greatdevaks and @sampriyadarshi.

This article is the second part of the Advanced Apigee Hybrid Cassandra Operations Blog Series, which covers the nitty-gritty of Apigee Hybrid Cassandra Lifecycle Operations. We strongly recommend going through Part 1 first. In this article, we dive deeper into the Cassandra database, look at some of its internals, and walk through some common troubleshooting techniques.

Key Terminologies

Before we dive deeper, let’s take a look at some of the key terminologies:

  • Node: A Node is a machine (physical or virtual) that runs the Cassandra software. In the case of Apigee Hybrid, a Node is hosted as a StatefulSet Pod on a Worker Node (1 Pod on 1 Worker Node) of a Kubernetes Cluster (in the apigee-data Node Pool, specifically). Since Cassandra is a replicated database, at least three Nodes are required for a Production setup.
  • Datacenter: A Datacenter is a collection of Cassandra Nodes.
  • Cluster: A Cluster is a group of one or more Datacenters/Regions, whether hosted on-premises or in the cloud.
  • Keyspace: A Keyspace is a Namespace for data in Cassandra.
  • Table: A Table is a collection of data in a Cassandra Keyspace.
  • Row: A Row is a single piece of data in a Cassandra Table.
  • Column: A Column is a field in a Cassandra Row.
  • Column Family: A Column Family is the older (Thrift-era) term for what is now called a Table in Cassandra; conceptually, it is a container of Rows, each holding a collection of Columns.
  • SSTable: An SSTable is a data structure that stores data in Cassandra. SSTables are immutable data files that help persist Cassandra data on disk. SSTables are streamed between the different Nodes; as an effect of this streaming, Cassandra could trigger compactions of the SSTables as well.
  • Replicas: Cassandra guarantees data durability by using the concept of Replicas. Replicas are multiple copies of data stored on different Cassandra Nodes. When using a Multi-Datacenter/Region setup, Replicas may be stored on different Datacenters/Regions.
  • Replication: Replication is the process of copying data to multiple Cassandra Nodes.
  • Availability: Availability is the property of a distributed database that allows it to keep serving reads and writes even when some of its Nodes are down or unreachable.
  • Consistency: Consistency is the property of a database that ensures that all Replicas reflect the same, up-to-date data. Cassandra, being a distributed datastore, makes trade-offs between scalability, availability, and consistency (check out the CAP Theorem) and uses Eventual Consistency. Eventual Consistency guarantees that all updates to data will eventually be seen by all the Replicas; this is in contrast to Strong Consistency (not used by Cassandra), which guarantees that all Replicas see the same data at the same time. Eventual Consistency is achieved by replicating data to multiple Nodes in a Cluster: when a user writes data to a Node, that Node updates its local copy, and the other Nodes in the Cluster then replicate the change to their local copies. This can take some time, but eventually all Nodes in the Cluster end up having the same data. The choice is a trade-off between Consistency and Availability: Strong Consistency guarantees that all users always see the same data, but it limits how much write traffic the database system can handle, whereas Eventual Consistency lets the system handle high levels of write traffic at the cost of users sometimes reading stale data. Eventually Consistent databases prioritize Availability.
  • Gossip: Gossip is a peer-to-peer protocol that Cassandra Nodes use to communicate with each other.
  • Seed Node: A Seed Node is a Cassandra Node that other Cassandra Nodes use to find each other in a Cassandra Cluster.
  • Snitch: A Snitch is a component that determines the datacenter and the rack a Cassandra Node belongs to. Snitches inform Cassandra about the network topology so that requests are routed to the appropriate Cassandra Nodes. Replication Strategies use the information provided by Snitches to place the Replicas.
  • Repair: Repair is the process of synchronizing Replicas across Nodes so that corrupted, stale, or missing data gets fixed.
  • Maintenance: Maintenance is the process of performing regular tasks on a Cassandra Cluster, such as backups and repairs.

Now let’s dive a little deeper into concepts like Replication Factor and Replication Strategies.

Cassandra Internals

Replication Factor

Hardware problems can occur or physical links can go down at any time in a datacenter, and when that happens, data processing operations get affected. To avoid data loss, copies of the data need to be replicated across multiple Nodes. Data replication is generally performed to ensure that there is no single point of failure in the system.

Cassandra, being a distributed database, places Replicas of data on different Nodes.

  • Replication Factor defines the total number of Replicas that can be placed across the different Nodes.
  • Replication Strategy helps determine the placement of Replicas.

A Replication Factor of One (1) means that there is only a single copy of the data in a Datacenter/Region, while a Replication Factor of Three (3) means that there are three copies of the data on three different Nodes in a Datacenter/Region.

For a Production configuration, to ensure that there is no single point of failure, a Replication Factor of Three (3) is recommended. In order for this configuration to work in Apigee Hybrid, the cassandra.replicaCount property should be set to 3 in the Apigee Hybrid overrides.yaml file, as shown in the fragment below.
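
For reference, a minimal overrides.yaml fragment along these lines is all that is needed for the Node count; the rest of the cassandra block (storage, resources, credentials, and so on) is omitted here for brevity.

# fragment of overrides.yaml; other cassandra properties are omitted
cassandra:
  replicaCount: 3   # one Cassandra Pod per Worker Node in the apigee-data Node Pool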

The Replication Factor denotes the number of copies of each row of data that Cassandra stores. The Replication Factor is set for each Cassandra Keyspace, and the default Replication Factor is 3 for all the Cassandra Keyspaces. As a result, it is recommended to scale Cassandra Nodes by a factor of three. Ideally, the Cassandra Nodes should be distributed across three Availability Zones of a Region/Datacenter, so that the data is replicated in all those 3 Availability Zones.

Pro Tip: The Replication Factor should not exceed the number of Nodes in a Cluster. 

Replication Strategies

Every Cassandra Node in a Cluster is assigned one or more token ranges, and together these ranges form a continuous ring. For example, if there are three Cassandra Nodes in a Cluster and the (hypothetical) overall token range is 0-59, the token range assignment for the Cassandra Nodes could look like:

  • Cassandra Node A: 0-19
  • Cassandra Node B: 20-39
  • Cassandra Node C: 40-59

When a write request is made, Cassandra hashes the partition key of the data to get a token value and then places the data on the Cassandra Node whose token range contains that token.
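
As a side note, the token that Cassandra derives for a given partition key can be inspected from cqlsh using the token() function; the Keyspace, Table, and column names below are placeholders used purely for illustration.

-- placeholders for illustration; replace with a real Keyspace, Table, and partition key column
SELECT token(id), id FROM my_keyspace.my_table LIMIT 5;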

When placing data on the Cassandra Nodes, Snitches and Replication Strategies are used.

There are two kinds of Replication Strategies in Cassandra, as explained below.

SimpleStrategy in Cassandra

SimpleStrategy is used when there is just one datacenter with one rack. It relies on SimpleSnitch, which treats all the Cassandra Nodes as part of a single Ring. When data is written, SimpleStrategy places the first Replica on the Cassandra Node whose token range contains the token value; the remaining Replicas (as dictated by the defined Replication Factor) are then placed in a clockwise direction around the Cassandra Ring, i.e. the Cassandra Node with the next higher token range is chosen on each subsequent Replica placement.

NetworkTopologyStrategy

NetworkTopologyStrategy is a more sophisticated, rack-aware replication strategy that tries to avoid placing two Replicas on the same rack within a datacenter. It relies on a topology-aware Snitch (for example, PropertyFileSnitch or GossipingPropertyFileSnitch) which maintains information about which Cassandra Node belongs to which datacenter and rack. It allows the Replication Factor to be configured per datacenter and places Replicas on multiple Nodes so that each datacenter holds its configured number of Replicas of each row.

Apigee Hybrid uses NetworkTopologyStrategy.
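
For illustration only, this is how a Keyspace using NetworkTopologyStrategy with three Replicas per Datacenter would be declared in CQL. The Keyspace and Datacenter names below are made up, and Apigee Hybrid creates and manages its own Keyspaces, so there is no need to run anything like this against a Hybrid Cluster.

-- illustrative only; Apigee Hybrid manages its own Keyspaces
CREATE KEYSPACE example_ks
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc-1': 3, 'dc-2': 3};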

If a SELECT query is triggered against the system_schema Keyspace, it can be seen that the Replication Factor of the Apigee Hybrid Keyspaces is set to 3 and that NetworkTopologyStrategy is in use, as shown in the image below.

[Image: cqlsh output of a SELECT on system_schema.keyspaces showing the Apigee Hybrid Keyspaces with Replication Factor 3 and NetworkTopologyStrategy]
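
For reference, the query behind that output is along the following lines; it can be run from cqlsh, and the steps to connect with cqlsh are covered later in this article.

SELECT keyspace_name, replication FROM system_schema.keyspaces;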

Cassandra Troubleshooting

It is essential to ensure that Apigee Hybrid Cassandra runs smoothly and efficiently all the time. Apigee Hybrid Cassandra, being a complex component, may encounter issues. Identifying the root causes for such issues can be a complex and time-consuming process.

Now that we understand how data gets replicated in Cassandra, let’s check out how to troubleshoot some of the common issues in Cassandra. 

When troubleshooting a Cassandra issue, it is important to have a clear understanding of the Cluster's configuration and how it is being used. It is also important to have a good understanding of the Cassandra logs, which can provide valuable information about the cause of the issue.

The following are some of the most commonly observed issues in Apigee Hybrid Cassandra:

  • Node Failures: Cassandra Nodes can fail for a variety of reasons, including hardware failures, software bugs, and network issues. When a Cassandra Node fails, it can impact the availability and performance of the entire Cluster.
  • Data Corruption: Data corruption can occur due to hardware failures, software bugs, network issues, and human error. Data corruption can cause Cassandra to return incorrect results or even crash. Because of data corruption, the Apigee Hybrid runtime and synchronizer pods may also not work properly, leading to an outage.
  • Performance Issues: Cassandra performance can be impacted by a number of factors, including the number of Nodes in the Cluster, the amount of data stored, and the type of queries being run. Performance issues can cause Cassandra to become unresponsive or slow.

The sub-sections below describe, with the help of examples, some of the most common troubleshooting techniques which can help troubleshoot Cassandra issues.

Hypothetical Scenario/Issue 1

Use-case reference: Official Documentation

Let’s assume that you had a dual-region/datacenter Apigee Hybrid setup which you scaled down to a single region/datacenter due to some business or technical reason. Now you want a dual-region/datacenter setup again, in order to establish proper HA/DR for Apigee Hybrid, and you are expanding your setup from a single region/datacenter to multiple regions/datacenters. While performing the region/datacenter expansion, you see that the Cassandra Pods in the new region/datacenter are not coming up and are stuck in the CrashLoopBackOff state. You want to find the root cause of the issue and fix it as soon as possible.
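
The symptom can be confirmed with a quick Pod listing; the label selector below is the one Apigee Hybrid applies to its Cassandra Pods.

# confirm the state of the Cassandra Pods and the Worker Nodes they are scheduled on
kubectl get pods -n apigee -l app=apigee-cassandra -o wide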

How would you proceed? Check out the troubleshooting flow described below.

Checking Cassandra Error Logs

The following command can be used to check the Cassandra logs in case the Cassandra Pods are not in a healthy state or issues with Cassandra are suspected.

kubectl logs -n apigee -l app=apigee-cassandra -f

After running the above-mentioned command, say it turns out that the Cassandra Pods are reporting the following error.

Exception (java.lang.RuntimeException) encountered during startup:
A node with address 10.52.18.40 already exists, cancelling join.
Use cassandra.replace_address if you want to replace this node.

The next step should be to get access to the Cassandra Cluster's configuration and perform further debugging.

For collecting the logs from all the Namespaces of the Apigee Hybrid Runtime Cluster, the following command can be run.

kubectl cluster-info dump --output-directory logs_<directory_name> --all-namespaces --output yaml

Use Cassandra nodetool Utility

nodetool is a very useful utility which comes bundled with the Cassandra installation. It helps identify issues at the Cassandra Node level and gives a lot of insight into the state of the Cassandra process itself.

Some of the most commonly used nodetool sub-commands, from an Apigee Hybrid Cassandra perspective, are stated below; an example of running one of them against a single Pod follows the list.

  • nodetool describecluster: Prints the name, snitch, partitioner, and schema version of a Cassandra Cluster.
  • nodetool gcstats: Prints the GC Statistics.
  • nodetool netstats: Prints the network information on the provided host.
  • nodetool proxyhistograms: Prints the statistic histograms for network operations.
  • nodetool rebuild: Rebuilds data by streaming from other Nodes.
  • nodetool removenode: Removes the specified Node from the Cassandra Cluster.
  • nodetool repair: Repairs one or more tables.
  • nodetool status: Prints the status of the Nodes in the Cluster (up/down state, load, ownership, and Host IDs), grouped by Datacenter.
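
As an illustration, any of these sub-commands can be executed inside one of the Cassandra Pods. The Pod name used below is only an example; adjust it to your setup, listing the Pods first if required.

# run a nodetool sub-command against a single Cassandra Pod (the Pod name is an example)
kubectl -n apigee exec apigee-cassandra-default-0 -- \
  nodetool -u <username> -pw <password> describecluster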

Use the below-mentioned command to check the status of the Cassandra Nodes.

# check cassandra cluster status
kubectl -n apigee get pods \
-l app=apigee-cassandra \
--field-selector=status.phase=Running \
-o custom-columns=name:metadata.name --no-headers \
| xargs -I{} sh -c "echo {}; kubectl -n apigee exec {} -- nodetool -u <username> -pw <password> status"

Let’s say that the above-mentioned command returned the following status.

[Image: nodetool status output showing Node entries from the previously deleted Secondary Datacenter/Region still present in the Cluster]

It can be seen that some stale records from the previously deleted Secondary Datacenter/Region are still present in the Cassandra Cluster, and these are causing the issue.

Debugging Cassandra using cqlsh

This section explains the use of cqlsh to debug issues with Cassandra. cqlsh can be used to query Cassandra Tables to extract useful information.

A client container can be used to run cqlsh commands for debugging. The steps to create a client container are described below.

Step 1: Fetch TLS Certificate Name

The client container uses the TLS Certificate from the apigee-cassandra-user-setup Pod. In order to get the exact certificate (Secret) name, the following command should be run.

kubectl get secrets -n apigee --field-selector type=kubernetes.io/tls | grep apigee-cassandra-user-setup | awk '{print $1}'
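
One way to keep the returned name handy for Step 2 is to capture it in a shell variable, for example:

# capture the TLS Secret name for use in the Pod manifest below
TLS_SECRET_NAME=$(kubectl get secrets -n apigee --field-selector type=kubernetes.io/tls \
  | grep apigee-cassandra-user-setup | awk '{print $1}')
echo "${TLS_SECRET_NAME}"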

Step 2: Create the Pod Manifest File

Create a file named, say, cassandra-client.yaml to store the following cqlsh Pod specifications.

apiVersion: v1
kind: Pod
metadata:
  name: my-cassandra-client   # the Pod name used throughout this article
  namespace: apigee
spec:
  containers:
  - name: my-cassandra-client
    image: "gcr.io/apigee-release/hybrid/apigee-hybrid-cassandra-client:1.9.3"    # For example, 1.9.3.
    imagePullPolicy: Always
    command:
    - sleep
    - "3600"
    env:
    - name: CASSANDRA_SEEDS
      value: apigee-cassandra-default.apigee.svc.cluster.local
    - name: APIGEE_DML_USER
      valueFrom:
        secretKeyRef:
          key: dml.user
          name: apigee-datastore-default-creds
    - name: APIGEE_DML_PASSWORD
      valueFrom:
        secretKeyRef:
          key: dml.password
          name: apigee-datastore-default-creds
    volumeMounts:
    - mountPath: /opt/apigee/ssl
      name: tls-volume
      readOnly: true
  volumes:
  - name: tls-volume
    secret:
      defaultMode: 420
      secretName: apigee-cassandra-user-setup-rg-hybrid-b7d3b9c-tls    # the TLS Secret name fetched in Step 1
  restartPolicy: Never

Step 3: Apply the Pod Specifications

Apply the Pod Specifications to the target Kubernetes Cluster which is hosting the Apigee Hybrid Runtime Plane components.

kubectl apply -f cassandra-client.yaml -n apigee
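
Before moving to the next step, it is worth confirming that the client Pod has reached the Running state.

# the Pod should report STATUS as Running before you exec into it
kubectl get pod my-cassandra-client -n apigee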

Step 4: Exec into the Client Container

Exec into the client container in order to perform the debugging.

kubectl exec -n apigee my-cassandra-client -it -- bash

Step 5: Connect to the Cassandra cqlsh

Connect to the Cassandra cqlsh interface with the following command.

cqlsh ${CASSANDRA_SEEDS} -u ${APIGEE_DML_USER} -p ${APIGEE_DML_PASSWORD} --ssl

Once the connection to cqlsh has been made, queries can be triggered for performing the desired actions.

For the above-described hypothetical use-case, the following commands can be triggered for debugging and resolving the issue.

Trigger the below-mentioned query to get the Keyspace definitions.

select * from system_schema.keyspaces;

Let’s say the query resulted in the following output.

bash-4.4# cqlsh 10.50.112.194 -u <username> -p <password>  --ssl
Connected to apigeecluster at 10.50.112.194:9042.
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
admin_user@cqlsh> Select * from system_schema.keyspaces;
keyspace_name                        | durable_writes | replication
-------------------------------------+----------------+--------------------------------------------------------------------------------------------------
system_auth                          |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
kvm_tsg1_apigee_hybrid_prod_hybrid   |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
kms_tsg1_apigee_hybrid_prod_hybrid   |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
system_schema                        |           True |                                           {'class': 'org.apache.cassandra.locator.LocalStrategy'}
system_distributed                   |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
system                               |           True |                                           {'class': 'org.apache.cassandra.locator.LocalStrategy'}
perses                               |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
cache_tsg1_apigee_hybrid_prod_hybrid |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
rtc_tsg1_apigee_hybrid_prod_hybrid   |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
quota_tsg1_apigee_hybrid_prod_hybrid |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
system_traces                        |           True | {'Primary-DC1': '3', 'Secondary-DC2': '3', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}
(11 rows)

It can be seen from the output that references to Secondary-DC2, i.e. the stale records, are still present, and these need to be removed in order to bring the setup back to a clean state. The remaining cleanup process can be checked in the official documentation; a brief sketch of the idea follows.
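
Without reproducing the full procedure here (the official documentation remains the authoritative reference), the cleanup broadly involves removing the deleted Datacenter from the replication settings of every Keyspace that still references it, and then removing the stale Nodes (for example with nodetool removenode, listed earlier). A sketch of the Keyspace part, reusing the Datacenter name from the example output above, would look like this.

-- sketch only; repeat for every Keyspace that still references the deleted Datacenter
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'Primary-DC1': 3};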

Playing with cqlsh

cqlsh can further be used to understand how data is stored in Cassandra for Apigee Hybrid.

One can describe the Keyspaces and Tables, as shown below.

[Image: cqlsh output of DESCRIBE KEYSPACES and DESCRIBE TABLES for the Apigee Hybrid Keyspaces]

One can check the schema of a table, as shown below.

[Image: cqlsh output of DESCRIBE TABLE showing the schema of a Table]

One can even look at the data inside a Table by using the SELECT statement. The following image shows data corresponding to KVMs.

[Image: cqlsh SELECT output showing KVM data]
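
For reference, the cqlsh commands behind the screenshots above are along the following lines; the Keyspace and Table names are placeholders, since the actual names include your org and environment identifiers.

-- the Keyspace and Table names below are placeholders
DESCRIBE KEYSPACES;
DESCRIBE TABLES;
DESCRIBE TABLE <keyspace_name>.<table_name>;
SELECT * FROM <keyspace_name>.<table_name> LIMIT 10;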

Hypothetical Scenario/Issue 2

Use-case reference: Official Documentation

Let’s assume that you were performing some Apigee Hybrid maintenance activity and deleted the Cassandra workload. You are now attempting to redeploy the Cassandra workload, but the Cassandra Pods end up in the CrashLoopBackOff state.

Checking Cassandra Error Logs

The Cassandra logs report an issue about the snitch's datacenter differing from the previous datacenter.

Cannot start node if snitch's data center (us-east1) differs from previous data center

The potential root cause could be stale PVCs (PersistentVolumeClaims) from the earlier deployment still being present in the Cluster and being picked up by the new Cassandra Pods. Check the official documentation for more details on how to resolve the issue; a quick first check is shown below.
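
As a starting point for such an investigation, the PersistentVolumeClaims in the apigee Namespace can be listed to see whether claims from the earlier deployment are still around.

# list the PersistentVolumeClaims used by the Cassandra StatefulSet
kubectl get pvc -n apigee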

Conclusion

In this article, we looked at some of the internals of Cassandra and dived deeper into Apigee Cassandra troubleshooting techniques. These techniques are good for getting started; more advanced and involved techniques will be covered in future parts of this series. The article walked through two examples and highlighted where knowledge of Cassandra internals can be applied.

The next part of this Advanced Apigee Hybrid Cassandra Operations Blog Series will cover the CSI Backup/Restore Procedure for Apigee Hybrid Cassandra. Stay tuned 🙂
