Perform a Complex Restore of a Blockchain Application in Kubernetes with Kasten K10

When the piano was invented, musicians noticed that it had become much easier to produce a sound with the instrument. But pianists were expected to play many more notes than on earlier instruments. Complexity had simply moved somewhere else.

Kubernetes makes application deployment easier, yet application features require an increasing number of components and requirements. A good example of this is the IBM blockchain network, in which trust in transactions can be distributed across organizations.

image11

Four organizations -- R1, R2, R3 and R4 -- have jointly agreed to set up and operate a Hyperledger Fabric network.

Without going into all the details, this application is complex to back up and restore. It creates:

  1. At least 11 PVCs that must be backed up and restored in the right order.
  2. More than 12 deployments if we include the client application.
  3. Multiple secrets, configmaps, service accounts, services and routes. 
  4. Multiple IBP custom resources managed by an operator that creates the deployment and PVCs.
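To get a sense of how many objects are involved, once the application is deployed we can list most of them with a single command (the IBP resource types used here are the same ones queried later in this post):

oc get pvc,deploy,secret,configmap,sa,svc,route,ibpca,ibppeer,ibporderer,ibpconsole -n my-blockchain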

IBM provides an IBP console that helps with deploying peers, orderers and CA nodes on the network. It also helps to manage identities and channels, and to visualize the transactions on the ledger. This UI is very handy, but it constantly creates new PVCs, deployments, services and routes as the network grows, all of which must be backed up regularly. Additionally, the operator itself is managed by the Operator Lifecycle Manager on OpenShift, which creates even more complexity.

Without a tool such as Kasten K10 by Veeam to protect and restore such applications, backup and restore processes are challenging. In this post, we’ll demonstrate the power of Kasten K10 using this scenario as an example.

Assumptions 

In this tutorial, we assume that you have installed Kasten on OpenShift.  (If that’s not the case, read this blog post.)

My base domain on OpenShift will be michael-1.aws.kasten.io.  

We will restore the application without the Operator Lifecycle Manager, to demonstrate that restoration is also possible on a cluster where OLM does not exist, such as vanilla Kubernetes.

Step 1: Create a Blockchain Network

The first step is to create the blockchain namespace:

oc new-project my-blockchain

image2

image4

Now, we’ll go to the Operator Hub. In the search bar, type “blockchain” to load the blockchain tile.

Step 2: Apply the Security Context Constraint

The next step is to apply the following security context constraint object (you can also copy and save it to the local system as ibp-scc.yaml and apply it from there):

cat <<EOF | oc apply -f -
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
allowedCapabilities:
- NET_BIND_SERVICE
- CHOWN
- DAC_OVERRIDE
- SETGID
- SETUID
- FOWNER
apiVersion: security.openshift.io/v1
defaultAddCapabilities: []
fsGroup:
  type: RunAsAny
groups:
- system:serviceaccounts:my-blockchain
kind: SecurityContextConstraints
metadata:
  name: my-blockchain
readOnlyRootFilesystem: false
requiredDropCapabilities: []
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
- "*"
EOF

Then, we will run the following command to grant the constraint to the project’s service accounts:

oc adm policy add-scc-to-user my-blockchain system:serviceaccounts:my-blockchain

When the commands are successful, we will see output similar to the following example:

securitycontextconstraints.security.openshift.io/my-blockchain created
scc "my-blockchain" added to: ["system:serviceaccounts:my-blockchain"]

Step 3: Deploy the IBM Blockchain Platform console

There are four instance types available, listed under "Provided APIs":

image20

  • IBP CA (Advanced users): Deploys an instance of an IBM Blockchain Platform CA.
  • IBP Console: The IBM Blockchain Platform console UI, or "console", is an award-winning user interface for building your blockchain network.
  • IBP Orderer (Advanced users): Deploys an instance of an IBM Blockchain Platform ordering service.
  • IBP Peer (Advanced users): Deploys an instance of an IBM Blockchain Platform peer.

Step 4: Create the Instance on the IBPConsole Tile

image18

apiVersion: ibp.com/v1beta1
kind: IBPConsole
metadata:
  name: ibpconsole
  namespace: my-blockchain
  labels:
    app.kubernetes.io/name: "ibp"
    app.kubernetes.io/instance: "ibp"
    app.kubernetes.io/managed-by: "ibm-ibp"
spec:
  email: michael@kasten.io
  password: ultrasecurepassword
  imagePullSecrets:
  - regcred
  registryURL: cp.icr.io/cp
  license:
    accept: true
  networkinfo:
    domain: apps.michael-1.aws.kasten.io
  storage:
    console:
      class: ''
      size: 5Gi
  serviceAccountName: ibm-blockchain
  version: 2.5.1
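If you prefer the CLI to the operator tile’s form, the same manifest can be applied directly (assuming it has been saved locally as ibpconsole.yaml) and the rollout watched:

oc apply -f ibpconsole.yaml -n my-blockchain
oc get deployment -n my-blockchain -w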

 

See https://cloud.ibm.com/docs/blockchain-sw-251?topic=blockchain-sw-251-deploy-ocp-rhm#console-deploy-ocp-rhm-advanced for more advanced options.

Step 5: Verify the Console Installation and Log In

oc get deployment -n my-blockchain
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
ibp-operator   1/1     1            1           34m
ibpconsole     1/1     1            1           16m

All deployments are in the ready state:

oc get po 
NAME                            READY   STATUS    RESTARTS   AGE
ibp-operator-7f896d6644-sh2mx   1/1     Running   0          36m
ibpconsole-6f94ddc6f9-t9khx     4/4     Running   0          18m
oc get route
NAME                 HOST/PORT                                                        PATH   SERVICES     PORT      TERMINATION   WILDCARD
ibpconsole-console   my-blockchain-ibpconsole-console.apps.michael-1.aws.kasten.io          ibpconsole   optools   passthrough   None
ibpconsole-proxy     my-blockchain-ibpconsole-proxy.apps.michael-1.aws.kasten.io            ibpconsole   optools   passthrough   None

 

We connect to https://my-blockchain-ibpconsole-console.apps.michael-1.aws.kasten.io and log in with the email and password provided in the IBPConsole resource (michael@kasten.io and ultrasecurepassword). We then change the password as requested by the UI and reconnect:

image22

Step 6: Create a Blockchain Network

 

We will create a minimal network for testing purposes by following this tutorial: https://cloud.ibm.com/docs/blockchain-sw-251?topic=blockchain-sw-251-ibp-console-build-network#ibp-console-build-network.

At this point, we could already run backup and recovery on the first two blocks:

image23

But it’s more interesting to have transactions to make sure we are able to retrieve them upon restore.

We can use this tutorial to create a basic Node.js smart contract: https://cloud.ibm.com/docs/blockchain-sw-251?topic=blockchain-sw-251-ibp-console-smart-contracts-v14

Here’s a summary of the steps: 

  • First, we install the VS Code extension for the IBM Blockchain Platform.
  • Next, we follow the basic tutorial in VS Code to create a Node.js smart contract called js-contract.
  • Then, we export the contract in .cds format, which is compatible with the 1.4 channel created in the previous steps.
  • Finally, we install and instantiate the js-contract on channel 1.

image13

All these operations create transactions on the ledger that we can now use to test the restoration process:

image10

image8

image17

It could also be interesting to connect from VS Code to the platform and create some transactions, but this is beyond the scope of this blog post.

What’s Available for Backup and Restore?

At this point, we have these pods:

oc get po 
NAME                                                       READY   STATUS    RESTARTS   AGE
chaincode-execution-8d91fc09-df12-4c91-8b1b-6239b7947e23   1/1     Running   0          64m
ibp-operator-7f896d6644-sh2mx                              1/1     Running   0          22h
ibpconsole-6f94ddc6f9-t9khx                                4/4     Running   0          22h
orderingserviceca-b64685fc9-2q6dl                          1/1     Running   0          15h
orderingservicenode1-f7bb5f9f-2w29b                        2/2     Running   0          14h
org1ca-7fd68c8cc-ggvpq                                     1/1     Running   0          16h
peerorg1-58b48f49f9-mwp2j                                  4/4     Running   0          15h

We also have these deployments:

oc get deployment
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
ibp-operator           1/1     1            1           22h
ibpconsole             1/1     1            1           22h
orderingserviceca      1/1     1            1           15h
orderingservicenode1   1/1     1            1           14h
org1ca                 1/1     1            1           16h
peerorg1               1/1     1            1           15h

Note that the chaincode-execution pod is not linked to any deployment. 
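One way to confirm this is to look at the pod’s owner references; for the chaincode execution pod we expect no Deployment or ReplicaSet owner (the pod name comes from the listing above):

oc get pod chaincode-execution-8d91fc09-df12-4c91-8b1b-6239b7947e23 -o jsonpath='{.metadata.ownerReferences}'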

Here are the PVCs...

oc get pvc       
NAME                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ibpconsole-pvc             Bound    pvc-07638d8f-1e9d-4fb8-a4d7-5dc3a7a05840   5Gi        RWO            gp2-csi        22h
orderingserviceca-pvc      Bound    pvc-b71c1051-d538-4eff-9e0f-937497125dec   20Gi       RWO            gp2-csi        15h
orderingservicenode1-pvc   Bound    pvc-3748a0cd-995e-4701-bbb9-484634345b5f   100Gi      RWO            gp2-csi        14h
org1ca-pvc                 Bound    pvc-d46a10b3-f9d7-4bbb-8a71-d4574aa05baa   20Gi       RWO            gp2-csi        16h
peerorg1-pvc               Bound    pvc-e5575337-064f-4e0c-8e8f-1745655b8529   100Gi      RWO            gp2-csi        15h
peerorg1-statedb-pvc       Bound    pvc-e18e0783-8608-4668-9640-7f411726c6bd   10Gi       RWO            gp2-csi        15h

...and the corresponding IBP custom resources managed by the operator:

oc get ibpca,ibpconsole,ibporderer,ibppeer
NAME                              READY
ibpca.ibp.com/orderingserviceca   
ibpca.ibp.com/org1ca              

NAME                            READY
ibpconsole.ibp.com/ibpconsole  

NAME                                      READY
ibporderer.ibp.com/orderingservice        
ibporderer.ibp.com/orderingservicenode1   

NAME                       READY
ibppeer.ibp.com/peerorg1 
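The IBP custom resource definitions behind these objects are cluster-scoped and must exist on any cluster we restore to before the custom resources can be recreated. We can list them with:

oc get crd | grep ibp.com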

Here are the Operator Lifecycle Manager elements provided to install the global blockchain application:

oc get operatorgroup,subscription,csv 
NAME                                                     AGE
operatorgroup.operators.coreos.com/my-blockchain-jh4jr   23h

NAME                                               PACKAGE          SOURCE                 CHANNEL
subscription.operators.coreos.com/ibm-blockchain   ibm-blockchain   ibm-operator-catalog   v2.5

NAME                                                 DISPLAY          VERSION   REPLACES   PHASE
clusterserviceversion.operators.coreos.com/ibm-blockchain.v2.5.1   IBM Blockchain   2.5.1     Succeeded

The Operator Lifecycle Manager created the clusterrole and clusterrolebinding during install, to let the ibp-operator manage pods, deployments and IBP custom resources. These need to be captured at the cluster resource scope:

oc get clusterrole | grep ibp                    
ibpcas.ibp.com-v1beta1-admin                                                2021-03-29T14:30:16Z
ibpcas.ibp.com-v1beta1-crdview                                              2021-03-29T14:30:17Z
ibpcas.ibp.com-v1beta1-edit                                                 2021-03-29T14:30:16Z
ibpcas.ibp.com-v1beta1-view                                                 2021-03-29T14:30:16Z
ibpconsoles.ibp.com-v1beta1-admin                                           2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-crdview                                         2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-edit                                            2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-view                                            2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-admin                                           2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-crdview                                         2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-edit                                            2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-view                                            2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-admin                                              2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-crdview                                            2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-edit                                               2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-view                                               2021-03-29T14:30:17Z

oc get clusterrole | grep ibm 
ibm-blockchain.v2.5.2-6bdc5f6d8

oc get clusterrolebinding | grep ibm 
ibm-blockchain.v2.5.2-6bdc5f6d8

Restoration Constraints 

1. Topology Constraint

To achieve high resiliency, the blockchain operator decides on a very precise topology breakdown of the pods across zones. When needed, the operator adds a zone label to the PVC that corresponds to the zone of the underlying volume:

oc get pvc -L zone                                       
NAME                       STATUS   VOLUME                                     ZONE
ibpconsole-pvc             Bound    pvc-a08bd96f-bc41-432f-ac96-26c4850b81ce   
orderingserviceca-pvc      Bound    pvc-f3ca0d28-012a-4a5a-a793-474045bac3ce   eu-west-1a
orderingservicenode1-pvc   Bound    pvc-161921ca-3fcf-490a-b49b-5ab88ee6bc41   
org1ca-pvc                 Bound    pvc-308f93b1-eb45-439a-8808-1e3d79550b46   eu-west-1a
peerorg1-pvc               Bound    pvc-d0318054-af5f-44a4-b49c-7f1cd5d4e56e   eu-west-1b
peerorg1-statedb-pvc       Bound    pvc-2ac27a9c-a571-488a-a28e-d38f90054a42   eu-west-1b
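For a PVC whose zone label is empty above, we can check which zone the bound volume actually lives in by inspecting the PV’s node affinity; a quick sketch using one of the volume names from the listing:

oc get pv pvc-161921ca-3fcf-490a-b49b-5ab88ee6bc41 -o yaml | grep -B 2 -A 8 nodeAffinity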

To reinforce this, an affinity rule is also set up on the deployment itself:

oc get deploy peerorg1 -o jsonpath='{.spec.template.spec.affinity}' | jq
{
  "nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {
      "nodeSelectorTerms": [
        {
          "matchExpressions": [
            {
              "key": "topology.kubernetes.io/zone",
              "operator": "In",
              "values": [
                "eu-west-1b"
              ]
            },
            {
              "key": "topology.kubernetes.io/region",
              "operator": "In",
              "values": [
                "eu-west-1"
              ]
            }
          ]
        },
        {
          "matchExpressions": [
            {
              "key": "failure-domain.beta.kubernetes.io/zone",
              "operator": "In",
              "values": [
                "eu-west-1b"
              ]
            },
            {
              "key": "failure-domain.beta.kubernetes.io/region",
              "operator": "In",
              "values": [
                "eu-west-1"
              ]
            }
          ]
        }
      ]
    }
  },
  "podAntiAffinity": {
    "preferredDuringSchedulingIgnoredDuringExecution": [
      {
        "podAffinityTerm": {
          "labelSelector": {
            "matchExpressions": [
              {
                "key": "orgname",
                "operator": "In",
                "values": [
                  "org1msp"
                ]
              }
            ]
          },
          "topologyKey": "kubernetes.io/hostname"
        },
        "weight": 100
      }
    ]
  }
}

That means that when restoring the PVC, we must make sure we restore it to the right zone.

2. Data Consistency Constraint 

To understand the data consistency constraint, we must first understand the transaction flow:

image19

Every peer pod holds 2 PVCs: 

  • The ledger PVC (100Gi), or the materialized blockchain, made of blocks and the validated transactions inside them.
  • The couchdb PVC (10Gi), which represents the state of the ledger, including transactions that are not yet validated, as well as transactions that have been validated but not yet committed to the ledger.

It’s important to make sure that the couchdb PVC does not have validated transactions that the ledger does not have. IBM recommends taking a snapshot of the couchdb PVC at 3:00 a.m. and a snapshot of the ledger PVC at 3:05 a.m.

Every orderer pod also has a ledger PVC, but contrary to a public blockchain, which uses proof of work to validate the chain, private blockchains use orderers to order the validated blocks.

We must be sure that peers do not have validated transactions that the orderer does not have. This time, we are not talking about consistency inside a single pod, but consistency across the whole network. IBM recommends taking a snapshot of the orderer PVC at 5:00 a.m. (two hours after the snapshot of the peers).

There are also the CA pods and the IBP console pod, each with a single PVC. These should be backed up each time the network topology changes. To keep it simple, let’s back them up at 5:00 a.m. along with the orderer.

Here’s a summary of the scheduled backups:

  • Peer couchdb PVC: 3:00 a.m.
  • Peer ledger PVC: 3:05 a.m.
  • Orderer ledger PVC: 5:00 a.m.
  • CA and IBPConsole PVCs: 5:00 a.m.

 

Implementing the Backup Strategy 


We can implement the backup strategy easily by creating: 

  1. One daily policy that backs up the entire namespace at 3:00 a.m. and 5:00 a.m.
  2. One daily policy that backs up the entire namespace at 3:05 a.m.
  3. One weekly cluster resource policy (to capture the clusterrole and clusterrolebinding).

image9

image12

image6
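For reference, here is roughly what the 3:05 a.m. policy looks like as a Kasten K10 Policy custom resource (a minimal sketch; the policy name is made up and the exact scheduling and retention values should be checked against the K10 documentation for your version, or simply configured through the UI as in the screenshots above):

apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: my-blockchain-daily-0305
  namespace: kasten-io
spec:
  comment: Daily backup of the my-blockchain namespace (3:05 a.m. run)
  frequency: '@daily'
  retention:
    daily: 7
  actions:
    - action: backup
  selector:
    matchLabels:
      k10.kasten.io/appNamespace: my-blockchain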

 

With this schedule, we have everything we need to restore the application consistently.

Restoring the Application After a Disaster

Let’s remove the my-blockchain application to emulate the disaster:

oc delete ns my-blockchain

Step 1: Recreate the PVC from the Different Restore Points

First, we’ll recreate the empty namespace: 

oc create ns my-blockchain

We’ll use the partial restore capability of Kasten K10 to pick the different PVCs from the various restore points:

  • 3:00 a.m. restore point: peerorg1-pvc
  • 3:05 a.m. restore point: peerorg1-statedb-pvc
  • 5:00 a.m. restore point: orderingservicenode1-pvc, plus the console and CA PVCs (ibpconsole-pvc, orderingserviceca-pvc and org1ca-pvc)

This image shows how to restore only the peerorg1-pvc from the 3:00 a.m. restore point:

image14

Step 2: Recreate the PVC in the Right Zone

Before launching the restore, we need to apply a transform to make sure the PVC is recreated in the right zone. When the blockchain operator decides precisely in which zone a PVC must be created, it adds a “zone” label to the PVC and an affinity constraint to the deployment using this PVC:

oc get pvc peerorg1-pvc -o yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    ...
    zone: eu-west-1b
  name: peerorg1-pvc
...
oc get deploy peerorg1 -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
  name: peerorg1
  namespace: my-blockchain
  ...
spec:
  ....
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1b
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - eu-west-1
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1b
              - key: failure-domain.beta.kubernetes.io/region
                operator: In
                values:
                - eu-west-1
...

We will use this label to recreate the PVC in the right zone. To do that, we create a storageclass per zone:

oc get sc eu-west-1a -o yaml 
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.ebs.csi.aws.com/zone
    values:
    - eu-west-1a
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: eu-west-1a  
parameters:
  encrypted: "true"
  type: gp2
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
...
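A matching storageclass for the other zone looks the same, with only the name and the zone value changed (a sketch mirroring the class above):

cat <<EOF | oc apply -f -
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.ebs.csi.aws.com/zone
    values:
    - eu-west-1b
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: eu-west-1b
parameters:
  encrypted: "true"
  type: gp2
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF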

We’ll use a transform to copy the zone name into the storageClassName field of the PVC spec:

image21
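Conceptually, the transform applies a JSON-patch-style operation to each PVC before it is recreated; expressed in YAML it boils down to something like this (a sketch of what is configured in the UI above, using eu-west-1b as the example zone):

# Point the restored PVC at the storageclass matching its zone label
- op: replace
  path: /spec/storageClassName
  value: eu-west-1b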

 

Then, we’ll apply this transform for each of the three restore points. 

Step 3: Restore the Rest of the Namespace (with Some Exclusions)

Now we have the namespace with only the PVCs. We must bring back the rest of the namespace, but without: 

  • The PVCs that we already got back 
  • The chaincode pods (they will be recreated when the smart contract is invoked)
  • The configmap ibp-operator-lock
  • The elements of the Operator Lifecycle Manager 
    • ClusterServiceVersion 
    • Subscription
    • OperatorGroup
    • InstallPlan

image16

We also need to scale down all the deployments. This is because Kasten K10 restores the custom resources after all the deployments are up and running. Those deployments rely on the custom resources to work, so it’s best to scale everything down so that Kasten K10 considers the deployments successful and moves on to restoring the custom resources. We’ll scale up afterwards.
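A quick way to scale everything down is the mirror image of the scale-up loop we’ll use later:

for dep in $(oc get deploy -n my-blockchain -o name); do oc scale $dep --replicas=0 -n my-blockchain; done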

image1

Step 4: Restore the clusterrole and clusterrolebinding from the Cluster Restore Point

When we deleted the namespace, the Operator Lifecycle Manager automatically deleted the clusterrole and clusterrolebinding created for the IBM Blockchain operator: 

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2021-03-31T23:09:32Z"
  labels:
    olm.owner: ibm-blockchain.v2.5.1
    olm.owner.kind: ClusterServiceVersion
    olm.owner.namespace: my-blockchain
    operators.coreos.com/ibm-blockchain.my-blockchain: ""
  name: ibm-blockchain.v2.5.1-6bdc5f6d8
rules:
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - persistentvolumeclaims
  - persistentvolumes
  verbs:
  - '*'
...

The olm.owner label tells OLM to remove this object when the ClusterServiceVersion is removed, and we have decided to restore without the OLM objects. To make our restore work, we need to remove the labels section.

We retrieve the clusterrole and clusterrolebinding from the cluster restore point and apply a transform to remove the labels section: 

image3

image15
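In JSON-patch terms, the transform shown above amounts to a single operation (a sketch):

# Strip the OLM ownership labels so OLM does not garbage-collect the restored object
- op: remove
  path: /metadata/labels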

If we were restoring into a new cluster, we would also have to restore the other my-blockchain clusterrole and clusterrolebinding, created when OLM deployed the operator:

image7

Step 5: Restart the Deployments

We can now restart the deployments and check that everything is recovered and that all the pods are back:

for dep in $(oc get deploy -o name); do oc scale $dep --replicas=1; done
oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
ibp-operator-7dd4bfb76f-s89rx           1/1     Running   0          2d8h
ibpconsole-cbf76d57d-2xwx9              4/4     Running   0          2d8h
orderingserviceca-78fb95747c-dvk4k      1/1     Running   0          2d8h
orderingservicenode1-646dd846bc-94fwm   2/2     Running   0          2d8h
org1ca-75845db8bf-ts6tf                 1/1     Running   0          2d8h
peerorg1-7758946784-kqcrp               4/4     Running   0          2d8h

The best way to check that your data is recovered is to check the block transactions:

image5

We can verify that we got back all the blocks and transactions. We were also able to restart the whole blockchain application, within the data consistency and topology constraints.

Conclusion 

In this article, we’ve shown how you can perform a complex backup and restore process more easily with Kasten K10. Kasten K10 has all the features needed to make your backup and restore possible, as long as you understand how your application works.

Try Kasten K10 for free today. 

 

Michael Courcy

I started my career as a solutions architect, focused mainly on JAVA/JEE for government projects. Now I work as a DevOps architect, building cloud native solutions based on Kubernetes and the main cloud providers like AWS, Azure, and many more.

