Skip to content

Create an Agent cluster

This document explains how to create HostedClusters and NodePools using the Agent platform.

The Agent platform uses the Infrastructure Operator (AKA Assisted Installer) to add worker nodes to a hosted cluster. For a primer on the Infrastructure Operator, see here.

Overview

When you create a HostedCluster with the Agent platform, HyperShift will install the Agent CAPI provider in the Hosted Control Plane (HCP) namespace.

Upon scaling up a NodePool, a Machine will be created, and the CAPI provider will find a suitable Agent to match this Machine. Suitable means that the Agent is approved, is passing validations, is not currently bound (in use), and has the requirements specified on the NodePool Spec (e.g., minimum CPU/RAM, labels matching the label selector). You may monitor the installation of an Agent by checking its Status and Conditions.

Upon scaling down a NodePool, Agents will be unbound from the corresponding cluster. However, you must boot them with the Discovery Image once again before reusing them.

Install Hypershift Operator

Follow instructions for building the hypershift CLI in Getting Started

Install the Hypershift Operator

hypershift install

Install Assisted Service and Hive Operators

NOTE: If Red Hat Advanced Cluster Management (RHACM) is already installed, this can be skipped as the Infrastructure Operator and Hive Operator are dependencies of RHACM.

We will leverage tasty to deploy the required operators easily.

Install tasty:

curl -s -L https://github.com/karmab/tasty/releases/download/v0.4.0/tasty-linux-amd64 > ./tasty
sudo install -m 0755 -o root -g root ./tasty /usr/local/bin/tasty

Install the operators

tasty install assisted-service-operator hive-operator

Configure Agent Service

Create the AgentServiceConfig resource

export DB_VOLUME_SIZE="10Gi"
export FS_VOLUME_SIZE="10Gi"
export OCP_VERSION="4.10.16"
export OCP_MAJMIN=${OCP_VERSION%.*}
export ARCH="x86_64"
export OCP_RELEASE_VERSION=$(curl -s https://mirror.openshift.com/pub/openshift-v4/${ARCH}/clients/ocp/${OCP_VERSION}/release.txt | awk '/machine-os / { print $2 }')
export ISO_URL="https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/${OCP_MAJMIN}/${OCP_VERSION}/rhcos-${OCP_VERSION}-${ARCH}-live.${ARCH}.iso"
export ROOT_FS_URL="https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/${OCP_MAJMIN}/${OCP_VERSION}/rhcos-${OCP_VERSION}-${ARCH}-live-rootfs.${ARCH}.img"

envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
 name: agent
spec:
  databaseStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${DB_VOLUME_SIZE}
  filesystemStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${FS_VOLUME_SIZE}
  osImages:
    - openshiftVersion: "${OCP_VERSION}"
      version: "${OCP_RELEASE_VERSION}"
      url: "${ISO_URL}"
      rootFSUrl: "${ROOT_FS_URL}"
      cpuArchitecture: "${ARCH}"
EOF

Configure DNS

The API Server for the Hosted Cluster is exposed a Service of type NodePort.

A DNS entry must exist for api.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN} pointing to destination where the API Server can be reached.

This can be as simple as a A record pointing to one of the nodes in the management cluster (i.e. the cluster running the HCP). It can also point to a load balancer deployed to redirect incoming traffic to the ingress pods.

Create a Hosted Cluster

export CLUSTERS_NAMESPACE="clusters"
export HOSTED_CLUSTER_NAME="example"
export HOSTED_CONTROL_PLANE_NAMESPACE="${CLUSTERS_NAMESPACE}-${HOSTED_CLUSTER_NAME}"
export BASEDOMAIN="krnl.es"
export PULL_SECRET_FILE=$PWD/pull-secret

# Typically the namespace is created by the hypershift-operator 
# but agent cluster creation generates a capi-provider role that
# needs the namespace to already exist
oc create ns ${HOSTED_CONTROL_PLANE_NAMESPACE}
bin/hypershift create cluster agent --name=${HOSTED_CLUSTER_NAME} --pull-secret=${PULL_SECRET_FILE} --agent-namespace=${HOSTED_CONTROL_PLANE_NAMESPACE} --api-server-address=api.${HOSTED_CLUSTER_NAME}.${BASEDOMAIN}

Create a InfraEnv

An InfraEnv is a enviroment to which hosts booting the live ISO can join as Agents. In this case, the Agents will be created in the same namespace as our HCP

export SSH_PUB_KEY=$(cat $HOME/.ssh/id_rsa.pub)

envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: ${HOSTED_CLUSTER_NAME}
  namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
spec:
  pullSecretRef:
    name: pull-secret
  sshAuthorizedKey: ${SSH_PUB_KEY}
EOF

This will generate a live ISO that allows machines (VMs or bare meatal) to join as Agents

oc get InfraEnv ${HOSTED_CLUSTER_NAME} -ojsonpath="{.status.isoDownloadURL}"

Adding Agents

You can add Agents by manually configuring the machine to boot with the live ISO or by using metal3

Manual

The live ISO may be downloaded and used to boot a node (bare metal or VM). On boot, the node will communicate with the assisted-service and register as an Agent in the the same namespace as the InfraEnv.

metal3

We will leverage the Assisted Service and Hive to create the custom ISO as well as the Baremetal Operator to perform the installation.

  • Enable the Baremetal Operator to watch all namespaces as the baremetalhost object for the hosted cluster will be created in the ${HOSTED_CONTROL_PLANE_NAMESPACE} namespace:
oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'

NOTE: This will trigger a restart of the metal3 pod in the openshift-machine-api namespace.

  • Wait until the metal3 pod is ready again:
until oc wait -n openshift-machine-api $(oc get pods -n openshift-machine-api -l baremetal.openshift.io/cluster-baremetal-operator=metal3-state -o name) --for condition=containersready --timeout 10s >/dev/null 2>&1 ; do sleep 1 ; done
  • Set the variables required for the BMC details of the worker that is going to be added:
export BMC_USERNAME=$(echo -n "root" | base64 -w0)
export BMC_PASSWORD=$(echo -n "calvin" | base64 -w0)
export BMC_IP="192.168.124.228"
export WORKER_NAME="ocp-worker-0"
export BOOT_MAC_ADDRESS="aa:bb:cc:dd:ee:ff"
export UUID=11111111-1111-1111-1111-111111111111
export REDFISH="redfish-virtualmedia+http://${BMC_IP}:8000/redfish/v1/Systems/${UUID}"
  • Create the BMC secret to host the BMC user and password:
envsubst <<"EOF" | oc apply -f -
apiVersion: v1
data:
  password: ${BMC_PASSWORD}
  username: ${BMC_USERNAME}
kind: Secret
metadata:
  name: ${WORKER_NAME}-bmc-secret
  namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
type: Opaque
EOF
  • Create the BMH object:
envsubst <<"EOF" | oc apply -f -
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: ${WORKER_NAME}
  namespace: ${HOSTED_CONTROL_PLANE_NAMESPACE}
  labels:
    infraenvs.agent-install.openshift.io: ${HOSTED_CLUSTER_NAME}
  annotations:
    inspect.metal3.io: disabled
spec:
  automatedCleaningMode: disabled
  bmc:
    disableCertificateVerification: True
    address: ${REDFISH}
    credentialsName: ${WORKER_NAME}-bmc-secret
  bootMACAddress: ${BOOT_MAC_ADDRESS}
  online: true
EOF

Configuring Agents

Once the Agents are created, approve them and set their installation_disk_id, hostname, and role

$ oc get agents -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
NAME                                   CLUSTER   APPROVED   ROLE          STAGE
86f7ac75-4fc4-4b36-8130-40fa12602218                        auto-assign
e57a637f-745b-496e-971d-1abbf03341ba                        auto-assign

$ oc patch agent 86f7ac75-4fc4-4b36-8130-40fa12602218 -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-0.example.krnl.es","role":"worker"}}' --type merge
$ oc patch agent 23d0c614-2caa-43f5-b7d3-0b3564688baa -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-1.example.krnl.es","role":"worker"}}' --type merge

$ oc get agents -n ${HOSTED_CONTROL_PLANE_NAMESPACE}
NAME                                   CLUSTER   APPROVED   ROLE          STAGE
86f7ac75-4fc4-4b36-8130-40fa12602218             true       worker
e57a637f-745b-496e-971d-1abbf03341ba             true       worker

Scale the NodePool

Scale the NodePool to two nodes

oc scale NodePool -n ${CLUSTERS_NAMESPACE} ${HOSTED_CLUSTER_NAME} --replicas=2

Verify the Agents are assigned to the hosted cluster

$ oc get agents
NAME                                   CLUSTER   APPROVED   ROLE     STAGE
86f7ac75-4fc4-4b36-8130-40fa12602218   example   true       worker   Done
e57a637f-745b-496e-971d-1abbf03341ba   example   true       worker   Done

Verify machines joined the hosted cluster as Nodes

$ oc get machines
NAME                      CLUSTER         NODENAME                   PROVIDERID                                     PHASE     AGE     VERSION
example-bcc6c5c95-f6shj   example-4z9kg   worker-0.example.krnl.es   agent://e57a637f-745b-496e-971d-1abbf03341ba   Running   3h21m   4.10.18
example-bcc6c5c95-jskr8   example-4z9kg   worker-1.example.krnl.es   agent://86f7ac75-4fc4-4b36-8130-40fa12602218   Running   3h21m   4.10.18

$ hypershift create kubeconfig > kubeconfig
$ export KUBECONFIG=$PWD/kubeconfig

$ oc get nodes
NAME                       STATUS   ROLES    AGE     VERSION
worker-0.example.krnl.es   Ready    worker   3h31m   v1.23.5+3afdacb
worker-1.example.krnl.es   Ready    worker   3h31m   v1.23.5+3afdacb

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.18   True        False         16s     Cluster version is 4.10.18