Kubernetes

UM-Bridge provides a Kubernetes-based solution for running any UM-Bridge model container on cloud platforms at HPC scale.

These instructions show how to use the UM-Bridge Kubernetes configuration. They assume that you have a Kubernetes cluster available, for example one set up by following our instructions for Google Kubernetes Engine (GKE).

Step 1: Clone UM-Bridge

First, clone the UM-Bridge repository by running:

git clone https://github.com/UM-Bridge/umbridge.git

In the cloned repository, navigate to the folder kubernetes. It contains everything needed in the following steps.
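
Assuming the repository was cloned into your current working directory, that is:

cd umbridge/kubernetes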

Step 2: Set up load balancer

First, retrieve HAProxy:

helm repo add haproxytech https://haproxytech.github.io/helm-charts
helm repo update
helm install kubernetes-ingress haproxytech/kubernetes-ingress \
--create-namespace \
--namespace haproxy-controller \
--set controller.service.type=LoadBalancer \
--set controller.replicaCount=1 \
--set defaultBackend.replicaCount=1 \
--set controller.logging.level=debug \
--set controller.ingressClass=haproxy

Then, start HAProxy with the configuration provided by UM-Bridge:

kubectl apply -f setup/svc-ingress.yml
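
To verify that the ingress controller started successfully, you can, for example, check its pods:

kubectl get pods --namespace=haproxy-controller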

Step 3: Run model instances

To start model instances, run:

kubectl create -f model.yaml

You can check the status of the model instances by running:

kubectl get pods

They can be deleted again via:

kubectl delete -f model.yaml

The default model.yaml may be adjusted to run your own model by changing the following fields (a sketch of how they fit together follows the list):

  • image: The docker image containing a model with UM-Bridge support. Any image provided by the UM-Bridge project will work right away.

  • replicas: The number of model instances to run.

  • limits/requests: The CPU and memory resources your model should receive. Keep limits equal to requests to ensure that your model has exclusive access to a fixed amount of resources.

  • env: Environment variables that should be passed to the model.
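
The sketch below shows how these fields typically fit together in a Kubernetes Deployment; the actual model.yaml in the repository may be structured differently, and the image name, environment variable, and resource values are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: model
spec:
  replicas: 4                          # number of model instances
  selector:
    matchLabels:
      app: model
  template:
    metadata:
      labels:
        app: model
    spec:
      containers:
      - name: model
        image: <your-umbridge-model-image>
        env:
        - name: EXAMPLE_VARIABLE       # environment variables passed to the model
          value: "1"
        resources:
          requests:
            cpu: "1"
            memory: 1Gi
          limits:                      # keep equal to requests
            cpu: "1"
            memory: 1Gi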

Step 4: Calling the model

The model instances are now available through your load balancer’s IP address, which you can determine from:

kubectl get services --namespace=haproxy-controller

The model instances may be accessed from any UM-Bridge client; as many requests as there are replicas will be handled in parallel.
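
For example, with the Python UM-Bridge client installed (pip install umbridge), an evaluation could look like the sketch below. The model name "forward" and the input sizes depend on the particular model image, and the IP address is a placeholder:

import umbridge

# Connect to a model served behind the load balancer
model = umbridge.HTTPModel("http://<load-balancer-ip>", "forward")

# Evaluate the model on a single input parameter vector
print(model([[0.25, 0.75]]))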

Multinode MPI on Kubernetes

The instructions above work for any UM-Bridge model container, even ones that are MPI parallel. However, a single container is naturally limited to a single physical node. In order to parallelize across nodes (and therefore across containers) via MPI, the additional steps below are needed.

Step 1: mpi-operator base image

The multinode MPI configuration makes use of the mpi-operator from kubeflow. This implies that the model image has to be built from one of the following base images, depending on the MPI implementation:

  • mpioperator/openmpi-builder

  • mpioperator/intel-builder

When separating the build into a builder and a final image, the corresponding base images may be used for the latter (see the Dockerfile sketch after this list):

  • mpioperator/openmpi

  • mpioperator/intel
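
For illustration, a multi-stage Dockerfile for an OpenMPI-based model might look roughly as follows; the source path, build command, and server binary are placeholders and depend on your model:

FROM mpioperator/openmpi-builder AS builder
COPY . /src
RUN cd /src && make                    # build the MPI-parallel model and UM-Bridge server

FROM mpioperator/openmpi
COPY --from=builder /src/server /server
# The command to run is set later in the MPIJob configuration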

Step 2: Deploy mpi-operator

In addition to choosing a suitable base image for the model, the mpi-operator needs to be deployed on the cluster:

kubectl apply -f https://raw.githubusercontent.com/kubeflow/mpi-operator/master/deploy/v2beta1/mpi-operator.yaml
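
You can check, for example, that the MPIJob resource type has become available on the cluster:

kubectl get crd mpijobs.kubeflow.org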

Step 3: Setting up NFS

The multinode MPI setup mounts a shared (NFS) file system on the /shared directory of your model container, replicating a traditional HPC setup. The NFS server is set up via:

kubectl apply -f setup/nfs.yaml

Note: This assumes that a disk named ‘gce-nfs-disk’ has been set up in GCE!
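
If such a disk does not exist yet, it can be created, for example, via gcloud; size and zone are placeholders and should match your cluster:

gcloud compute disks create gce-nfs-disk --size=100GB --zone=<your-cluster-zone>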

In order to finish the setup, we need the IP address of the NFS service. We can get it from:

kubectl describe service

In setup/nfs-pv-pvc.yaml, set the server address to the IP address you just retrieved, and (if needed) adjust the storage capacity to match the attached disk (see the sketch below).
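
The relevant part is a generic NFS PersistentVolume; the sketch below illustrates which fields to adjust, though the actual file may differ in layout:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 100Gi            # adjust to the attached disk
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.2          # replace with the IP address retrieved above
    path: "/"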

Then run:

kubectl apply -f setup/nfs-pv-pvc.yaml

Step 4: Running a job on the new cluster

The job configuration is located in multinode-mpi-model.yaml. It is largely analogous to model.yaml, except that both launcher and worker containers are configured. The relevant additional config options (sketched after this list) are:

  • slotsPerWorker: The number of MPI ranks per worker container.

  • mpiImplementation: By default set to OpenMPI, but can be changed to Intel.

  • command: The launcher is expected to run the UM-Bridge server, which should then call mpirun for model evaluations. Workers should only execute the (pre-defined) sshd command in order to listen for requests from the launcher.

  • replicas: Expected to be 1 for the launcher, and the number of model instances you want to run for the workers.
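
For orientation, the overall structure of such an MPIJob is sketched below; the image, commands, and replica counts are placeholders, and the actual multinode-mpi-model.yaml may differ in details:

apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: model-JOB_INDEX
spec:
  slotsPerWorker: 4                        # MPI ranks per worker container
  mpiImplementation: OpenMPI               # or Intel
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
          - name: model
            image: <your-umbridge-mpi-model-image>
            command: ["/server"]           # UM-Bridge server calling mpirun internally
    Worker:
      replicas: 2                          # number of worker containers
      template:
        spec:
          containers:
          - name: model
            image: <your-umbridge-mpi-model-image>
            # Workers run the pre-defined sshd command to listen for the launcher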

The multinode-mpi-model.yaml file describes a single launcher with a number of workers assigned to it. In order to run multiple jobs (each a launcher and multiple MPI parallel workers), run the following script:

bash launch-multinode-mpi-model.sh

It creates a number of jobs from the multinode-mpi-model.yaml file, each time replacing JOB_INDEX with a unique index.
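
A hypothetical sketch of what such a script might do (the actual script and its variable names may differ):

#!/bin/bash
NUM_JOBS=4    # number of jobs to launch (placeholder)
for i in $(seq 1 "$NUM_JOBS"); do
  sed "s/JOB_INDEX/$i/g" multinode-mpi-model.yaml | kubectl create -f -
done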

Just as before, access to the model instances is now available via the load balancer’s IP address, as described in Step 4 of the single-node setup.

All jobs can be shut down via:

kubectl delete MPIJob --all