Kubernetes on Google Cloud Platform¶
This document walks you through starting a Kubernetes cluster with Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP), running the byok8s Snakemake workflow on the GKE cluster, and tearing the cluster down when the workflow is complete.
Setup¶
Before you can create a Kubernetes cluster on Google Cloud, you need a Google Cloud account and a Google Cloud project. You can sign up for a Google Cloud account at https://cloud.google.com and create a new project from the Google Cloud Console. New accounts start with $300 in free credit specifically to let you test drive features like GKE! Cool!
Once you have your account and your project, you can install the gcloud Google Cloud SDK command line utility (see the Google Cloud SDK Quickstart Guide). Once you have installed the gcloud utility, you will need to log in with your Google account using the init command:
gcloud init
This will give you a link to open in your browser, where you will log in with your Google account and receive a code to copy and paste into the terminal.
The Compute Engine API and the Kubernetes Engine API will both need to be enabled as well. These can be enabled via the Google Cloud Console (or read on).
If you aren't sure how to use the console to enable these APIs, just start running the commands below to create a Kubernetes cluster; the gcloud utility will tell you when an action requires an API to be enabled. If it can't enable the API for you, it will give you a direct link to the relevant Google Cloud Console page.
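If you prefer the command line, both APIs can also be enabled directly with gcloud's `services enable` command. This is a sketch, assuming your default project is already configured and has billing enabled:

```shell
# API service names used by Kubernetes Engine and Compute Engine
GKE_API="container.googleapis.com"
COMPUTE_API="compute.googleapis.com"

# Enable both APIs for the currently configured project
gcloud services enable "$GKE_API" "$COMPUTE_API"

# Confirm they now appear among the enabled services
gcloud services list --enabled
```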
Google Kubernetes Engine (GKE)¶
GKE uses GCP compute nodes to run a Kubernetes cluster on Google Cloud infrastructure. It automatically sets up the cluster for you, and allows you to use kubectl and gcloud to manage and interact with the remote cluster.
Official Google link: https://cloud.google.com/kubernetes-engine/
Quickstart¶
As mentioned, make sure your account credentials are initialized:
gcloud init
Create a new GKE cluster:
gcloud container clusters create $CLUSTER_NAME --num-nodes=$NODES --region=us-west1
The --scopes storage-rw flag is required if you plan to use Google Cloud Storage buckets instead of S3 buckets (not currently enabled in byok8s). Note that with --region, the --num-nodes value is the node count per zone, so a regional cluster will have several times that many nodes in total.
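Putting the pieces together, a cluster creation command with bucket access might look like the following sketch. The cluster name and node count are placeholders; the `--scopes storage-rw` flag is only needed if your workflow reads or writes Google Cloud Storage buckets:

```shell
# Placeholder settings -- substitute your own values
CLUSTER_NAME="byok8s-cluster"
NODES=2

# Create a regional GKE cluster whose nodes can read and write
# Google Cloud Storage buckets (--scopes storage-rw).
# With --region, --num-nodes is the node count *per zone*.
gcloud container clusters create "$CLUSTER_NAME" \
    --num-nodes="$NODES" \
    --region=us-west1 \
    --scopes=storage-rw
```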
Cluster creation will take several minutes. Once the cluster is up, fetch its configuration details so your local Kubernetes controller can control it:
gcloud container clusters get-credentials $CLUSTER_NAME
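To confirm that your local kubectl is now pointed at the new cluster, you can check the active context and list the nodes. These are standard kubectl commands; the context name shown in the comment is only an illustration of GKE's naming pattern:

```shell
# The active context should name your GKE cluster,
# e.g. gke_<project>_us-west1_<cluster-name>.
# If kubectl is not configured yet, fall back to a placeholder.
CONTEXT="$(kubectl config current-context 2>/dev/null || echo "no-context")"
echo "Active context: $CONTEXT"

# The cluster's nodes should be listed with STATUS Ready
kubectl get nodes
```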
The cluster should now be up and running and ready to rock:
$ kubectl get pods --namespace=kube-system
NAME                                                  READY  STATUS   RESTARTS  AGE
event-exporter-v0.2.3-54f94754f4-5jczv                2/2    Running  0         4m
fluentd-gcp-scaler-6d7bbc67c5-hkllz                   1/1    Running  0         4m
fluentd-gcp-v3.1.0-48pb2                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-58dpx                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-c4b49                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-h24m5                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-hbdj4                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-rfnmt                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-vwd8w                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-wxt79                              2/2    Running  0         2m
fluentd-gcp-v3.1.0-xkt42                              2/2    Running  0         2m
heapster-v1.5.3-bc9f6bfd5-7jhqs                       3/3    Running  0         3m
kube-dns-788979dc8f-l7hch                             4/4    Running  0         4m
kube-dns-788979dc8f-pts99                             4/4    Running  0         3m
kube-dns-autoscaler-79b4b844b9-j48js                  1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-9ad2912e-130p   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-9ad2912e-lfpw   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-9ad2912e-rt9m   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-b44fa389-2ds8   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-b44fa389-hc66   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-b44fa389-vh3x   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-d58ee1e7-2kkw   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-d58ee1e7-3l6r   1/1    Running  0         4m
kube-proxy-gke-mycluster-default-pool-d58ee1e7-4w18   1/1    Running  0         4m
l7-default-backend-5d5b9874d5-ms75l                   1/1    Running  0         4m
metrics-server-v0.2.1-7486f5bd67-2n6cn                2/2    Running  0         3m
Now, assuming you have installed byok8s and it is located at ~/2019-snakemake-byok8s/, you can run the test workflow on the Kubernetes cluster:
# Return to our virtual environment
cd ~/2019-snakemake-byok8s/test/
source vp/bin/activate

# Export AWS keys for Snakemake
export AWS_ACCESS_KEY_ID="XXXXX"
export AWS_SECRET_ACCESS_KEY="XXXXX"

# Run byok8s
byok8s workflow-alpha params-blue --s3-bucket=mah-bukkit
Once the workflow has run successfully, the results will be written to your S3 bucket, and all of the Kubernetes containers created by Snakemake will be cleaned up.
If all goes well, you should see output like this:
$ byok8s --s3-bucket=mah-bukkit -f workflow-alpha params-blue
--------
details!
        snakefile: /home/ubuntu/2019-snakemake-byok8s/test/Snakefile
        config: /home/ubuntu/2019-snakemake-byok8s/test/workflow-alpha.json
        params: /home/ubuntu/2019-snakemake-byok8s/test/params-blue.json
        target: target1
        k8s namespace: default
--------
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       target1
        1
Resources before job selection: {'_cores': 1, '_nodes': 9223372036854775807}
Ready jobs (1): target1
Selected jobs (1): target1
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}

[Mon Jan 28 23:49:51 2019]
rule target1:
    output: cmr-0123/alpha.txt
    jobid: 0

echo alpha blue > cmr-0123/alpha.txt
Get status with:
kubectl describe pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
kubectl logs snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
Checking status for pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
Checking status for pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
Checking status for pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
Checking status for pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
Checking status for pod snakejob-1ab52bdb-903b-5506-b712-ccc86772dc8d
[Mon Jan 28 23:50:41 2019]
Finished job 0.
1 of 1 steps (100%) done
Complete log: /home/ubuntu/2019-snakemake-byok8s/test/.snakemake/log/2019-01-28T234950.253823.snakemake.log
unlocking
removing lock
removing lock
removed all locks
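One way to confirm that the results actually landed in your bucket is to list its contents with the AWS CLI, assuming the `aws` command is installed and configured with the same keys exported above (`mah-bukkit` is the example bucket name from the byok8s invocation):

```shell
# Example bucket name from the byok8s invocation above
S3_BUCKET="mah-bukkit"

# List the workflow outputs written to the bucket
aws s3 ls "s3://${S3_BUCKET}/" --recursive
```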
Congratulations! You've just run an executable Snakemake workflow on a Google Cloud Kubernetes cluster!
You can get more information about the containers running each step of the workflow using the kubectl describe commands printed in the output. Here is an example:
$ kubectl describe pod snakejob-c91f804c-805a-56a2-b0ea-b3b74bc38001
Name:         snakejob-c91f804c-805a-56a2-b0ea-b3b74bc38001
Namespace:    default
Node:         gke-mycluster-default-pool-b44fa389-vh3x/10.138.0.7
Start Time:   Mon, 28 Jan 2019 23:55:18 -0800
Labels:       app=snakemake
Annotations:  <none>
Status:       Running
IP:           10.0.6.4
Containers:
  snakejob-c91f804c-805a-56a2-b0ea-b3b74bc38001:
    Container ID:  docker://2aaa04c34770c6088334b29c0332dc426aff2fbbd3a8af07b65bbbc2c5fe437d
    Image:         quay.io/snakemake/snakemake:v5.4.0
    Image ID:      docker-pullable://quay.io/snakemake/snakemake@sha256:f5bb7bef99c4e45cb7dfd5b55535b8dc185b43ca610341476378a9566a8b52c5
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      cp -rf /source/. . && snakemake cmr-0123/.zetaB1 --snakefile Snakefile --force -j --keep-target-files --keep-remote --latency-wait 0 --attempt 1 --force-use-threads --wrapper-prefix None --config 'name='"'"'blue'"'"'' -p --nocolor --notemp --no-hooks --nolock --default-remote-provider S3 --default-remote-prefix cmr-0123 --allowed-rules target3sleepyB1
    State:          Running
      Started:      Mon, 28 Jan 2019 23:56:15 -0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:  0
    Environment:
      AWS_ACCESS_KEY_ID:      <set to the key 'aws_access_key_id' in secret 'e077a45f-1274-4a98-a76c-d1a9718707db'>      Optional: false
      AWS_SECRET_ACCESS_KEY:  <set to the key 'aws_secret_access_key' in secret 'e077a45f-1274-4a98-a76c-d1a9718707db'>  Optional: false
    Mounts:
      /source from source (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jmnv4 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  source:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  e077a45f-1274-4a98-a76c-d1a9718707db
    Optional:    false
  workdir:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  default
  default-token-jmnv4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jmnv4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                 Age  From                                               Message
  ----    ------                 ---- ----                                               -------
  Normal  Scheduled              63s  default-scheduler                                  Successfully assigned snakejob-c91f804c-805a-56a2-b0ea-b3b74bc38001 to gke-mycluster-default-pool-b44fa389-vh3x
  Normal  SuccessfulMountVolume  63s  kubelet, gke-mycluster-default-pool-b44fa389-vh3x  MountVolume.SetUp succeeded for volume "workdir"
  Normal  SuccessfulMountVolume  63s  kubelet, gke-mycluster-default-pool-b44fa389-vh3x  MountVolume.SetUp succeeded for volume "default-token-jmnv4"
  Normal  SuccessfulMountVolume  63s  kubelet, gke-mycluster-default-pool-b44fa389-vh3x  MountVolume.SetUp succeeded for volume "source"
  Normal  Pulling                61s  kubelet, gke-mycluster-default-pool-b44fa389-vh3x  pulling image "quay.io/snakemake/snakemake:v5.4.0"
  Normal  Pulled                 10s  kubelet, gke-mycluster-default-pool-b44fa389-vh3x  Successfully pulled image "quay.io/snakemake/snakemake:v5.4.0"
  Normal  Created                6s   kubelet, gke-mycluster-default-pool-b44fa389-vh3x  Created container
  Normal  Started                6s   kubelet, gke-mycluster-default-pool-b44fa389-vh3x  Started container
Delete the GKE cluster when you are done:
gcloud container clusters delete $CLUSTER_NAME
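Deletion also takes a few minutes. You can confirm the cluster is gone by listing your remaining clusters in the region (a standard gcloud command; the region matches the one used at creation):

```shell
# Region used when the cluster was created
REGION="us-west1"

# After deletion completes, the cluster should no longer appear here
gcloud container clusters list --region="$REGION"
```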