Jupyter on Scaleway Kubernetes
July 22, 2022
In this post we will see how to run Jupyter notebooks on Scaleway Kubernetes and use the cluster autoscaler so you are only billed while your notebook is running.
Create the cluster
I'll assume you have the Scaleway CLI installed and configured on your machine. First we create the cluster and fetch the kubeconfig so we can use it with kubectl.
Create a Kapsule cluster
scw k8s cluster create name=my-super-cluster
export CLUSTER_ID=$(scw k8s cluster list name=my-super-cluster -o json | jq -r '.[0].id')
scw k8s kubeconfig get $CLUSTER_ID > kubeconfig.yaml
export KUBECONFIG=$PWD/kubeconfig.yaml
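The jq filter above simply takes the id of the first cluster matching the name. If you want to sanity-check the extraction without calling the API, you can run it against a simplified sample of the JSON the CLI returns (the id below is a made-up placeholder):

```shell
# Simplified sample of what `scw k8s cluster list -o json` returns.
sample='[{"id":"11111111-2222-3333-4444-555555555555","name":"my-super-cluster"}]'
echo "$sample" | jq -r '.[0].id'
```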
Create pools
Create a pool to run controllers
Now we will create a pool that will always have exactly one node. The purpose of this node is to run our controllers.
Here I use the DEV1_M instance type, which is the smallest instance type you can use with Kapsule.
scw k8s pool create cluster-id=$CLUSTER_ID name=controllers-pool node-type=DEV1_M size=1
Create a GPU pool to run ML workload
Here we create a GPU pool using the GPU-3070-S type. It comes with 8 cores, 16GiB of RAM and an RTX 3070 GPU, and only costs 0.85€/hour at the time of writing.
We will set the number of instances in that pool to 0, but we will activate autoscaling with a minimum of 0 and a maximum of 10.
This way the cluster autoscaler will adjust the number of instances within that range depending on the workload running in the cluster.
scw k8s pool create cluster-id=$CLUSTER_ID name=gpu-pool-1 node-type=GPU-3070-S size=0 autoscaling=true min-size=0 max-size=10
Then we need to install the NVIDIA device plugin to make the GPUs available to the Kubernetes cluster.
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
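To check that the autoscaler and the device plugin work end to end, you can apply a minimal pod that requests one GPU; the autoscaler should bring up a node in the GPU pool to schedule it. This manifest is just a sketch (the pod name and image tag are examples, not from the original post):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.4.3-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Once you delete the pod, the autoscaler will scale the GPU pool back down to 0 after its idle timeout.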
Install/Configure the helm chart
Now that we have our cluster created, it's time to install Jupyter in it.
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# get the default values file so we can customize it
helm show values jupyterhub/jupyterhub > jupyter-values.yaml
The Helm chart gives us plenty of values to play with. We will configure the types of pods the spawner can create, and the authentication.
Profile
To define the different pods that a user will be able to spawn, we change the config at singleuser.profileList.
We create two profiles: one to write code and load our data, the other to run our GPU-intensive code.
You can set the image field to use your own container image with Jupyter and all the libraries you need.
profileList:
  - display_name: "1 CPU"
    description: "Simple instance with 1 CPU"
    kubespawner_override:
      image: "put-here-your-cpu-image"
      extra_resource_limits:
        cpu: "1"
  - display_name: "8 CPU, 1 RTX 3070"
    description: "Spawn a notebook with an RTX 3070 and 8 CPUs"
    kubespawner_override:
      image: "put-here-your-gpu-image"
      extra_resource_limits:
        nvidia.com/gpu: "1"
        cpu: "8"
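By default user pods can be scheduled on any node. If you want the GPU profile to land only on the GPU pool, you can add a node selector to its override. This is a sketch: `node_selector` is a real KubeSpawner argument, but the `k8s.scaleway.com/pool-name` label is an assumption about how Kapsule labels its nodes, so check yours with `kubectl get nodes --show-labels` first:

```yaml
    - display_name: "8 CPU, 1 RTX 3070"
      kubespawner_override:
        # assumed Kapsule pool label; verify it on your nodes
        node_selector:
          k8s.scaleway.com/pool-name: gpu-pool-1
```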
Auth
You can define the type of authentication you want. In this post we will use the DummyAuthenticator, but you can configure any of the other authenticators supported by the chart (see the Zero to JupyterHub documentation).
hub:
  config:
    JupyterHub:
      admin_access: true
      authenticator_class: dummy
    DummyAuthenticator:
      password: change-this-weak-password-please
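As an example of a real authenticator, GitHub OAuth can be configured the same way; this is a sketch assuming you have registered an OAuth app on GitHub (the client id, secret and callback URL below are placeholders):

```yaml
hub:
  config:
    JupyterHub:
      authenticator_class: github
    GitHubOAuthenticator:
      client_id: your-client-id
      client_secret: your-client-secret
      oauth_callback_url: https://your-domain/hub/oauth_callback
```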
Here are some other things you may want to change:
singleuser.storage.capacity
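For instance, you can bump the size of the persistent volume each user gets; the 20Gi value below is just an example (the chart also lets you change the default UI with singleuser.defaultUrl):

```yaml
singleuser:
  storage:
    capacity: 20Gi
  defaultUrl: /lab
```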
Install the helm chart
helm upgrade --install --namespace jupyter --create-namespace --values=jupyter-values.yaml jupyterhub jupyterhub/jupyterhub