Jupyter on Scaleway Kubernetes
July 22, 2022
In this post we will see how to run Jupyter notebooks on Scaleway Kubernetes and use the cluster autoscaler so you are only billed while your notebook is running.
Create the cluster
I'll assume you have the Scaleway CLI installed and configured on your machine. First we create the cluster and fetch the kubeconfig so we can use it with kubectl.
Create a Kapsule cluster
scw k8s cluster create name=my-super-cluster
export CLUSTER_ID=$(scw k8s cluster list name=my-super-cluster -o json | jq -r '.[0].id')
scw k8s kubeconfig get $CLUSTER_ID > kubeconfig.yaml
export KUBECONFIG=$PWD/kubeconfig.yaml
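The jq filter above simply takes the id of the first cluster matching the name. If you want to sanity-check the extraction without calling the API, you can run it against a simplified sample of the JSON the CLI returns (the id below is a made-up placeholder):

```shell
# Simplified sample of what `scw k8s cluster list -o json` returns.
sample='[{"id":"11111111-2222-3333-4444-555555555555","name":"my-super-cluster"}]'
echo "$sample" | jq -r '.[0].id'
```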
Create pools
Create a pool to run controllers
Now we will create a pool that will always have exactly one node. The purpose of this node is to run our controllers.
Here I use the DEV1_M instance type, which is the smallest instance type you can use with Kapsule.
scw k8s pool create cluster-id=$CLUSTER_ID name=controllers-pool node-type=DEV1_M size=1
Create a GPU pool to run ML workload
Here we create a GPU pool using the GPU-3070-S type. It comes with 8 cores, 16GiB of RAM and an RTX 3070 GPU, and only costs 0.85€/hour at the time of writing.
We will set the number of instances in that pool to 0, but we will activate autoscaling with a minimum of 0 and a maximum of 10.
This way the cluster autoscaler will adjust the number of instances within that range depending on the workload running in the cluster.
scw k8s pool create cluster-id=$CLUSTER_ID name=gpu-pool-1 node-type=GPU-3070-S size=0 autoscaling=true min-size=0 max-size=10
Then we need to install the NVIDIA device plugin to make the GPUs available to the Kubernetes cluster.
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
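To check that the autoscaler and the device plugin work end to end, you can apply a minimal pod that requests one GPU; the autoscaler should bring up a node in the GPU pool to schedule it. This manifest is just a sketch (the pod name and image tag are examples, not from the original post):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.4.3-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Once you delete the pod, the autoscaler will scale the GPU pool back down to 0 after its idle timeout.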
Install/Configure the helm chart
Now that we have our cluster created, it's time to install Jupyter in it.
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# get the default values file so we can customize it
helm show values jupyterhub/jupyterhub > jupyter-values.yaml
The Helm chart gives us plenty of values to play with. We will configure the types of pods the spawner can create, and the authentication.
Profile
To define the different pods that a user will be able to spawn, we change the config at singleuser.profileList.
We create two profiles: one to write code and load our data, the other to run our GPU-intensive code.
You can set the image field to use your own container image with Jupyter and all the libraries you need.
profileList:
  - display_name: "1 CPU"
    description: "Simple instance with 1 CPU"
    kubespawner_override:
      image: "put-here-your-cpu-image"
      extra_resource_limits:
        cpu: "1"
  - display_name: "8 CPU, 1 RTX 3070"
    description: "Spawn a notebook with an RTX 3070 and 8 CPUs"
    kubespawner_override:
      image: "put-here-your-gpu-image"
      extra_resource_limits:
        nvidia.com/gpu: "1"
        cpu: "8"
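By default user pods can be scheduled on any node. If you want the GPU profile to land only on the GPU pool, you can add a node selector to its override. This is a sketch: `node_selector` is a real KubeSpawner argument, but the `k8s.scaleway.com/pool-name` label is an assumption about how Kapsule labels its nodes, so check yours with `kubectl get nodes --show-labels` first:

```yaml
    - display_name: "8 CPU, 1 RTX 3070"
      kubespawner_override:
        # assumed Kapsule pool label; verify it on your nodes
        node_selector:
          k8s.scaleway.com/pool-name: gpu-pool-1
```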
Auth
You can define the type of authentication you want. In this post we will use the DummyAuthenticator, but you can configure any of the other authenticators supported by the chart (see the Zero to JupyterHub documentation).
hub:
  config:
    JupyterHub:
      admin_access: true
      authenticator_class: dummy
    DummyAuthenticator:
      password: change-this-weak-password-please
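As an example of a real authenticator, GitHub OAuth can be configured the same way; this is a sketch assuming you have registered an OAuth app on GitHub (the client id, secret and callback URL below are placeholders):

```yaml
hub:
  config:
    JupyterHub:
      authenticator_class: github
    GitHubOAuthenticator:
      client_id: your-client-id
      client_secret: your-client-secret
      oauth_callback_url: https://your-domain/hub/oauth_callback
```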
Here are some other things you may want to change:
singleuser.storage.capacity
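For instance, you can bump the size of the persistent volume each user gets; the 20Gi value below is just an example (the chart also lets you change the default UI with singleuser.defaultUrl):

```yaml
singleuser:
  storage:
    capacity: 20Gi
  defaultUrl: /lab
```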
Install the helm chart
helm upgrade --install --namespace jupyter --create-namespace --values=jupyter-values.yaml jupyterhub jupyterhub/jupyterhub