Hi AgitatedDove14. I'm just writing to explain what the problem was. Our setup is JupyterHub on k8s with KubeSpawner, which spawns a pod for each single-user notebook server, using Docker images based on jupyter/docker-stacks.
The problem was that the JupyterHub API token was not propagated to the spawned pod, so whenever clearml tried to access the jupyter/user/api/sessions endpoint it was redirected to the JupyterHub API for authorization and then failed for lack of the required token. The other problem was that the Jupyter runtime directory was defaulting to a shared folder, so whenever there were more than two single-user server pods, the list_running_servers() function failed because the shared file could not be loaded. clearml uses the info from list_running_servers() to know where to send requests, and since it could not read the file, it failed.
Our solution was to create a new runtime directory for Jupyter, this time not in a shared folder, and point Jupyter at it with an env variable. The JupyterHub API token is then added by directly modifying the nbserver-x.json file in the runtime directory. (JupyterHub passes the token to the single-user servers as the env variable "JUPYTERHUB_API_TOKEN".)
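For anyone hitting the same thing, here is a rough sketch of what such a patch step could look like, e.g. run from a postStart hook. It is not the exact script from our setup. The function name `add_token_to_server_files`, the fallback path `/tmp/jupyter-runtime`, and the choice of `JUPYTER_RUNTIME_DIR` as the env variable are assumptions made for illustration. It also assumes the server info files are named `nbserver-*.json` and carry a `token` field.

```python
import glob
import json
import os


def add_token_to_server_files(runtime_dir, token):
    """Write `token` into every nbserver-*.json file in runtime_dir.

    Hypothetical helper: returns the list of files that were patched.
    """
    patched = []
    pattern = os.path.join(runtime_dir, "nbserver-*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            info = json.load(f)
        # Overwrite (or add) the token that clients read from this file.
        info["token"] = token
        with open(path, "w") as f:
            json.dump(info, f, indent=2)
        patched.append(path)
    return patched


if __name__ == "__main__":
    # Assumption: the per-pod (non-shared) runtime dir is exposed via
    # JUPYTER_RUNTIME_DIR; the fallback path is only a placeholder.
    runtime_dir = os.environ.get("JUPYTER_RUNTIME_DIR", "/tmp/jupyter-runtime")
    token = os.environ.get("JUPYTERHUB_API_TOKEN", "")
    if token:
        for path in add_token_to_server_files(runtime_dir, token):
            print("patched", path)
```

The script does nothing when `JUPYTERHUB_API_TOKEN` is unset, so it is safe to run outside the hub as well.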
So as you can see the issue is on the jupyterhub side.
Those were my thoughts too. But the jupyter/base-notebook image from docker-stacks, which they recommend and which my image inherits from, did not include the token in the jupyter lab run command. I don't know whether that was a bug or an intentional choice, but I was going to either change the base image or add the token in a postStart hook. I decided to go with the second option 😉
GreasyPenguin66 Nice !!!
Very cool setup, and kudos on making it work with multiple users!
Quick question, shouldn't the JUPYTERHUB_API_TOKEN env variable be enough to gain access to the server? Why did you need to add it to the 'nbserver-x.json' as well?
So the way clearml can store your notebook is by using the jupyter-notebook REST API. It assumes that it can communicate with it, as the kernel is running on the same machine. What exactly is the setup? Is jupyter-lab/notebook running inside the docker? Maybe the docker itself is running with some --network argument?
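To make the REST API part concrete, here is a minimal sketch of a token-authenticated call to the sessions endpoint. This is not clearml's actual code. The helper names `build_sessions_request` and `list_sessions` are made up for illustration, and it assumes the server accepts the token in an `Authorization: token <value>` header.

```python
import json
import urllib.request
from urllib.parse import urljoin


def build_sessions_request(server_url, token):
    """Build (but do not send) a GET request for <server_url>/api/sessions."""
    base = server_url if server_url.endswith("/") else server_url + "/"
    req = urllib.request.Request(urljoin(base, "api/sessions"))
    if token:
        # Without a valid token the server redirects to a login page instead.
        req.add_header("Authorization", "token " + token)
    return req


def list_sessions(server_url, token):
    """Send the request and return the decoded JSON list of sessions."""
    req = build_sessions_request(server_url, token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

If a call like `list_sessions(url, token)` works from inside the container, using the url and token found in the server info file, then the notebook lookup should have what it needs.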