I don't think I expressed myself well 😅
My problem is I don't know how to run a jupyterhub Task. Basically what I want is a clearml-session but with a docker container running JupyterHub instead of JupyterLab.
Do I write a Python script? If so, how should I approach writing it? If not, what are the alternatives?
By language, I meant the syntax. What is Args and what is batch in Args/batch, and what other values exist? 😀
By commit hash, I mean the hash of the commit a task was run from. I want to refer to that commit hash, in code, from another task (started with a TriggerScheduler)
I succeeded with your instructions, so thank you!
However, we concluded that we don't want to run it through ClearML after all, so we ran it standalone.
But, I'll update you if we ever run it with ClearML so you could also provide it
Mostly the configurability of clearml-session and how it was designed. JupyterHub spawns a process on :8000, which we had to port-forward by hand, and spawning new Docker containers with JupyterHub's DockerSpawner and connecting them to the correct network (the hub should talk to them without --network host) seems too difficult or even impossible.
Oh, and there was no JupyterHub stdout in the console output on the ClearML server; it shows JupyterLab's output by default
Okay, thank you for the suggestions, we'll try it out
Yes, thank you. That's exactly what I'm referring to.
The agent is deployed on our on-premise machines
AgitatedDove14 Well, we've gotten relatively close to the goal; I suppose it wouldn't take a lot of work to support it natively
I haven't looked, I'll let you know next time it happens
Hello, a similar thing happened today. In the developer's console there was this line
https://server/api/v2.19/tasks.reset_many 504 (Gateway time-out)
Ok great. We were writing clearml triggers and they didn't work with "aborted". 😅
I would kindly suggest adding the set of all task statuses to the docs
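For reference, here is a hedged sketch of the status strings I believe the backend uses — not authoritative, just what we've observed (the exact list should be confirmed against the docs); the relevant gotcha is that the UI label "Aborted" corresponds to the API status "stopped":

```python
# Assumed set of ClearML task status strings -- my best guess, not an
# official list. Note the UI's "Aborted" shows up as "stopped" in the API.
TASK_STATUSES = {
    "created", "queued", "in_progress", "stopped",
    "completed", "published", "failed", "closed",
}

# So a trigger meant to fire on aborted tasks should match on "stopped":
print("stopped" in TASK_STATUSES)
print("aborted" in TASK_STATUSES)
```

This is why our trigger on "aborted" never fired.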
That's only a part of a solution.
You'd also have to allow specifying jupyterhub_config.py, mount it inside the container in the right place, mount the Docker socket securely to allow spawning user containers, connect them to the correct network (--network host won't work), persist the user database and user data...
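To make that concrete, here's a rough sketch of the kind of jupyterhub_config.py we'd need mounted into the hub container — treat the network name, image, hub hostname, and volume paths as placeholders for whatever the deployment uses, not a working setup:

```python
# jupyterhub_config.py -- illustrative fragment only; names and paths
# below are placeholders, not a tested configuration.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# User containers must join the same Docker network as the hub
# (plain --network host does not work here).
c.DockerSpawner.network_name = "jupyterhub-net"
c.DockerSpawner.image = "jupyter/base-notebook:latest"

# The hub's API must be reachable from the spawned containers.
c.JupyterHub.hub_connect_ip = "jupyterhub"

# Persist per-user data outside the throwaway containers.
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": "/home/jovyan/work"}
```

Even with all of that, the hub container still needs the Docker socket mounted, which is the part I'd want done securely.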
Errors pop in occasionally in the Web UI. All we see is a dialog with the text "Error"
trigger.add_task_trigger(name='export', schedule_task_id=SCHEDULE_ID, task_overrides={...})
I would like to override the commit hash of the SCHEDULE_ID task with task_overrides
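A minimal sketch of what I'm after — assuming the commit hash lives under the task's script.version_num field (that dotted field path is my guess from the task structure, not confirmed):

```python
# Hedged sketch: ClearML task fields are addressed by dotted paths in
# task_overrides; I'm assuming the commit hash field is "script.version_num".
task_overrides = {"script.version_num": "1a2b3c4d"}  # placeholder hash

# The trigger call would then look roughly like:
# trigger.add_task_trigger(name="export",
#                          schedule_task_id=SCHEDULE_ID,
#                          task_overrides=task_overrides)
print(task_overrides["script.version_num"])
```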
Thank you, I understand now :D
When installing locally, you tell pip to look for packages at that page, but the remote pip is never told that
SOLVED: It was an expired service account key in a clearml config
I think you're right; the default Elasticsearch values do not seem to work for us
It is likely you have a mismatched CUDA version. I presume you have cu113 locally but cu114 remotely. Have you run any updates lately?
I think I know why though.
ClearML tries to install a package using pip, and pip cannot find it because it's not on PyPI; it's only listed on the PyTorch download page
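A possible fix, assuming the agent's clearml.conf supports an extra pip index URL (the section layout below is my best guess — worth checking against the agent docs), would be to point the remote pip at the PyTorch download page too:

```
# clearml.conf fragment (assumed layout) -- add the PyTorch wheel index
# alongside PyPI for the agent's pip:
agent {
    package_manager {
        extra_index_url: ["https://download.pytorch.org/whl/cu113"]
    }
}
```

That would mirror what we do locally with pip's --extra-index-url.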
This was actually a reset (of a single experiment), not a delete
SuccessfulKoala55 sorry for the bump, what's the status of the fix?
Does this mean an agent only ever spins up one particular image? I'd like to define different container images for different tasks, possibly even build them as part of starting a task. Is such a thing possible?
No errors in logs, but that's because I restarted the deployment :(
I tried to build allegroai/clearml-agent-services on my laptop with ubuntu:22.04 and it failed
I guess I'll let you know the next time this happens haha
CostlyOstrich36 JupyterHub is a multi-user server that lets many users log in and spawn their own JupyterLab instances (with custom dependencies, data, etc.) for running notebooks
AgitatedDove14 no errors, because I don't know how to start 😅 I am just exploring if anyone did this before I get my hands dirty