Eureka! We've successfully deployed it without Helm, with a custom-made docker-compose and makefiles 😄
That's only part of the solution. You'd also have to allow specifying jupyterhub_config.py, mount it inside the container in the right place, mount the Docker socket in a secure manner to allow spawning user containers, connect them to the correct network ( --host won't work), and persist the user database and user data...
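To make the pieces above concrete, here is a hypothetical docker-compose sketch of such a deployment. The service, network, and volume names are made up (not taken from any official chart); the config path and socket mount reflect the standard JupyterHub/DockerSpawner setup, but treat this as a starting point, not a hardened deployment:

```yaml
# Hypothetical sketch only; names are illustrative.
services:
  jupyterhub:
    image: jupyterhub/jupyterhub:4
    volumes:
      - ./jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro
      # Needed by DockerSpawner to spawn user containers; exposing the
      # socket is a security risk, consider a docker-socket-proxy instead.
      - /var/run/docker.sock:/var/run/docker.sock
      # Persists the hub's user database (jupyterhub.sqlite).
      - hub-data:/srv/jupyterhub
    networks:
      - jupyterhub-net
    ports:
      - "8000:8000"

networks:
  jupyterhub-net:
    # Spawned user containers must join this same network so the hub
    # can reach them without --network host.
    name: jupyterhub-net

volumes:
  hub-data:
```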
CostlyOstrich36 JupyterHub is a multi-user server which allows many users to log in and spawn their own JupyterLab instances (with custom dependencies, data, etc.) for running notebooks
AgitatedDove14 no errors, because I don't know how to start 😅 I am just exploring if anyone did this before I get my hands dirty
SuccessfulKoala55 sorry for the bump, what's the status of the fix?
It's not because of the remote machine, it's the requirements 😅 As I said, the package is not on PyPI. Try adding this at the top of your requirements.txt:
-f
torch==1.12.1+cu113 ...other deps...
Yes, thank you. That's exactly what I'm referring to.
The agent is deployed on our on-premise machines
AgitatedDove14 Well, we have gotten relatively close to the goal; I suppose you wouldn't have to do a lot of work to support it natively
To answer myself: the first part, task.get_parameters(), retrieves all the arguments which can be set. The key syntax seems to be Args/{argparse destination}. However, this does not return the commit hash :((
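To illustrate that key convention without needing a live ClearML server, here is a sketch using a plain dict as a stand-in for what task.get_parameters() returns; the parameter names are made up. (As noted above, the commit hash is not in this dict at all; it lives on the task's script metadata, not in the parameters.)

```python
# Stand-in for the dict returned by task.get_parameters(); argparse-backed
# parameters are flattened under the "Args/" section, other sections exist too.
params = {
    "Args/batch_size": "32",        # hypothetical argparse destinations
    "Args/learning_rate": "0.001",
    "General/epochs": "10",
}

# Recover just the argparse arguments by stripping the section prefix.
args = {k.split("/", 1)[1]: v for k, v in params.items() if k.startswith("Args/")}
print(args)  # {'batch_size': '32', 'learning_rate': '0.001'}
```

Note that the values come back as strings, so they need casting before use.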
No errors in logs, but that's because I restarted the deployment :(
I guess I'll let you know the next time this happens haha
Hello, a similar thing happened today. In the developer console there was this line:
https://server/api/v2.19/tasks.reset_many 504 (Gateway time-out)
This was actually a reset (of one experiment), not a delete
I think you're right, the default Elasticsearch values do not seem to work for us
Okay, thank you for the suggestions, we'll try them out
For now, docker compose down && docker compose up -d helps
But do consider a sort of designer's press kit on your page haha
This means that an agent only ever spins up one particular image? I'd like to define different container images for different tasks, possibly even build them in the process of starting a task. Is such a thing possible?
It could work, but Slack demands a minimum of 512×512
CostlyOstrich36 this sounds great. How do I accomplish that?
Mostly the configurability of clearml-session and how it was designed. JupyterHub spawns a process at :8000 which we had to port-forward by hand, but spawning new Docker containers using jupyterhub.Dockerspawner and connecting them to the correct network (the hub should talk to them without --network host ) seems too difficult or even impossible.
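For what it's worth, the networking part can in principle be handled in the hub's own config rather than with --network host. Below is a minimal sketch of the relevant jupyterhub_config.py settings (these traits do exist in JupyterHub/DockerSpawner, but the network name here is made up and must match whatever network your compose file creates). In a real config, `c` is injected by JupyterHub; it is faked here only so the snippet is self-contained:

```python
from types import SimpleNamespace

class _Cfg(SimpleNamespace):
    """Stand-in for JupyterHub's injected `c` config object."""
    def __getattr__(self, name):
        val = _Cfg()
        setattr(self, name, val)
        return val

c = _Cfg()  # in a real jupyterhub_config.py, delete this; `c` already exists

# Spawn each user's server as a Docker container.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
# Attach spawned containers to a shared user network (made-up name).
c.DockerSpawner.network_name = "jupyterhub-net"
# DNS name the spawned containers use to reach the hub on that network.
c.JupyterHub.hub_connect_ip = "jupyterhub"
# Remove stopped user containers instead of leaving them around.
c.DockerSpawner.remove = True
```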
Oh, and there was no JupyterHub stdout in the console output on the ClearML server; it shows JupyterLab's output by default
Yes, that's right. We deployed it on a GCP instance
Nothing at all. There are only 2 logs from this day, and all were at 2am
We didn't change a thing from the defaults in your GitHub repo 😄 so it's 500M?
I haven't looked, I'll let you know next time it happens
Errors pop up occasionally in the Web UI. All we see is a dialog with the text "Error"