
No errors in logs, but that's because I restarted the deployment :(
Okay, I found your Twitter profile pic to be adequate after upsampling. Thank you and sorry 😅
It's not because of the remote machine, it's the requirements 😅 As I said, the package is not on PyPI. Try adding this at the top of your requirements.txt:
-f
torch==1.12.1+cu113 ...other deps...
We didn't change a thing from the defaults that are in your GitHub 😄 So it's 500M?
It is likely you have a mismatched CUDA version. I presume you have cu113 locally but cu114 remotely. Did you run any updates lately?
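One way to confirm a mismatch like this is to compare the build tag after the "+" in the installed torch version on both machines. A minimal sketch, assuming version strings of the form shown above (1.12.1+cu113); the helper names cuda_tag and same_cuda_build are made up for illustration:

```python
def cuda_tag(version: str):
    """Return the local build tag of a version string, e.g. 'cu113', or None."""
    _, _, local = version.partition("+")
    return local or None


def same_cuda_build(local_version: str, remote_version: str) -> bool:
    """True only when both versions carry the same, non-empty build tag."""
    local_tag = cuda_tag(local_version)
    return local_tag is not None and local_tag == cuda_tag(remote_version)


# Hypothetical values; in practice paste the output of
# `python -c "import torch; print(torch.__version__)"` from each machine.
print(same_cuda_build("1.12.1+cu113", "1.12.1+cu113"))
print(same_cuda_build("1.12.1+cu113", "1.12.1"))
```

A version without a "+" tag is treated as a mismatch here, since that usually means a different wheel was installed.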
For now, docker compose down && docker compose up -d helps
I haven't looked, I'll let you know next time it happens
Ok great. We were writing ClearML triggers and they didn't work with "aborted". 😅
I would kindly suggest adding a list of all statuses to the docs
I think you're right, the default Elastic values do not seem to work for us
Okay, thank you for the suggestions, we'll try it out
Yeah, you are right.
We use an empty queue to enqueue our tasks in, just to trigger the scheduler 😅 Its only importance is that the experiment is not enqueued anywhere else; the trigger then enqueues it
It's just that the trigger is never triggered
(Except when a new task is created - this was not the case)
By language, I meant the syntax. What is Args and what is batch in Args/batch, and what other values exist 😀
By commit hash, I mean the hash of the commit a task was run from. I wish to refer to that commit hash in code, in another task (started with a TriggerScheduler)
Yeah, sorry, I typoed 😅 "newer than 18.04" is what I was supposed to say
What I meant was that we rebuilt them with 22.04
I tried to build allegroai/clearml-agent-services on my laptop with ubuntu:22.04 and it failed
Errors pop up occasionally in the Web UI. All we see is a dialog with the text "Error"
trigger.add_task_trigger(name='export', schedule_task_id=SCHEDULE_ID, task_overrides={...})
I would like to override the commit hash of the SCHEDULE_ID with task_overrides
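A minimal sketch of what such an override dict could look like. The "script.version_num" and "script.branch" keys are an assumption based on the example in the ClearML scheduler docs for task_overrides, and build_commit_overrides is a hypothetical helper, not part of ClearML:

```python
def build_commit_overrides(commit_hash: str, branch=None) -> dict:
    """Build a task_overrides dict that pins the cloned task to one commit.

    Assumption: 'script.version_num' holds the commit hash and
    'script.branch' the branch name on the task being cloned.
    """
    commit = (commit_hash or "").strip()
    if not commit:
        raise ValueError("commit_hash must be a non-empty string")
    overrides = {"script.version_num": commit}
    if branch is not None:
        overrides["script.branch"] = branch
    return overrides


# Intended use, mirroring the call above (needs a ClearML server, not run here):
# trigger.add_task_trigger(name='export', schedule_task_id=SCHEDULE_ID,
#                          task_overrides=build_commit_overrides(source_commit))
print(build_commit_overrides("0123abc"))
```

Here source_commit stands for the hash read from the task that fired the trigger; how to obtain it is left open.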
I succeeded with your instructions, so thank you!
However, we concluded that we don't want to run it through ClearML after all, so we ran it standalone.
But I'll update you if we ever run it with ClearML, so you can also provide it
CostlyOstrich36 JupyterHub is a multi-user server, which allows many users to log in and spawn their own JupyterLab instances (with custom dependencies, data, etc.) for running notebooks
AgitatedDove14 no errors, because I don't know how to start 😅 I am just exploring if anyone did this before I get my hands dirty
You are not missing anything, it is what we would like to have, to allow multiple people to have their own notebook servers. We have multiple people doing different experiments, and JupyterHub would be their "playground" environment
We've successfully deployed it without Helm, with a custom-made docker-compose and makefiles 😄
AgitatedDove14 Well, we have gotten relatively close to the goal, I suppose you wouldn't have to do a lot of work to support it natively
I guess I'll let you know the next time this happens haha