
@<1523701205467926528:profile|AgitatedDove14> Yes, but that is not allowed (together with not clone), as per the current implementation 😄
That would (likely) work, yes… if it worked 🙂 However, remote_execute kills the thread, so the multirun stops at the first sub-task.
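The failure mode described here can be sketched with a stdlib-only toy (the names and the sweep values are mine, not from the thread): because the remote-execution call ends the calling process, a multirun loop never gets past its first sub-task.

```python
import sys

# Toy simulation of why a multirun dies after the first sub-task:
# the remote-execution call terminates the local process, stood in
# for here by sys.exit().
def fake_execute_remotely():
    sys.exit(0)  # the real call exits the process after enqueuing the task

completed = []
try:
    for trial in ["lr=0.1", "lr=0.01", "lr=0.001"]:
        completed.append(trial)
        fake_execute_remotely()  # the process would stop here
except SystemExit:
    pass  # caught only so this demo can report how far the loop got

print(completed)  # only the first trial ever started
```

The later sub-tasks are never reached, which matches the behaviour reported above.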
@<1523701205467926528:profile|AgitatedDove14> Because I want to schedule each sweep job as a task for remote execution, so that each task can run in parallel on a worker.
CostlyOstrich36 Yes, I manually updated the port mapping in the docker-compose YAML. An alternative would be to keep port 8080 in the config and instead, on the server, forward all requests from 8080 to 80.
I believe that is the right terminology, yes.
SuccessfulKoala55 At peak we’ve been running ~50 experiments simultaneously that have been somewhat generous in reported metrics, although not extreme. Our CML server is hosted on an Azure D2S_v3 VM (2 vCPU, 8 GB RAM, 3200 IOPS). Looks like we should probably upgrade, especially the disk specs. (Taking another look at our VM metrics, we reached 100% OS disk IOPS consumed a couple of times.)
CostlyOstrich36 My particular Python error is due to a mismatch between my torch and lightning versions. But the real issue is that I don't have exact control over which version gets installed.
I'll do that. As a temporary workaround I'll create/schedule the tasks from an external script, and avoid using hydra multi-runs. (Which is a pity, so I'll be looking forward to a fix 😉 )
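The workaround mentioned here could look roughly like this: instead of relying on the hydra multirun, an external script enumerates the sweep grid itself, then clones a template task per point and enqueues each clone so workers pick them up in parallel. The grid values, project/queue names, and the `General/…` parameter prefix are assumptions for illustration, not taken from the thread, and the scheduling half needs a reachable ClearML server.

```python
import itertools

# Hypothetical sweep grid (stand-in values, not from the thread).
GRID = {"lr": [0.1, 0.01], "batch_size": [32, 64]}

def sweep_points(grid):
    """Yield one {param: value} dict per combination in the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def schedule_sweep(template_task_id, queue="default"):
    # Imported here so the grid logic above stays runnable without ClearML.
    from clearml import Task
    for i, point in enumerate(sweep_points(GRID)):
        task = Task.clone(source_task=template_task_id, name=f"sweep-{i}")
        # Override the clone's hyperparameters with this grid point.
        task.set_parameters({f"General/{k}": v for k, v in point.items()})
        Task.enqueue(task, queue_name=queue)  # workers run these in parallel

print(len(list(sweep_points(GRID))))  # 4 combinations for the grid above
```

Each clone runs independently on whichever worker dequeues it, which sidesteps the thread-killing problem entirely.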