
Thanks Martin. This is the first step out of many…
So it looks like the server is there (docker ps) and I can see the artifacts (web UI), but I’m not sure where things are stored: the /mnt/fileserver path from the documentation does not exist (?)
@<1523701087100473344:profile|SuccessfulKoala55> thanks so much for your reply. I can see now the source of my confusion:
After I finished deploying the server in AWS, the next step on that page is “configuring ClearML for [ ClearM...
so it’s not intuitive to me to try Hydra/params.batch_size
I will try it nonetheless as you suggested.
I just ran a dummy experiment logging images, plots, etc and I can see them in my server’s Web UI.
Responding to my own question, in case someone else has the same issue. You have to edit the security group and enable TCP 8080.
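For anyone hitting the same problem, a quick way to check whether the security group change took effect is to test the ports from your own machine. This is only a minimal sketch using the Python standard library; `port_is_open` is a hypothetical helper, not part of ClearML, and the port list assumes the default ClearML layout (8080 web UI, 8008 API server, 8081 fileserver).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (replace the hypothetical address with your server's public IP):
#   for port in (8080, 8008, 8081):
#       print(port, port_is_open("203.0.113.10", port))
```

A port that shows as closed here while the container is up (docker ps) usually points at the security group or a firewall rather than at the ClearML server itself.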
I haven’t figured out the missing fileserver yet :man-shrugging:
3fdcf5db64d allegroai/clearml:1.12.1-397 “/opt/clearml/wrappe…” 10 days ago Up 9 minutes 8008/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp clearml-fileserver
ok so the documentation is confusing here:
@<1523701205467926528:profile|AgitatedDove14> Got the overrides working with Hydra/params.batch_size
thank you 🙏
If that is the case, it explains why I see /opt/clearml/data/fileserver but no /mnt/fileserver
….
so if I want to refer to batch_size in my_hydra_config.yaml:
    # dummy config file
    trainer:
      params:
        batch_size: 32
do I pass this to the HyperParameterOptimizer as Hydra/trainer/params/batch_size ??
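Elsewhere in this thread, Hydra/params.batch_size is reported to work, which suggests the name is the section, a slash, then the dotted path as in a Hydra override. Assuming the same convention holds for the nested dummy config above, the name would be Hydra/trainer.params.batch_size rather than Hydra/trainer/params/batch_size. The sketch below only builds such names from a plain dict; `hydra_param_names` is a hypothetical helper, not a ClearML or Hydra function.

```python
from typing import Any, Dict, Iterator, Tuple


def hydra_param_names(
    cfg: Dict[str, Any], section: str = "Hydra", prefix: str = ""
) -> Iterator[Tuple[str, Any]]:
    """Yield ("<section>/<dotted.path>", value) for every leaf of a nested config dict."""
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted path
            yield from hydra_param_names(value, section, path)
        else:
            yield f"{section}/{path}", value


# The dummy config from the message above, as a plain dict
config = {"trainer": {"params": {"batch_size": 32}}}
names = dict(hydra_param_names(config))
print(names)  # {'Hydra/trainer.params.batch_size': 32}
```

The resulting string is what you would try as the parameter name in the optimizer's search space; whether your ClearML version accepts it is worth confirming on a single run first.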
@<1523701205467926528:profile|AgitatedDove14> 👆 ? Thanks
From this video tutorial:
“…the name of the hyperparameter consists of the section it is reported to, followed by a slash, then its name…”
So following that confuses me, because I can’t see my Hydra parameters under Hyperparameters > Hydra, and this is why I thought, ok well, perhaps use OmegaConf/params.batch_size
Is this another opportunity to improve the documentation? Happy to help if so.
Do you have any insights on the missing fileserver @<1523701205467926528:profile|AgitatedDove14> ?
Thanks @<1523701205467926528:profile|AgitatedDove14> reading …
What I am referring to is this information about the Storage Configuration:
This is what I see:
@<1523701205467926528:profile|AgitatedDove14> None
Thanks @<1523701205467926528:profile|AgitatedDove14> happy to PR on the docs 😉
Will this work?
task.connect(OmegaConf.to_object(cfg))
assuming cfg is my Hydra dict
I see this in the docker-compose.yml file:
  fileserver:
    networks:
      - backend
      - frontend
    command:
      - fileserver
    container_name: clearml-fileserver
    image: allegroai/clearml:1.12.1-397
    environment:
      CLEARML__fileserver__delete__allow_batch: "true"
    restart: unless-stopped
    volumes:
      - /opt/clearml/logs:/var/log/clearml
      - /opt/clearml/data/fileserver:/mnt/fileserver
      - /opt/clearml/config:/opt/clearml/config
    ports:
      - "8081:...
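The volumes entries above are host:container pairs, which would explain the confusion: /mnt/fileserver only exists inside the clearml-fileserver container, while the same files live under /opt/clearml/data/fileserver on the host. A minimal sketch of that mapping, assuming plain host:container bind-mount strings as in the compose file; `host_path_for` and the artifact path are hypothetical, for illustration only.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def host_path_for(container_path: str, volumes: List[str]) -> Optional[str]:
    """Translate a path inside the container to the host path backing it, if any."""
    target = PurePosixPath(container_path)
    for spec in volumes:
        # Each spec is "host_path:container_path" (an optional ":ro" suffix is ignored)
        host, container = spec.split(":")[:2]
        mount = PurePosixPath(container)
        if target == mount or mount in target.parents:
            return str(PurePosixPath(host) / target.relative_to(mount))
    return None  # the path is not backed by any listed bind mount


# Volume list copied from the fileserver service above
volumes = [
    "/opt/clearml/logs:/var/log/clearml",
    "/opt/clearml/data/fileserver:/mnt/fileserver",
    "/opt/clearml/config:/opt/clearml/config",
]

# Hypothetical artifact path as seen from inside the container
print(host_path_for("/mnt/fileserver/my_project/artifact.bin", volumes))
```

So when looking for uploaded files on the EC2 host, /opt/clearml/data/fileserver is the directory to inspect; /mnt/fileserver is only visible from a shell inside the container.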
Hey @<1523701205467926528:profile|AgitatedDove14> in the WebUI the Hydra configuration object is under CONFIGURATION OBJECTS > OmegaConf
So should this be OmegaConf/trainer.batch_size ?
Hi @<1523701205467926528:profile|AgitatedDove14> , I see _allow_omegaconf_edit_ under HYPERPARAMETERS > Hydra
Hi @<1523701087100473344:profile|SuccessfulKoala55> it’s failing again. I haven’t rebooted the agent or changed anything, and I am able to connect over SSH with ssh -vT git@github.com in a different tmux session.
This is the error I am seeing when running the agent with the --debug flag:
Using cached repository in "/home/ubuntu/.clearml/vcs-cache/clearml-tutorial.git.e1c2351b09f3d661b6f0dbf85e92be2e/clearml-tutorial.git"
git@github.com: Permission denied (pub...