
ok so the documentation is confusing here:
This is what I see:
Hi @<1523701205467926528:profile|AgitatedDove14> thanks for your reply. I can see this is an issue with torch 2.0.1, because it does not install the needed CUDA dependencies:
Adding this info here, in case anyone here has this issue. It looks like switching to torch 2.0.0 fixes the issue. I will update here after I test that. Thanks again 🙏
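(For anyone following along, the switch is just a version pin, assuming a plain pip install:)
pip install torch==2.0.0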
@<1523701087100473344:profile|SuccessfulKoala55> thanks so much for your reply. I can see now the source of my confusion:
After I finished deploying the server in AWS, the next step on that page is “configuring ClearML for [ClearM...
so if I want to refer to batch_size in my_hydra_config.yaml:
# dummy config file
trainer:
  params:
    batch_size: 32
do I pass this to the HyperParameterOptimizer as Hydra/trainer/params/batch_size?
@<1523701205467926528:profile|AgitatedDove14> 👆 ? Thanks
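For context, this is roughly how I’d wire that name into the optimizer; a minimal sketch where the base task id, queue, and objective metric are placeholders, and Hydra/trainer/params/batch_size is exactly the path I’m unsure about:
from clearml.automation import DiscreteParameterRange, HyperParameterOptimizer

optimizer = HyperParameterOptimizer(
    base_task_id='BASE_TASK_ID',  # placeholder: the template experiment to clone
    hyper_parameters=[
        # the parameter name below is the path in question
        DiscreteParameterRange('Hydra/trainer/params/batch_size', values=[16, 32, 64]),
    ],
    objective_metric_title='validation',  # placeholder metric title
    objective_metric_series='loss',       # placeholder metric series
    objective_metric_sign='min',
    execution_queue='default',            # placeholder queue name
)
optimizer.start()
optimizer.wait()
optimizer.stop()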
Do you have any insights on the missing fileserver @<1523701205467926528:profile|AgitatedDove14> ?
Hi @<1523701435869433856:profile|SmugDolphin23> thanks for your answer. I am not sure that I understand. I ran a test by cloning an experiment and editing the OmegaConf object under Configuration > Hyperparameters > OmegaConf. Unless I also change the allow_omegaconf_edit flag to True, I won’t see my changes reflected. That is my question: as a new user, it seems counterintuitive that I also have to change the flag. Does this make sense to you? Thanks.
if that were the case, it explains why I see /opt/clearml/data/fileserver but no /mnt/fileserver…
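One way to check where the fileserver data actually lives (assuming the default container name from the docker-compose, clearml-fileserver):
# list the container's mounts to see the host path backing the fileserver
docker inspect -f '{{ json .Mounts }}' clearml-fileserver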
Will this work?
task.connect(OmegaConf.to_object(cfg))
assuming cfg is my Hydra dict
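i.e. something like this as a sketch (project/task names are placeholders, and I’m assuming connect flattens the nested dict into the UI):
from clearml import Task
from omegaconf import OmegaConf

task = Task.init(project_name='demo', task_name='hydra-connect')  # placeholder names
cfg_dict = OmegaConf.to_object(cfg)  # plain nested dict, e.g. {'trainer': {'params': {'batch_size': 32}}}
task.connect(cfg_dict)               # the nested keys should then appear as hyperparameters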
Thanks @<1523701205467926528:profile|AgitatedDove14> happy to PR on the docs 😉
so it’s not intuitive to me to try Hydra/params.batch_size
I will try it nonetheless as you suggested.
@<1523701087100473344:profile|SuccessfulKoala55> I changed my agent to poetry mode and it worked like magic. Thanks Jake!
hmm….. probably simpler/cleaner if I do
hpo_params = {
    'param1': cfg.param_1, ...
}
task.connect(hpo_params)
Thoughts?
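Fleshed out, what I have in mind is something like this (the cfg fields are hypothetical, and I’m relying on connect returning the dict so remote runs see the updated values):
from clearml import Task

task = Task.init(project_name='demo', task_name='hpo-base')  # placeholder names
hpo_params = {
    'param1': cfg.param_1,  # hypothetical config fields
    'param2': cfg.param_2,
}
hpo_params = task.connect(hpo_params)  # returns the dict, updated when cloned/run by an optimizer
param1 = hpo_params['param1']          # use the connected values in the code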
@<1523701205467926528:profile|AgitatedDove14> None
Hey @<1523701205467926528:profile|AgitatedDove14> in the WebUI the hydra configuration object is under CONFIGURATION OBJECTS > OmegaConf
So should this be OmegaConf/trainer.batch_size?
Thanks @<1523701205467926528:profile|AgitatedDove14> reading …
@<1547028031053238272:profile|MassiveGoldfish6> check this:
- Does your local clearml.conf use use_credentials_chain: true? (sketch after this list)
- Do you have the needed AWS credentials in your local environment?
- Do you have an S3 bucket as the storage for your project (did you set this up when you created the project)?
- Do your local AWS credentials give you write access to that S3 bucket?
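For reference, the use_credentials_chain bit in clearml.conf would look roughly like this (a sketch, assuming the standard sdk.aws.s3 layout):
sdk {
    aws {
        s3 {
            # let boto3 resolve credentials from the environment/instance profile
            use_credentials_chain: true
            # or set explicit credentials instead:
            # key: "..."
            # secret: "..."
        }
    }
}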
so it looks like the server is there (docker ps) and I can see the artifacts (web UI), but I’m not sure where things are: per the documentation there should be a /mnt/fileserver, but there isn’t (?)
hmmm… probably not if I don’t have a reference that clearml can update, right?…
What about:
hpo_params = OmegaConf.to_object(cfg)
...
task.connect(hpo_params)
And then I use hpo_params in the code. This way I give clearml a chance to update the object.
Would this work? Thanks
sorry, I am a noob and not sure how I can do that, but happy to help if I can
Hi @<1523701087100473344:profile|SuccessfulKoala55> thanks for your response. What I mean is that in the Web UI, when you are creating a project, there is a storage (S3) field at the bottom of the create-project pop-up, where you enter the S3 bucket you want to associate with the project. Now, the thing is, you can’t visualize that information anywhere in the UI after the project is created, as far as I can tell. So it would be great to be able to see the configured bucket somewhere in...
Hey @<1593051292383580160:profile|SoreSparrow36> I am trying to test whether the S3 storage also gets deleted when I delete a project. But I am not sure this is even a good assumption, as I haven’t found anywhere what the expected/default behaviour is. Do you happen to know anything about this? Thanks.
A related question… how does the server know how to delete artifacts when the project is deleted, if it doesn’t have a clearml.conf with the S3 credentials to do so?
but from a terminal I can do:
ubuntu@***:~/sw/clearml-tutorial$ git fetch --all --recurse-submodules
Fetching origin
and it works