Yes! I will take a look at it!
AgitatedDove14 from this thread I understand that Hydra is not supported, so overriding the parameters from the UI won't work. Is there still a way to track the parameters and add them to the experiment? Will task.connect_configuration work with the YAML files?
Just to make sure I get everything right, AgitatedDove14:
1. We have to define the Task inside the function decorated with @hydra.main.
2. We can modify the parameters that are overridden in the UI under: Configuration tab -> Args -> overrides -> modify the list.
Additional question: will the sweep functionality work?
Oh I think I am wrong! Then it must be the clearml monitoring. Still it fails way before the timer ends.
I get the URL to the checkpoint/weights. Can I use this to download the weights?
Hey AgitatedDove14, do you have an implementation for gcloud? This is awesome.
There are also ways to override the parameters, as described here: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html#use-of-command-line-arguments
AgitatedDove14 downloading a dataset would not be possible using this, right? I want to be able to access the data and just avoid reporting the experiment results.
Managed to get:
clearml_agent: ERROR: Command '['/home/ramon/.clearml/venvs-builds/3.9/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/var/tmp/requirements_tb0x2i3j.txt', '--extra-index-url', '
died with <Signals.SIGKILL: 9>.
while building the task with the id on the agent
Awesome AgitatedDove14, thanks a lot!
If you try: ModelCheckpoint('best_model.hdf5', save_best_only=True) does it work too?
Is this caused by running the script with the arguments?
I am about to try everything AgitatedDove14, but I ran into a GitLab error from the agent. I added the username and password to the configuration file but still get a "Host key verification failed" error. Is it common that the cloning message shows the SSH link instead of the HTTPS one when username and password are provided?
Last question CostlyOstrich36, sorry to poke you! It seems that even if I set an extremely long time, it still fails when the first plots are reported. The first plots are generated automatically by PyTorch Lightning and track the CPU and GPU usage. Do you think this could be the cause? Or should it also detect the iteration?
Yes, it's similar; somewhat more automatic, since it detects the classes of the functions' arguments and generates the CLI. What do you mean by that, AgitatedDove14? Get all the parameters and use task.connect?
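If the idea is to collect every parameter into one dict and hand it to task.connect, I imagine something like the sketch below. flatten_params and connect_all_params are names I made up for illustration, and flattening to dotted keys is my own choice, not something ClearML requires. The only ClearML call assumed is task.connect with a plain dict.

```python
def flatten_params(params, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in params.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections such as "model" or "trainer".
            flat.update(flatten_params(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def connect_all_params(task, params):
    """Flatten the parsed parameters and register them on the given task.

    `task` is assumed to be a clearml Task (anything with a connect(dict)
    method works for this sketch).
    """
    flat = flatten_params(params)
    task.connect(flat)
    return flat


# Hypothetical usage, assuming clearml is installed and configured:
# from clearml import Task
# task = Task.init(project_name="examples", task_name="cli-params")
# connect_all_params(task, {"model": {"lr": 0.1}, "trainer": {"max_epochs": 5}})
```

Passing the task in as an argument keeps the helper independent of how and where the Task is created.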
On the server through the command line?
Not yet AgitatedDove14. Does the agent use by default the Python version the command is run with? I installed conda and tried using package_manager.type=conda but then I get an error: clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'
Yes! What env variables should I pass?
That's really cool! But I would still prefer to avoid using pip_freeze, is there a way?
So should I set them all with a default value? The working dir is the project one, the one that contains the module package.
Makes sense! Then where would I have to add output_uri to save the weights?