It is the latest RC, I get the following:
```
Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults 'pip<20.2' --quiet --json
Pass
Trying pip install: /home/ramon/.clearml/venvs-builds/3.8/task_repository/my-rep.git/requirements.txt
Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults numpy==1.20.3 --quiet --json
Pass
Warning, could not locate PyTorch to...
```
Is this caused by running the script with the arguments?
AgitatedDove14 update here! Something like this should work:
```python
from trains import StorageManager
from trains.storage.helper import StorageHelper

bucket = 'gs://bucket'
helper = StorageHelper.get(bucket)
remote_files = helper.list('folder')
for f in remote_files:
    StorageManager.get_local_copy(bucket + "/" + f)
```
The * gives [] results, since the list method uses startswith, which treats it as a plain string and not as a wildcard.
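In case it helps, a client-side workaround could look like this; just a sketch, where the 'folder/' prefix and the '*.csv' pattern are placeholders, and I'm assuming list() returns plain path strings as in the snippet above:
```python
import fnmatch

from trains import StorageManager
from trains.storage.helper import StorageHelper

bucket = 'gs://bucket'
helper = StorageHelper.get(bucket)

# list() only matches by prefix (startswith), so pass the fixed prefix here...
remote_files = helper.list('folder/')

# ...and apply the wildcard locally instead. Adjust the pattern depending on
# whether the returned names include the 'folder/' prefix.
for f in fnmatch.filter(remote_files, 'folder/*.csv'):
    StorageManager.get_local_copy(bucket + "/" + f)
```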
For option 2, do I have to configure it on all agents or on the server?
AgitatedDove14 Downloading a dataset would not be possible using this, right? I want to be able to access the data and just avoid reporting the experiment results.
Hi AgitatedDove14, thanks for your reply. By the dashboard I meant the Web-App (UI). I am trying to access http://<External IP>:8080 but unfortunately nothing shows up.
Hey AgitatedDove14, after playing around it seems that if the callback filepath points to an hdf5 file, it is not uploaded.
I feel it's easier not to report than to clean up afterwards, but please correct me if I am overthinking it. I'll check if I can wrap the code in something that calls Task.delete if debugging.
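Something along these lines is what I had in mind; just a sketch, where DEBUG is a hypothetical flag and I'm assuming the task can be closed and then deleted:
```python
from contextlib import contextmanager

from clearml import Task

DEBUG = True  # hypothetical flag, e.g. read from an env var or CLI argument


@contextmanager
def experiment(project_name, task_name):
    task = Task.init(project_name=project_name, task_name=task_name)
    try:
        yield task
    finally:
        if DEBUG:
            # Remove the experiment (and its reported results) after a debug run,
            # assuming delete() is allowed once the task has been closed.
            task.close()
            task.delete()


with experiment("my-project", "debug-run") as task:
    pass  # training / experiment code goes here
```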
Basically one points to an hdf5 file and the other one has no extension.
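For reference, the two callbacks look roughly like this (I'm assuming a Keras ModelCheckpoint here, and the paths are placeholders):
```python
from tensorflow import keras

# The callback whose filepath points to an hdf5 file -- the one that is not uploaded.
checkpoint_hdf5 = keras.callbacks.ModelCheckpoint(filepath="checkpoints/model.hdf5")

# The callback whose filepath has no extension.
checkpoint_no_ext = keras.callbacks.ModelCheckpoint(filepath="checkpoints/model")
```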
I am about to try everything AgitatedDove14, but I ran into a GitLab error from the agent: I added the username and password to the configuration file but still get a Host key verification failed. Is it common that the cloning message shows the SSH link instead of the HTTPS one when username and password are provided?
So should I set them all with a default value? The working dir is the project one, the one that contains the module package.
Yes, it's similar, but somewhat more automatic since it detects the classes of the function arguments and generates the CLI. What do you mean by that, AgitatedDove14: get all the parameters and use task.connect?
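If that means gathering the parsed arguments into a dict and connecting them, this is roughly what I would try (parameter names here are hypothetical):
```python
from clearml import Task

task = Task.init(project_name="my-project", task_name="cli-run")

# Collect the generated CLI / function arguments into a plain dict...
params = {
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 10,
}
# ...and connect it; when the task runs via the agent, connect() returns
# the values as (possibly) overridden in the UI.
params = task.connect(params)
```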
No, I have all the packages with a version. I just want to know if there is a way to override the requirement versions detected by Pigar when using detect_with_pip_freeze: false. Locally I have cloudpickle==1.4.1, but when running the code and sending the task to the node, the environment uses cloudpickle==1.6.0. I have to manually change the version in the UI. Is there a way to force this single package to have a specific version? Maybe in the requirements.txt or something similar.
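If I understood the docs correctly, calling Task.add_requirements before Task.init might be a way to pin just that one package; a sketch of what I mean (please correct me if that's not the intended use):
```python
from clearml import Task

# Pin a single package for the remote environment instead of the auto-detected
# version. (Assuming add_requirements works this way -- it has to run before Task.init.)
Task.add_requirements("cloudpickle", "1.4.1")

task = Task.init(project_name="my-project", task_name="pinned-run")
```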
Not yet AgitatedDove14. Does the agent use by default the Python version the command is run with? I installed conda and tried using package_manager.type=conda, but then I get an error:
clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'
Hey CostlyOstrich36! I am using clearml==1.1.2 and clearml-agent==1.1.0. Stopped is not the right word, more like frozen: it just froze at an epoch. The console on the agent shows epoch 33, first batch, and the one on the server shows epoch 32, last batch. The experiment had been running for ~6 hours.
I am using pytorch_lightning, I'll try to create a snippet I can share! Thanks 🙂