Hi! Does ClearML have a way to turn virtual machines on/off depending on whether there are experiments in the queue?
3 years ago
Hi! I am using the ModelCheckpoint callback from TensorFlow to save the best model. When the experiment finishes, if I go on the server to Experiment > Artifa...
3 years ago
Hi! I was taking a look at the https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html and wanted to know if anyone has used clearml wit...
3 years ago
Hi! I have some agents on GCP. Lately I have been getting some experiments that simply stop running (no signs that the experiment crashed). Here is a plot th...
3 years ago
Hi! Regarding the artifact.get_local_copy() method, since there is no way to specify the path where the artifact will be downloaded, I wanted to confirm that...
4 years ago
Hi, is there a way to force the requirements.txt? I have a package I installed directly from GitHub, but the version is always wrong. Is there any other way to do this?
3 years ago
Hi! I am trying to run some experiments on an agent I have configured to use the requirements.txt. The problem is it only shows Cython on the list of installe...
3 years ago
Hi! I am getting the following error on an agent: /usr/local/bin/python3.8: No module named virtualenv clearml_agent: ERROR: Command '['python3.8', '-m', 'vi...
2 years ago
Hi! While restarting the server I got ERROR: for agent-services removal of container 8f1d8539340d6d073eb5b51294f5f5d802048a3614d459b5c4fb1d38a05ce538 is alr...
3 years ago
Hi! I recently updated my server and my clearml version. Now, when I set a task to be executed remotely, its default state is aborted, hence I have to reset and...
3 years ago
Hi! I have some ClearML agents on GCP and sometimes the instance seems to reboot, making the experiment fail, and all the progress is lost. What is the best wa...
2 years ago
Hi! If I have a folder with multiple ckpt files would the manual way to upload them be the following: output_model = OutputModel(task) output_model.update_we...
2 years ago
Hi! I have the previous trains server configured with multiple experiments; I created it using the gcloud images provided. If I want to update the server to ...
3 years ago
Hi! I am trying to download data from GS using StorageManager.get_local_copy(). It works fine when I point it to a file, i.e. gs://bucket/dataset/image.png bu...
4 years ago
Hi! If I have a pipeline on GitLab that uses ClearML for some tests, is there some way to set up the credentials so that it doesn’t fail?
3 years ago
Quick question on the clearml-data package: can I add files to a dataset from Google Storage instead of having to download them?
3 years ago
Hi! I am having some problems with a loss after a good amount of training. What would be the best way to log a value to get a better idea of what is happening?
2 years ago
I am trying to upgrade from clearml server 0.16 to the newest version but I am getting some errors when spinning up the new containers: WiredTiger error (-31...
3 years ago
Hi! Is there a way to run a task without reporting to the server? For example if I want to debug a script by running it locally without it appearing on the s...
3 years ago
Hi! Is there something happening with the ModelCheckpoint callback on tensorflow==2.4.0? Using 2.2.0 gave me an input model on the artifacts tab in the GUI 😢
3 years ago
Hi, with the upcoming version of Hydra it seems the binding breaks. Specifically in the run_job function the argument order changed from https://github.com/f...
3 years ago
I am also experiencing a weird behaviour when running a script using the module flag. For example I run: python -m module.script arg1 arg 2 And after the scri...
4 years ago
Hi! What would be the way for manually uploading a model? I have intermediate .pt files which I don't want to upload. Is there a way to turn off clearml capt...
3 years ago
Hi all! Is there a way for trains to recognize the CLI arguments when using https://github.com/google/python-fire instead of argparse?
4 years ago
Hi! Any idea why clearml fails to detect iteration reporting? ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-fr...
3 years ago
Hi, I was getting a really weird error due to a mismatch in versions between the libraries installed in my environment and the ones run on the node (I manu...
4 years ago