GrievingTurkey78
Moderator
34 Questions, 125 Answers
  Active since 10 January 2023
  Last activity 10 months ago

Reputation

0

Badges 1

119 × Eureka!
0 Votes
10 Answers
643 Views
Hi! I am getting the following error on an agent: /usr/local/bin/python3.8: No module named virtualenv clearml_agent: ERROR: Command '['python3.8', '-m', 'vi...
2 years ago
0 Votes
3 Answers
652 Views
Hi! I recently updated my server and my clearml version, now when I set a task to be executed remotely its default state is aborted hence I have to reset and...
3 years ago
0 Votes
4 Answers
772 Views
Hi, is there a way to force the requirements.txt? I have a package I installed directly from github but the version is always wrong. Any other way to do this?
3 years ago
0 Votes
8 Answers
526 Views
Hello 👋 I am using a self-hosted clearml setup using the requirements file of the project. When I run the task it is failing and I get: Collecting torch==2.0...
11 months ago
0 Votes
15 Answers
630 Views
Hi 👋 I am trying to set up a trains server on GCP. I followed all the steps listed here https://allegro.ai/docs/deploying_trains/trains_server_gcp/ . I also...
3 years ago
0 Votes
2 Answers
611 Views
Hi! Regarding the artifact.get_local_copy() method, since there is no way to specify the path where the artifact will be downloaded, I wanted to confirm that...
3 years ago
0 Votes
2 Answers
674 Views
I am trying to upgrade from clearml server 0.16 to the newest version but I am getting some errors when spinning up the new containers: WiredTiger error (-31...
3 years ago
0 Votes
17 Answers
633 Views
3 years ago
0 Votes
4 Answers
673 Views
Hi! I am having some problems with a loss after a good amount of training, what would be the best way to log a value to have a better idea of what is happening?
2 years ago
0 Votes
7 Answers
659 Views
Hi! I am trying to download data from GS using StorageManager.get_local_copy() . It works fine when I point it to a file i.e gs://bucket/dataset/image.png bu...
3 years ago
0 Votes
5 Answers
646 Views
Hi, with the upcoming version of Hydra it seems the binding breaks. Specifically in the run_job function the argument order changed from https://github.com/f...
3 years ago
0 Votes
2 Answers
653 Views
Hi all! Currently I am trying to create a tool that can perform certain operations on dataset ids, this is a skeleton of what I have in mind (based on the ex...
3 years ago
0 Votes
2 Answers
648 Views
Hi ! While restarting the server I got ERROR: for agent-services removal of container 8f1d8539340d6d073eb5b51294f5f5d802048a3614d459b5c4fb1d38a05ce538 is alr...
3 years ago
0 Votes
2 Answers
634 Views
Hi AgitatedDove14 ! Regarding the Hydra integration, which pattern should be used? Call the task inside the decorated function? Will this store the parameter...
3 years ago
0 Votes
13 Answers
607 Views
3 years ago
0 Votes
9 Answers
740 Views
Hi! Does ClearML have a way to turn virtual machines on/off depending on whether there are experiments in the queue?
3 years ago
0 Votes
3 Answers
686 Views
Hi! I am trying to run some experiments on an agent I have configured to use the requirements.txt the problem is it only shows Cython on the list of installe...
3 years ago
0 Votes
30 Answers
648 Views
Hi! Is there something happening with the ModelCheckpoint callback on tensorflow==2.4.0 ? Using 2.2.0 gave me an input model on the artifacts tab in the GUI 😒
3 years ago
0 Votes
3 Answers
677 Views
Hi! I have some ClearML agents on GCP and sometimes the instance seems to reboot making the experiment fail and all the progress is lost. What is the best wa...
2 years ago
0 Votes
5 Answers
652 Views
Hi! I was taking a look at the https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html and wanted to know if anyone has used clearml wit...
2 years ago
0 Votes
6 Answers
684 Views
Hi! I am saving some intermediate .pt files on the experiments and clearml automatically detects them as models, this makes the clearml.model - INFO message ...
2 years ago
0 Votes
2 Answers
650 Views
Hi! What would be the way for manually uploading a model? I have intermediate .pt files which I don't want to upload. Is there a way to turn off clearml capt...
2 years ago
0 Votes
7 Answers
789 Views
Hi! If I have a pipeline on gitlab that uses ClearML for some tests is there some way to setup the credentials so that it doesn’t fail?
3 years ago
0 Votes
2 Answers
613 Views
Hi! I have the previous trains server configured with multiple experiments; I created it using the gcloud images provided. If I want to update the server to ...
3 years ago
0 Votes
7 Answers
721 Views
Hi! I am currently using Hydra+ClearML and wanted to know if there are still some updates coming. At the moment, if I change the defaults hydra uses from the...
3 years ago
0 Votes
11 Answers
705 Views
Hi! Is there a way to run a task without reporting to the server? For example if I want to debug a script by running it locally without it appearing on the s...
3 years ago
0 Votes
6 Answers
622 Views
Hi! I have some agents on GCP. Lately I have been getting some experiments that simply stop running (no signs that the experiment crashed). Here is a plot th...
2 years ago
0 Votes
4 Answers
655 Views
Hi! If I have a folder with multiple ckpt files would the manual way to upload them be the following: output_model = OutputModel(task) output_model.update_we...
2 years ago
0 Votes
2 Answers
608 Views
Hi! I changed from trains to clearml and ran some experiments using keras but it seems the metrics are not being tracked automagically, has anyone run into t...
3 years ago
0 Votes
5 Answers
725 Views
Hi 👋 I am logging some figures on pytorch lightning using the example here. The figures are correctly saved on Tensorboard's images tab but unfortunately ar...
2 years ago
0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

Sure, I’ll share it through a private message!

3 years ago
0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

Yes, exactly! Unfortunately I am not so familiar with the internals of the library but I could take a look and figure that out.

3 years ago
0 Hi! Regarding The

Thanks for the info AgitatedDove14 !

3 years ago
0 Hi, I Was Getting A Really Weird Error Due To Mismatch On The Versions Between The Installed Libraries In My Environment And The Ones Ran In The Node (I Manually Changed The Installed Packages And Everything Worked). How Can I Force Trains To Use Exactly

No, I have all the packages with a version. I just want to know if there is a way to override the requirement versions detected by Pigar when using detect_with_pip_freeze: false . Locally I have cloudpickle==1.4.1 , but when I run the code and send the task to the node, the environment uses cloudpickle==1.6.0 . I have to manually change the version in the UI. Is there a way to force this single package to have a version? Maybe in the requirements.txt or something similar.

3 years ago
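The pinning question above can be illustrated outside ClearML with a minimal sketch that forces one package's version in a list of requirement lines. The helper name `pin_requirement` and the sample requirement list are hypothetical, and this is plain text rewriting, not a ClearML API; within ClearML itself, a later reply on this page reports that `Task.add_requirements(package_name, package_version=None)` worked as a workaround.

```python
import re


def pin_requirement(lines, package, version):
    """Return a copy of `lines` with `package` pinned to exactly `version`.

    Hypothetical helper for illustration only; it does not handle extras
    (e.g. pkg[extra]) or environment markers.
    """
    # Match the package name at the start of a line, followed either by a
    # version specifier or by the end of the line.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(==|>=|<=|~=|!=|<|>|$)", re.IGNORECASE
    )
    pinned = f"{package}=={version}"
    out, found = [], False
    for line in lines:
        if pattern.match(line):
            out.append(pinned)  # replace whatever version was detected
            found = True
        else:
            out.append(line)
    if not found:
        out.append(pinned)  # package was not listed at all, so add it
    return out


# Sample data (assumed): the agent resolved cloudpickle to 1.6.0.
requirements = ["torch==1.7.0", "cloudpickle==1.6.0", "numpy"]
result = pin_requirement(requirements, "cloudpickle", "1.4.1")
print(result)
```

The regular expression only matches the exact package name, so a line such as `cloudpickle-extra==1.0` would be left untouched.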
0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

AgitatedDove14 I filed an issue on the fire repository asking them to point us to the argument parsing method: https://github.com/google/python-fire/issues/291

3 years ago
0 Hi! I Was Taking A Look At The

Yes AgitatedDove14 , I am not sure what they use by default. Here is a simple working example:
from typing import Optional

import torch
from clearml import Task
from pytorch_lightning import LightningDataModule, LightningModule
from pytorch_lightning.utilities.cli import LightningCLI
from torch.utils.data import DataLoader, Dataset, Subset


class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def ...
2 years ago
0 Hi! I Have Some Clearml Agents On Gcp And Sometimes The Instance Seems To Reboot Making The Experiment Fail And All The Progress Is Lost. What Is The Best Way To Resume An Experiment?

Hey CostlyOstrich36 sorry to ping you! Let's say I enqueue multiple experiments on a couple of agents and one of them fails. Is it possible to restart the experiment from the UI using the latest checkpoint? What if the experiment gets assigned to the other agent? I am not sure how the continue_last_task flag would help in this case.

2 years ago
0 Hello

It is failing exactly when the download finishes. Not sure if it matters, but in ~/.clearml/pip-download-cache only an empty cu120 folder appears. Should the torch wheel be saved there?

11 months ago
0 Hello

Sure! For torch I have:

torch==2.0.1
    # via
    #   monai
    #   pytorch-lightning
    #   torchio
    #   torchmetrics
11 months ago
0 Hello

What additional context do you need?

11 months ago
0 Hello

Yes, I configured it that way 👌 Thanks! I'll use the flag!

11 months ago
0 Hi

Thanks!

3 years ago
3 years ago
0 Hi! I Am Trying To Download Data From Gs Using

AgitatedDove14 update here! Something like this should work:
from trains import StorageManager
from trains.storage.helper import StorageHelper

bucket = 'gs://bucket'
helper = StorageHelper.get(bucket)
remote_files = helper.list('folder')
for f in remote_files:
    StorageManager.get_local_copy(bucket + "/" + f)

The * gives [] results because the list method uses startswith, which treats it as a literal string and not as a wildcard.

3 years ago
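The wildcard remark in the answer above can be reproduced with plain Python. This is a minimal sketch using only the standard library; the function names and the sample object keys are hypothetical and are not part of trains or ClearML. It contrasts prefix matching with `str.startswith`, where `*` is an ordinary character, against glob matching with `fnmatch`.

```python
from fnmatch import fnmatch


def list_by_prefix(keys, prefix):
    # Prefix filtering: '*' is compared literally, like any other character.
    return [k for k in keys if k.startswith(prefix)]


def list_by_pattern(keys, pattern):
    # Glob filtering: '*' matches any run of characters.
    return [k for k in keys if fnmatch(k, pattern)]


# Hypothetical object keys inside a bucket.
keys = ["folder/a.png", "folder/b.png", "other/c.png"]

star_as_prefix = list_by_prefix(keys, "folder/*")  # no key contains a literal '*'
plain_prefix = list_by_prefix(keys, "folder/")     # prefix without the wildcard
glob_hits = list_by_pattern(keys, "folder/*")      # wildcard interpreted as a glob

print(star_as_prefix, plain_prefix, glob_hits)
```

Passing the folder name without the `*`, as the answer's loop does, gives the same keys the glob would have matched.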
0 Hi! I Am Trying To Download Data From Gs Using

Yes! How can I help? AgitatedDove14

3 years ago
0 Hi, I Was Getting A Really Weird Error Due To Mismatch On The Versions Between The Installed Libraries In My Environment And The Ones Ran In The Node (I Manually Changed The Installed Packages And Everything Worked). How Can I Force Trains To Use Exactly

AgitatedDove14 I am not sure why the packages get different versions; maybe since the package is not directly imported in my code it is possible to get a different version from what I have locally (?). Should all the library versions match exactly between local and the code that runs in the agent? The Task.add_requirements(package_name, package_version=None) workaround works perfectly! I just add the previous version that doesn’t break the code. Yes, definitely a force flag could help ...

3 years ago
0 Hi

SuccessfulKoala55 Is the update from 1.2.0 only updating the docker-compose file?

2 years ago
0 Hi! I Am Saving Some Intermediate

So I would have to disconnect pytorch? And then upload the model at the end

2 years ago
0 Hi! I Am Saving Some Intermediate

Hi CostlyOstrich36 ! The message is the following:
clearml.model - INFO - Selected model id: 27c1a1700b0b4e25a4344dc4ef9868fa
They are not models, those are intermediate tensors I am caching to make training faster. I don't need to log them.

2 years ago