GrievingTurkey78
Moderator
34 Questions, 125 Answers
  Active since 10 January 2023
  Last activity 2 years ago

Reputation

0

Badges 1

119 × Eureka!
0 Votes
2 Answers
2K Views
Hi! I have the previous trains server configured with multiple experiments; I created it using the gcloud images provided. If I want to update the server to ...
4 years ago
0 Votes
3 Answers
2K Views
Hi! I have some ClearML agents on GCP and sometimes the instance seems to reboot making the experiment fail and all the progress is lost. What is the best wa...
3 years ago
0 Votes
30 Answers
2K Views
Hi! Is there something happening with the ModelCheckpoint callback on tensorflow==2.4.0 ? Using 2.2.0 gave me an input model on the artifacts tab in the GUI 😒
4 years ago
0 Votes
2 Answers
2K Views
Hi all! Currently I am trying to create a tool that can perform certain operations on dataset ids, this is a skeleton of what I have in mind (based on the ex...
4 years ago
0 Votes
5 Answers
2K Views
Hi 👋 I am logging some figures on PyTorch Lightning using the example here. The figures are correctly saved on TensorBoard's images tab but unfortunately ar...
3 years ago
0 Votes
7 Answers
2K Views
Hi! I am trying to download data from GS using StorageManager.get_local_copy() . It works fine when I point it to a file, i.e. gs://bucket/dataset/image.png bu...
5 years ago
0 Votes
7 Answers
2K Views
Hi! I am currently using Hydra+ClearML and wanted to know if there are still some updates coming. At the moment, if I change the defaults hydra uses from the...
4 years ago
0 Votes
15 Answers
2K Views
Hi 👋 I am trying to set up a trains server on GCP. I followed all the steps listed here https://allegro.ai/docs/deploying_trains/trains_server_gcp/ . I also...
5 years ago
0 Votes
2 Answers
2K Views
Hi AgitatedDove14 ! Regarding the Hydra integration, which pattern should be used? Call the task inside the decorated function? Will this store the parameter...
4 years ago
0 Votes
2 Answers
2K Views
Hi! I changed from trains to clearml and ran some experiments using keras but it seems the metrics are not being tracked automagically, has anyone run into t...
4 years ago
0 Votes
2 Answers
2K Views
Hi! Regarding the artifact.get_local_copy() method, since there is no way to specify the path where the artifact will be downloaded, I wanted to confirm that...
5 years ago
0 Votes
13 Answers
2K Views
4 years ago
0 Votes
1 Answers
2K Views
Quick question on the clearml-data package: can I add files to a dataset from Google Storage instead of having to download them?
4 years ago
0 Votes
4 Answers
2K Views
Hi! I am having some problems with a loss after a good amount of training, what would be the best way to log a value to have a better idea of what is happening?
3 years ago
0 Votes
2 Answers
2K Views
I am trying to upgrade from clearml server 0.16 to the newest version but I am getting some errors when spinning up the new containers: WiredTiger error (-31...
4 years ago
0 Votes
5 Answers
2K Views
Hi! I was taking a look at the https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html and wanted to know if anyone has used clearml wit...
4 years ago
0 Votes
7 Answers
2K Views
Hi! If I have a pipeline on gitlab that uses ClearML for some tests is there some way to setup the credentials so that it doesn’t fail?
4 years ago
0 Votes
3 Answers
2K Views
Hi! I am trying to run some experiments on an agent I have configured to use the requirements.txt. The problem is it only shows Cython on the list of installe...
4 years ago
0 Votes
8 Answers
2K Views
Hello 👋 I am using a self-hosted clearml setup using the requirements file of the project. When I run the task it is failing and I get: Collecting torch==2.0...
2 years ago
0 Votes
5 Answers
2K Views
Hi, with the upcoming version of Hydra it seems the binding breaks. Specifically in the run_job function the argument order changed from https://github.com/f...
4 years ago
0 Votes
6 Answers
2K Views
Hi! I have some agents on GCP. Lately I have been getting some experiments that simply stop running (no signs that the experiment crashed). Here is a plot th...
4 years ago
0 Votes
3 Answers
2K Views
Hi! I recently updated my server and my clearml version, now when I set a task to be executed remotely its default state is aborted hence I have to reset and...
4 years ago
0 Votes
11 Answers
2K Views
Hi! Is there a way to run a task without reporting to the server? For example if I want to debug a script by running it locally without it appearing on the s...
4 years ago
0 Votes
10 Answers
2K Views
Hi! I am getting the following error on an agent: /usr/local/bin/python3.8: No module named virtualenv clearml_agent: ERROR: Command '['python3.8', '-m', 'vi...
3 years ago
0 Votes
10 Answers
2K Views
I am also experiencing a weird behaviour when running a script using the module flag. For example I run:
python -m module.script arg1 arg2
And after the scri...
5 years ago
0 Votes
2 Answers
2K Views
Hi ! While restarting the server I got ERROR: for agent-services removal of container 8f1d8539340d6d073eb5b51294f5f5d802048a3614d459b5c4fb1d38a05ce538 is alr...
4 years ago
0 Votes
4 Answers
2K Views
Hi! If I have a folder with multiple ckpt files would the manual way to upload them be the following: output_model = OutputModel(task) output_model.update_we...
3 years ago
0 Votes
21 Answers
2K Views
Hi! Any idea why clearml fails to detect iteration reporting? ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-fr...
4 years ago
0 Votes
2 Answers
2K Views
Hi! What would be the way for manually uploading a model? I have intermediate .pt files which I don't want to upload. Is there a way to turn off clearml capt...
4 years ago
0 Votes
17 Answers
2K Views
4 years ago
0 I Am Also Experiencing A Weird Behaviour When Running A Script Using The Module Flag. For Example I Run:

Yes, everything is that way (work dir and args are ok) except the script path . It shows -m module arg1 arg2 .

5 years ago
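
For context on the script path reported above: when a script is launched with `python -m`, Python records the full dotted module name on `__main__.__spec__`. The sketch below shows one way to recover that name. `module_invocation` is a hypothetical helper written for illustration; it is not part of ClearML and does not describe how ClearML detects the script path.

```python
import sys
from types import SimpleNamespace


def module_invocation(main_module=None):
    """Return '-m <dotted.name>' if the program was started with `python -m`,
    or None if it was started as a plain script."""
    main = sys.modules["__main__"] if main_module is None else main_module
    spec = getattr(main, "__spec__", None)
    if spec is None:
        # Plain `python script.py` runs leave __spec__ set to None.
        return None
    name = spec.name
    # `python -m pkg` runs pkg/__main__.py, so strip the trailing part.
    suffix = ".__main__"
    if name.endswith(suffix):
        name = name[: -len(suffix)]
    return "-m " + name


# Stand-in object imitating what `python -m module.script` would produce.
fake_main = SimpleNamespace(__spec__=SimpleNamespace(name="module.script"))
print(module_invocation(fake_main))
```

Called with no argument inside a real program, the helper inspects the running `__main__` module instead of the stand-in.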
0 Hi! Is There A Way To Run A Task Without Reporting To The Server? For Example If I Want To Debug A Script By Running It Locally Without It Appearing On The Server

Yes! I think that's what I will do 👌 Let me know if there is a way to contribute a mode to keep logging off. We just don’t want to pollute the server when debugging.

4 years ago
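
A minimal sketch of one way to keep debug runs off the server, assuming a ClearML version that ships offline mode (it may not have been available when this thread was written). The environment variable `CLEARML_OFFLINE_MODE` and the `Task.set_offline` call come from the ClearML SDK; the helper name `enable_offline_debug` and the project and task names in the comments are made up for illustration.

```python
import os


def enable_offline_debug(environ=None):
    """Flag the process for ClearML offline mode.

    To be safe, call this before clearml is imported so the flag is
    already in place when the SDK reads its configuration.
    """
    target = os.environ if environ is None else environ
    target["CLEARML_OFFLINE_MODE"] = "1"
    return target


# Usage (needs clearml installed, so it is left commented out here):
# enable_offline_debug()
# from clearml import Task
# task = Task.init(project_name="debug", task_name="local run")
#
# Alternative without the environment variable:
# from clearml import Task
# Task.set_offline(offline_mode=True)
# task = Task.init(project_name="debug", task_name="local run")
```

In offline mode the run is recorded to a local folder instead of being sent to the server, so nothing shows up in the web UI unless the session is imported later.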
0 Hi! Any Idea Why Clearml Fails To Detect Iteration Reporting?

I'll give that a try! Thanks CostlyOstrich36

4 years ago
0 Hi! Any Idea Why Clearml Fails To Detect Iteration Reporting?

CostlyOstrich36 That seemed to do the job! No message after the first epoch, with the caveat of losing resource monitoring. Any idea of what could be causing this? If the resource monitor is the first plot then the iteration detection will fail? Are there any hacks to keep the resource monitoring? Thanks a lot! 🙌

4 years ago
0 Hi! I Have Some Agents On Gcp. Lately I Have Been Getting Some Experiments That Simply Stop Running (No Signs That The Experiment Crashed). Here Is A Plot That Shows The Resource Monitoring. Any Ideas On What Could Be Causing This?

Hey CostlyOstrich36 ! I am using clearml==1.1.2 and clearml-agent==1.1.0 . Stopped is not the right word, more like frozen, it just froze at an epoch. The console on the agent shows epoch 33 first batch and the one at the server epoch 32 last batch. The experiment was running for ~6 hours.

4 years ago
0 Hi! Any Idea Why Clearml Fails To Detect Iteration Reporting?

Hey CostlyOstrich36 I am doing a lot of things before the first plot is reported! Is the seconds_from_start parameter unbounded? What should I do if it takes a lot of time to report the first plot?

4 years ago
0 Hi! I Am Currently Using Hydra+Clearml And Wanted To Know If There Are Still Some Updates Coming. At The Moment, If I Change The Defaults Hydra Uses From The

Side note: When running src.train as a module the server gets the command as src and has to be modified to be src.train

4 years ago
0 Hi! I Am Getting The Following Error On An Agent:

With pip I get the first error I showed. I tried conda and it starts running, but at some point it crashes with:
clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'

3 years ago
0 Hi! I Am Having Some Problems With A Loss After A Good Amount Of Training, What Would Be The Best Way To Log A Value To Have A Better Idea Of What Is Happening?

AgitatedDove14 Well I have a loss function which is something like:
    class MyLoss(...):
        def forward(...):
            weights = self.compute_weights(...)
            return (weights * (target-preds)).mean()

There seems to be a problem on a certain batch when computing the weights. What would be the best way to log the batch that causes the problem, along with the weights being computed?

3 years ago
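
One possible way to approach the question above, sketched in plain Python so it runs without PyTorch. It assumes the "problem" is NaN or infinite weights, which the post does not confirm. `find_non_finite` and `weighted_loss` are hypothetical names, and the batch index is passed in explicitly so that the offending batch can be identified in the log.

```python
import logging
import math

logger = logging.getLogger("loss_debug")


def find_non_finite(values):
    """Return the positions of entries that are NaN or infinite."""
    return [i for i, v in enumerate(values) if not math.isfinite(v)]


def weighted_loss(weights, target, preds, batch_idx):
    """Mean of weights * (target - preds), logging suspicious weights first."""
    bad = find_non_finite(weights)
    if bad:
        # Record which batch misbehaved and the offending weight values.
        logger.warning(
            "batch %d: non-finite weights at positions %s: %s",
            batch_idx, bad, [weights[i] for i in bad],
        )
    terms = [w * (t - p) for w, t, p in zip(weights, target, preds)]
    return sum(terms) / len(terms)
```

With tensors the same check could be written with `torch.isfinite(weights).all()`, and the logged batch index could also be reported as a scalar so it appears alongside the loss curve.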
0 Hi! I Am Currently Using Hydra+Clearml And Wanted To Know If There Are Still Some Updates Coming. At The Moment, If I Change The Defaults Hydra Uses From The

Sure! I enqueue the experiment from my local machine:
python -m src.train model=my_model loss=my_loss dataset=my_dataset

Then I go to the server and run the experiment and create a copy to run with a new model. On the copy, I go to the script path and modify it to be:
-m src.train model=my_other_model loss=my_loss dataset=my_dataset

The new experiment, even though the script path has the my_other_model default, starts training using my_model .

I can also see ...

4 years ago
0 I Am Also Experiencing A Weird Behaviour When Running A Script Using The Module Flag. For Example I Run:

So should I set them all with a default value? The working dir is the project one, the one that contains the module package

5 years ago
0 Hi

Let me work on it 👌

3 years ago
0 Hi

I am using the code inside the on_train_epoch_end inside a metric. So the important part is:
` fig = plt.figure()
# my plot
logger.experiment.add_figure("fig", fig)
plt.close() `

3 years ago
0 Hi

SuccessfulKoala55 Is the update from 1.2.0 only updating the docker-compose file?

3 years ago
0 Hi

The plot is generated and added to tensorboard but seems clearml is not catching it.

3 years ago
0 Hi! I Am Getting The Following Error On An Agent:

Not yet AgitatedDove14 , does the agent use by default the python version the command is run with? I installed conda and tried using package_manager.type=conda but then get an error:
clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'

3 years ago
0 Hi! Any Idea Why Clearml Fails To Detect Iteration Reporting?

I set it to 200000 ! But the problem stems from when the first plot is the clearml cpu and gpu monitoring, were you able to reproduce it? Even if I set the number fairly large when the monitoring plot was reported the message appeared.

4 years ago
5 years ago
0 Hi! Is There Something Happening With The

I changed it to point to a folder and it shows up

4 years ago
0 Hi! Is There Something Happening With The

AgitatedDove14 it's on the checkpoint

4 years ago
0 Hi! Is There Something Happening With The

AgitatedDove14 Thanks! I'm trying to figure out how to create a minimum working example! I am also working with Hydra so that may be a thing. The extension is what's causing it to fail (haven’t figured out why).

4 years ago
0 Hi! I Am Saving Some Intermediate

Thanks! This should work perfectly 👌

4 years ago