JitteryCoyote63
Moderator
213 Questions, 1020 Answers
  Active since 10 January 2023
  Last activity 11 days ago

Reputation

0

Badges 1

978 × Eureka!
0 Votes 5 Answers 585 Views
Hi again, it seems like the aws autoscaler is not spinning up instances with the EBS configuration I set. Here is the configuration: resource_configurati...
3 years ago
0 Votes 1 Answer 529 Views
Hi, I have a question about https://clear.ml/docs/latest/docs/references/sdk/logger#report_scatter3d : Would it be possible to pass a matplotlib figure in 3d...
2 years ago
0 Votes 16 Answers 544 Views
Got some errors while running migration script from ES5 to ES7: 2020-08-11 15:21:50,130 Running on: Linux 2020-08-11 15:21:50,227 Docker allocated memory: 16...
3 years ago
0 Votes 18 Answers 541 Views
Hey there, I would like to increase the ulimit for the number of files opened at the same time on an EC2 instance. According to this https://stackoverflow.com...
3 years ago
0 Votes 17 Answers 575 Views
Hello, I am trying to retrieve a simple dict artifact uploaded in a previous task with task.upload_artifact("my_dict", dict(foo="bar")) in a second task. I t...
3 years ago
0 Votes 3 Answers 572 Views
Hey guys, quick question: is there a tool function to know if a task id is valid? Not verifying that the task itself exists, just that the task id is the cor...
3 years ago
0 Votes 10 Answers 486 Views
Hi, just want to report a small bug in the clearml dashboard: after queuing an experiment, if I change the experiment queue, then go back to the experiment I...
2 years ago
0 Votes 1 Answer 622 Views
Hi, how can I easily start a shell script from within an experiment and have its logs (stdout/stderr) logged in clearml?
2 years ago
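One possible approach to the shell-script question above, as a minimal sketch and not a confirmed ClearML recipe: run the script as a child process and echo its merged stdout/stderr through the parent's stdout. This assumes the experiment tracker captures whatever the parent process prints; `run_and_stream` is a hypothetical helper name, not part of any library.

```python
import subprocess
import sys


def run_and_stream(cmd):
    """Run cmd, echo its merged stdout/stderr line by line, return (exit_code, lines)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe as stdout
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    )
    lines = []
    for line in proc.stdout:
        # Re-emit through the parent's stdout so anything that captures the
        # parent's console output (assumption) also sees the child's output.
        sys.stdout.write(line)
        sys.stdout.flush()
        lines.append(line.rstrip("\n"))
    proc.stdout.close()
    return proc.wait(), lines
```

A call would look like `run_and_stream(["bash", "my_script.sh"])`, where `my_script.sh` is a placeholder path. A non-zero exit code can then be turned into an exception if the experiment should fail with the script.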
0 Votes 13 Answers 958 Views
Hi, I am trying to use the clearml-agent in docker mode to run an experiment, but it seems to fail passing the clearml.conf file to the docker container: Exe...
one year ago
0 Votes 30 Answers 735 Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but got an error with the password. Here is the full CLI log: $ clearml-session -...
3 years ago
0 Votes 12 Answers 669 Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
one year ago
0 Votes 6 Answers 618 Views
one year ago
0 Votes 4 Answers 506 Views
Hey, I would like my experiment to call at some point a CLI program installed as a dependency of the experiment. Here is what I do: myTask = Task.init(...) i...
3 years ago
0 Votes 5 Answers 642 Views
one year ago
0 Votes 5 Answers 625 Views
Hi, I would like to use pytorch3d==0.5.0 with torch==1.9.1 on cuda version 110, locally it works, but the clearml agent fails setting up the environment with...
2 years ago
0 Votes 12 Answers 550 Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
11 months ago
0 Votes 3 Answers 535 Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? Or should I make sure the first process calls it first, then the others?
2 years ago
0 Votes 3 Answers 559 Views
Hi guys, since I am done with implementing the AWS autoscaler, I would like to share some pain points that I encountered in the process with the hope that th...
aws
3 years ago
0 Votes 7 Answers 550 Views
Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...
2 years ago
0 Votes 4 Answers 516 Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
3 years ago
0 Votes 27 Answers 510 Views
Hi there, I found a memory leak in Logger.report_matplotlib_figure . I was constantly running out of memory when training my models so I decided to spend som...
11 months ago
0 Votes 4 Answers 564 Views
Hey again! Is it possible to run multiple agents on the same machine? And with some in services mode?
3 years ago
0 Votes 23 Answers 582 Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
3 years ago
0 Votes 4 Answers 77 Views
Hi all, I updated from clearml-server 1.14.1 to 1.15.0 and I am getting the following error while trying to start the server after running docker-compose pul...
11 days ago
0 Votes 2 Answers 499 Views
Hey there 🙂 Still on my journey to deploy the aws-autoscaler with spot instances, I have another question: I would like to limit the amount of time spent setti...
2 years ago
0 Votes 5 Answers 588 Views
3 years ago
0 Votes 13 Answers 618 Views
2 years ago
0 Votes 27 Answers 574 Views
3 years ago
0 Votes 29 Answers 563 Views
Hi, although https://github.com/allegroai/clearml/issues/181 is resolved, clearml-agent (0.17.2) still logs tqdm iterations as different lines, is there some...
2 years ago
0 Votes 2 Answers 590 Views
Hey there again, I am not sure I understand the difference between StorageManager and StorageHelper, and which one to use?
3 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

Interesting idea! (I assume for reporting only, not configuration)

Yes for reporting only - Also to understand which version is used by the agent to define the torch wheel downloaded

Regarding the cuda check with `nvcc`, I'm not saying this is a perfect solution, I just mentioned that this is how this is currently done.
I'm actually not sure if there is an easy way to get it from nvidia-smi interface, worth checking though ...

Ok, but when nvcc is not ava...

3 years ago
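To illustrate the nvidia-smi idea raised in this thread, here is a minimal sketch that pulls the "CUDA Version" field out of nvidia-smi's header text. It is not how clearml-agent does it; `cuda_version_from_smi` is a hypothetical helper, the sample header line is illustrative, and the compact "111" form only mirrors the "100" style quoted in the thread title as an assumption. Note that the value nvidia-smi prints is the highest CUDA version the driver supports, not the installed toolkit or runtime version.

```python
import re


def cuda_version_from_smi(text):
    """Return (dotted, compact) CUDA version found in nvidia-smi header text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    major, minor = match.groups()
    # Compact form (e.g. "111") is an assumed convention, modelled on the "100" in the thread.
    return f"{major}.{minor}", f"{major}{minor}"


# Illustrative header line in the layout nvidia-smi prints; not taken from the thread.
sample = "| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |"
print(cuda_version_from_smi(sample))
```

In practice the text would come from running `nvidia-smi` and reading its standard output; the function returns `None` when the field is absent, for example on a machine without an NVIDIA driver.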
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

and with this setup I can use GPU without any problem, meaning that the wheel does contain the cuda runtime

3 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

AgitatedDove14 According to the dependency order you shared, the original message of this thread isn't solved: the agent mentioned used the output from nvcc (2) before checking the nvidia driver version (1)

3 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

thanks for clarifying! Maybe this could be clarified in the agent logs of the experiments with something like the following?
agent.cuda_driver_version = ... agent.cuda_runtime_version = ...

3 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

But I can do:
` $ python
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.backends.cudnn.version()
8005 `

3 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

From my experience, I only installed cuda drivers on my machines. I didn't use conda to install torch or cudatoolkit, I just let clearml-agent download the torch wheel file and install it

3 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

AppetizingMouse58 After some thought, we decided to install 0.16 from scratch, with no data migration, because we believe this was an edge case not worth spending effort on. Thank you very much for your help there, much appreciated. You guys rock! 🙂

3 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

sure, will be happy to debug that 🙂

3 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

Thanks! Unfortunately still not working, here is the log file:

3 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

I should also rename /opt/trains/data/elastic_migrated_2020-08-11_15-27-05 folder to /opt/trains/data/elastic before running the migration tool right?

3 years ago
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

CostlyOstrich36, actually this only happens for a single agent. The weird thing is that I have a machine with two GPUs, and I spawn two agents, one per GPU. Both have the same version. For one I can see all the logs, but not for the other

2 years ago
0 Hi, I Am Getting The Following Errors In The Experiments I Am Currently Running:

SuccessfulKoala55 Thanks! If I understood correctly, setting index.number_of_shards = 2 (instead of 1) would create a second shard for the large index, splitting it into two shards? This https://stackoverflow.com/a/32256100 seems to say that it's not possible to change this value after the index creation, is that true?

2 years ago
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

Hi CostlyOstrich36, one more observation: it looks like when I don't open the experiment in the webUI before it is finished, then I get all the logs correctly. It is when I open the experiment in the webUI while it is running that I don't see all the logs.
So it looks like there is an effect of caching (the logs are retrieved only once, when I open the experiment for the first time), and not afterwards (or rarely). Is that possible?

2 years ago
0 Hi Clearml Team Members! Is There Any Progress Made On The Clearml-Serving Repo? I’D Love To Start Using It But I Lack A Straightforward Get Started Example. My Use Case Is The Following:

Hi AgitatedDove14, that's super exciting news! 🤩 🚀
Regarding the two outstanding points:
In my case, I'd maintain a client python package that takes care of the pre/post processing of each request, so that I only send the raw data to the inference service and I post process the raw output of the model returned by the inference service. But I understand why it might be desirable for the users to have these steps happening on the server. What is challenging in this context? Defining how t...

2 years ago