JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 8 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes 12 Answers 1K Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
2 years ago
0 Votes 11 Answers 1K Views
Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...
3 years ago
0 Votes 18 Answers 1K Views
Hey there, I would like to increase the ulimit for the number of files opened at the same time in an EC2 instance. According to this https://stackoverflow.com...
3 years ago
0 Votes 4 Answers 1K Views
Hey, I have one question regarding the cleanup_service task in the DevOps project: Does it assume that the agent in services mode is in the trains-server mac...
4 years ago
0 Votes 2 Answers 969 Views
Hi, in the AWS AutoScaler, I am getting the following warning: Warning! exception occurred: APIError: code 400/1004: Worker is not registered: worker=aws:A10...
3 years ago
0 Votes 20 Answers 1K Views
Is it possible to run an agent, listen to the services queue without using docker?
4 years ago
0 Votes 7 Answers 987 Views
Hi, is there a way to get some stats about the use of workers? I would like to know, over the past 3 months: Number of training hours per user Number of trai...
3 years ago
0 Votes 17 Answers 1K Views
Hi there, I have a problem with PyJWT: I am using trains==0.16.4 and trains-agent==0.16.3 in my agents. I installed PyJWT==1.7.1 in the agent (through extra_...
3 years ago
0 Votes 17 Answers 1K Views
3 years ago
0 Votes 11 Answers 1K Views
Are the various task types available in 0.15? I am getting > 2020-06-09 12:58:53,287 - trains.Task - WARNING - Retrying, previous request failed : 'custom' i...
4 years ago
0 Votes 12 Answers 972 Views
Hi, I encounter a weird behavior: I have a task A that schedules a task B. Task B is executed on an agent, but with an old commit 🤔 although the branch is p...
4 years ago
0 Votes 0 Answers 1K Views
Hi all, Would it be possible to make the aws autoscaler log each scale in/out operation in the console to help debugging/understanding the course of events?
3 years ago
0 Votes 4 Answers 1K Views
The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)
3 years ago
0 Votes 3 Answers 1K Views
Hi ClearML team members! Is there any progress made on the clearml-serving repo? I’d love to start using it but I lack a straightforward get started example....
3 years ago
0 Votes 5 Answers 1K Views
Hi, I would like to use pytorch3d==0.5.0 with torch==1.9.1 on cuda version 110, locally it works, but the clearml agent fails setting up the environment with...
3 years ago
0 Votes 22 Answers 1K Views
Hi, I would like to switch from the elastic-search service in the docker-compose of the clearml-server to an externally managed, scalable elastic-search clus...
3 years ago
0 Votes 4 Answers 1K Views
Hi, are the experiments logs stored in s3 or in the trains-server? (When using s3 as artifact storage)
3 years ago
0 Votes 30 Answers 1K Views
Hi, in one of my agents with CUDA Version: 11.1 (from nvidia-smi), clearml agent 0.17.1 detects version 100 (I can see from experiments logs: agent.cuda_vers...
3 years ago
0 Votes 2 Answers 1K Views
Hi, a small bug (not really a bug) in the autoscaler: I have p3.2xlarge instances that take a long time to shut down. With polling_interval_time_min=1 , the a...
3 years ago
0 Votes 7 Answers 1K Views
Hi, I deleted all archived experiments in a project and I just realized all experiments of all projects were deleted (clearml server v1.0.0) 🤔
3 years ago
0 Votes 3 Answers 1K Views
aws
2 years ago
0 Votes 3 Answers 1K Views
Hi, I see that there is a new parameter in aws autoscaler: max_spin_up_time_min - What is the difference with max_idle_time_min ?
aws
3 years ago
0 Votes 4 Answers 927 Views
Is there a way to report a simple series with X and Y coords, X and Y being two lists of same length?
4 years ago
0 Votes 6 Answers 1K Views
Hi, how does agent.enable_git_ask_pass work? I am using the clearml-agent in docker mode and my experiment is stuck at downloading a private dependency: Clo...
2 years ago
0 Votes 1 Answer 1K Views
Hi there, any plan/benefit to support virtualenv= 20 ?
4 years ago
0 Votes 16 Answers 1K Views
Hello, ~3 months ago I created a trains-server in a machine with 30gb of disk space. Today I wasn't able to connect to trains-server, so I checked the server...
4 years ago
0 Votes 2 Answers 1K Views
Another one: What is the difference between Task.connect() and Task.set_parameter?
4 years ago
0 Votes 6 Answers 1K Views
Hi, is it possible to specify the required version of python for a Task that is different from the python running the clearml-agent? Example: my clearml-agen...
2 years ago
0 Votes 10 Answers 1K Views
Hi, I have a local package that I use to train my models. To start training, I have a script that calls task._update_requirements([".", "torch==1.11.0"]) . I...
2 years ago
0 Votes 3 Answers 1K Views
Hi, is clearml-server compatible with latest versions of ES ( > 7.6.2)?
3 years ago
0 Hey, Often I Want To Compare Scalars Of Two Experiments With The Same Name But With Different Tags. In The Scalars Comparison Tab, I Cannot See Which Experiment Is Which Because I Don’t See The Tags. Usually, I Rename The Experiments So That I Can Identif

Usually one or two tags, indeed, task ids are not so convenient, but only because they are not displayed in the page, so I have to go back to another page to check the ID of each experiment. Maybe just showing the ID of each experiment in the SCALAR page would already be great, wdyt?

3 years ago
0 Hi, I Attached An IAM Role To An EC2 Instance To Grant Access To An S3 Bucket. The EC2 Instance Is Running A Clearml-Agent (V1.1.0). I Didn’t Specify Any Key/Secret For Clearml. The Tasks Fail With The Following Error:

There is no need to add creds on the machine, since the EC2 instance has an attached IAM profile that grants access to s3. Boto3 is able to retrieve the files from the s3 bucket

3 years ago
0 Hi, How Can I Change The Project.Default_Output_Destination? I Tried Setting It To None But It Is Not Updated

then print(Task.get_project_object().default_output_destination) is still the old value

2 years ago
0 Hi, In The Metric Snapshot Section Of The Overview Tab Of A Project Page, Would It Be Possible To:

no it doesn't! 3. They select any point that is an improvement over time

2 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

Same, it also returns a ProxyDictPostWrite , which is not supported by OmegaConf.create

2 years ago
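
The answer above says the returned object is a ProxyDictPostWrite, which OmegaConf.create does not accept. A minimal sketch of one possible workaround follows, assuming (this is not confirmed in the thread) that the proxy behaves like a standard mapping; `to_plain` is a hypothetical helper name, not part of ClearML or OmegaConf. It copies the structure into built-in dicts and lists, which OmegaConf.create does accept.

```python
from collections.abc import Mapping
from typing import Any


def to_plain(value: Any) -> Any:
    """Recursively copy mapping-like and list-like values into plain dicts and lists."""
    if isinstance(value, Mapping):
        # Assumption: the proxy object exposes the usual mapping interface (.items()).
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    # Scalars (str, int, float, bool, None) are returned unchanged.
    return value
```

The plain result could then be handed to OmegaConf.create. Edits made on the OmegaConf side would no longer flow back through the proxy, which may or may not matter for a given use case.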
0 Hi, I Recently Updated Clearml-Server To 1.7 And I Am Getting A Lot Of The Following Errors Since Today On Any Experiment (I Didn't Have This Error Before):

To be fully transparent, I did a manual reindexing of the whole ES DB one year ago after it ran out of space. At that point I might have changed the mapping to strict, but I am not sure. Could you please confirm that the mapping is correct?

2 years ago
0 Hey There, I Moved The Clearml S3 Bucket Where I Stored All My Clearml Data From One S3 Bucket To Another And Now I Realized That All The Models/Experiments Logged In The Clearml-Server Still Refer To The Old S3 Bucket. Is There A Way To Update All The Re

Yes, I would like to update all references to the old bucket unfortunately… I think I’ll simply delete the old s3 bucket, wait for its name to be available again, recreate it on the other aws account and move the data there. This way I don’t have to mess with clearml data - I am afraid to do something wrong and lose data

3 years ago
0 Hi, I Cannot Manage To Start Trains-Server 0.16 With The Docker-Compose File, The Trains-Elastic Container Fails With The Following Error:

Yes I did, I found the problem: docker-compose was using trains-server 0.15 because it didn't see the new version of trains-server. Hence I had trains-server 0.15 running with ES7.
-> I deleted all the containers and it successfully pulled trains-server 0.16. Now everything is running properly 🙂

4 years ago
3 years ago
0 Hi, In The Context Of Multi-Gpu Training, Is

if I want to resume a training on multi gpu, I will need to call this function on each process to send the weights to each gpu

3 years ago
0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It's Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

Yes it would be very valuable to be able to tweak that param. Currently it's quite annoying because it's set to 30 mins, so when a worker is killed by the autoscaler, I have to wait 30 mins before the autoscaler spins up a new machine, because the autoscaler thinks there are already enough agents available while in reality the agent is down

one year ago
0 Hi Guys, I Got A Very Unexpected Error Today In One Of My Agents:

mmmmh I just restarted the experiment and it seems to work now. I am not sure why that happened. From this SO it could be related to size of the repo. Might be a good idea to clone with --depth 1 in the agents?
Or more generally, try to catch this error and retry a few times?

4 years ago
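
The "catch this error and retry a few times" idea from the answer above could look roughly like the sketch below. It is a generic illustration, not the agent's actual behavior: `retry` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, delay_sec: float = 0.0) -> T:
    """Call `action` until it succeeds or `attempts` tries are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of tries: surface the last error to the caller
            time.sleep(delay_sec)  # optional pause before the next try
    raise AssertionError("unreachable")


# Hypothetical usage with a shallow clone (repo_url and dest are placeholders):
#   import subprocess
#   retry(lambda: subprocess.run(
#       ["git", "clone", "--depth", "1", repo_url, dest], check=True))
```

Combining the retry with --depth 1 keeps each attempt cheap, at the cost of not having the full history available in the clone.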
2 years ago
0 Hi, I Attached An IAM Role To An EC2 Instance To Grant Access To An S3 Bucket. The EC2 Instance Is Running A Clearml-Agent (V1.1.0). I Didn’t Specify Any Key/Secret For Clearml. The Tasks Fail With The Following Error:

I am confused now because I see in the master branch, the clearml.conf file has the following section:
# Or enable credentials chain to let Boto3 pick the right credentials.
# This includes picking credentials from environment variables,
# credential file and IAM role using metadata service.
# Refer to the latest Boto3 docs
use_credentials_chain: false
So it states that IAM role using metadata service should be supported, right?

3 years ago
0 Hey There, I Moved The Clearml S3 Bucket Where I Stored All My Clearml Data From One S3 Bucket To Another And Now I Realized That All The Models/Experiments Logged In The Clearml-Server Still Refer To The Old S3 Bucket. Is There A Way To Update All The Re

Thanks a lot for the solution SuccessfulKoala55 ! I’ll try that if the solution “delete old bucket, wait for its name to be available, recreate it with the other aws account, transfer the data back” fails

3 years ago
0 Hi There, I Have A Problem With Pyjwt: I Am Using

I can ssh into the agent and:
source /trains-agent-venv/bin/activate
(trains_agent_venv) pip show pyjwt
Version: 1.7.1

3 years ago
0 Hey, I Have A Problem With The Following Task:

Thanks for the explanations,
Yes that was the case
This is also what I would think, although I double checked yesterday:
I create a task on my local machine with trains 0.16.2rc0
This task calls task.execute_remotely()
The task is sent to an agent running with 0.16
The agent installs trains 0.16.2rc0
The agent runs the task, clones it and enqueues the cloned task
The cloned task fails because it has no hyper-parameters/args section (I can see that in the UI)
When I clone the task manually usin...

4 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

should I try to roll back to clearml-server 1.0.2? I am very anxious now…

3 years ago
0 Hey There, I Would Like To Increase The

it actually looks like I don’t need such a high number of files opened at the same time

3 years ago
0 Hi, Together With

with the RC version

4 years ago