Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes
10 Answers
1K Views
0 Votes 10 Answers 1K Views
Hi, I have a local package that I use to train my models. To start training, I have a script that calls task._update_requirements([".", "torch==1.11.0"]) . I...
2 years ago
0 Votes
11 Answers
1K Views
0 Votes 11 Answers 1K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
3 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi, in the "Choose compared experiments" view of the WebUI, would it be possible to add a toggle to include archived experiments in the results of the search...
2 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi, is there a way to update the setup shell script via the SDK?
2 years ago
0 Votes
1 Answers
948 Views
0 Votes 1 Answers 948 Views
Hi there, would it be possible to add some Neural Architecture Search example, as for the HyperParameter Optimizer examples?
3 years ago
0 Votes
2 Answers
1K Views
0 Votes 2 Answers 1K Views
4 years ago
0 Votes
3 Answers
1K Views
0 Votes 3 Answers 1K Views
Hi, is clearml-server compatible with latest versions of ES ( > 7.6.2)?
3 years ago
0 Votes
14 Answers
1K Views
0 Votes 14 Answers 1K Views
3 years ago
0 Votes
19 Answers
1K Views
0 Votes 19 Answers 1K Views
Hi again, I am trying to make the aws autoscaler work with ec2 instances, but it fails to setup the agent in the machine: the logs of the user-data script sh...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but an error with the password, here is the full CLI log: $ clearml-session -...
3 years ago
0 Votes
4 Answers
710 Views
0 Votes 4 Answers 710 Views
Hi all, I updated from clearml-server 1.14.1 to 1.15.0 and I am getting the following error while trying to start the server after running docker-compose pul...
8 months ago
0 Votes
8 Answers
1K Views
0 Votes 8 Answers 1K Views
Hi, I would like to create backups of my trains-server periodically. I was thinking about creating a service task under the devops project. The backup task w...
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hi again, it seems like the aws autoscaler is not spinning instances with the EBS configuration I configured. Here is the configuration: resource_configurati...
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...
2 years ago
0 Votes
18 Answers
1K Views
0 Votes 18 Answers 1K Views
Hi, kudos for the 0.15 guys! I am having an issue related to git auth: I have an issue with trains-agent (0.15): it does not use git creds while trying to cl...
4 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hi there, I would like to report a bug with the resizing of the columns in the projects view: it doesn’t work as expected. Please look at the behavior of the...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi, is it possible to pass environment variables to agents created by the AWS AutoScaler service?
4 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hey there, since which version, clearml stops connecting to the demo server by default?
3 years ago
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
Hi, what happens exactly when I execute the following command: trains-agent daemon --gpus 0 --queue default &In my code, how to know which GPU to choose insi...
4 years ago
0 Votes
7 Answers
1K Views
0 Votes 7 Answers 1K Views
Hi, one more question: When creating a task with Task.init(), we can specify the task_type . Now when using Task.clone(), we cannot specify the task_type (is...
4 years ago
0 Votes
5 Answers
999 Views
0 Votes 5 Answers 999 Views
Hi, is it possible to disable some of the system metrics monitored? and also downsample the rate of logging?
3 years ago
0 Votes
3 Answers
970 Views
0 Votes 3 Answers 970 Views
Hey there, I see that in the autoscaler configuration, the queues param accept dictionaries with values of type list of lists (see eg below.) What does it me...
3 years ago
0 Votes
2 Answers
956 Views
0 Votes 2 Answers 956 Views
Hey there 🙂 Still my journey to deploy the aws-autoscaler with spot instances, I have another question: I would like to limit the amount of time spent setti...
3 years ago
0 Votes
3 Answers
1K Views
0 Votes 3 Answers 1K Views
Hi ClearML team members! Is there any progress made on the clearml-serving repo? I’d love to start using it but I lack a straightforward get started example....
3 years ago
0 Votes
12 Answers
1K Views
0 Votes 12 Answers 1K Views
3 years ago
0 Votes
26 Answers
1K Views
0 Votes 26 Answers 1K Views
Hi, I attached an IAM role to an ec2 instance to grant access to an s3 bucket. The ec2 instance is running a clearml-agent (v1.1.0). I didn’t specify any key...
aws
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hello, is it possible for the clearml-agent in docker mode to not pull a specific docker image, but to build one from the experiment repository using the Doc...
2 years ago
0 Votes
5 Answers
948 Views
0 Votes 5 Answers 948 Views
Hey again 😁 I am migrating my trains-server to AWS and I would like now to have secure accounts (with password). But I don't want to loose the current users...
4 years ago
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
4 years ago
0 Votes
2 Answers
1K Views
0 Votes 2 Answers 1K Views
How can I filter out archived tasks with Task.get_tasks?
3 years ago
Show more results questions
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Thanks for the hack! The use case is the following: I have a controler that creates training/validation/testing tasks by cloning (so that the parent task id is properly set to the controler). Otherwise I could simply create these tasks with Task.init, but then I would need to set manually the parent task for each one of these tasks, probably with a similar hack, right?

4 years ago
0 Hi Everyone, Now I Am Evaluating Clearml. I Have A Question About How To Handle Datasets. Does Clearml Provide Any Function To Manage Datasets? Or Do We Need To Manage Them By Ourselves? In Our Usecase, We Update Datasets Little By Little Over Days Or W

This is no coincidence - Any data versioning tool you will find are somehow close to how git works (dvc, etc.) since they aim to solve a similar problem. In the end, datasets are just files.
Where clearml-data stands out imo is the straightfoward CLI combined with the Pythonic API that allows you to register/retrieve datasets very easily

3 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

AgitatedDove14 I made some progress:
In clearml.conf of the agent, I set: sdk.development.report_use_subprocess = false (because I had the feeling that Task._report_subprocess_enabled = False wasn’t taken into account) I’ve set task.set_initial_iteration(0) Now I was able to get the followin graph after resuming -

3 years ago
0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

Yes it would be very valuable to be able to tweak that param, currently it's quite annoying because it's set to 30 mins, so when a worker is killed by the autoscaler, I have to wait 30 mins before the autoscaler spins up a new machine because the autoscaler thinks there is already enough agents available, while in reality the agent is down

one year ago
0 I Guess One Experiment Is Running Backwards In Time

haa got it, I am on a self hosted server, that’s why I don’t see it

2 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Now I am trying to restart the cluster with docker-compose and specifying the last volume, how can I do that?

3 years ago
0 Hi, Together With

Which commit corresponds to RC version? So far we tested with latest commit on master (9a7850b23d2b0e1f2098ab051de58ce806143fff)

4 years ago
0 Hi, Together With

Alright, experiment finished properly (all models uploaded). I will restart it to check again, but seems like the bug was introduced after that

4 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

Hi SmugDolphin23 thanks for the input! Will try now but that seems hacky: to have it working I have to specify python3.8 two times:
one in the agent config file (agent.default_python is already python3.8, but seems to be ignored) + make sure it is available (using python:3.8 docker image)Is there a way to prevent this redundancy? Ie. If I want to change the python version, I can control it from a single place?

2 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion

2 years ago
3 years ago
0 Hi, Although

SuccessfulKoala55 Am I doing/saying something wrong regarding the problem of flushing every 5 secs (See my previous message)

3 years ago
0 Hi, Although

TimelyPenguin76 clearml 0.17.5 and clearml-agent 0.17.2

3 years ago
0 Hi There,

Is it exactly agg or something different?

one year ago
0 Hello, ~3 Months Ago I Created A Trains-Server In A Machine With 30Gb Of Disk Space. Today I Wasn'T Able To Connect To Trains-Server, So I Checked The Server And Found That The Disk Full. I Ran:

Thanks SuccessfulKoala55 !
Maybe you could add to your docker-compose file an option for limiting the size of the logs, since there is no limit by default, their size will grow for ever, which doesn't sound ideal https://docs.docker.com/compose/compose-file/#logging

4 years ago
3 years ago
0 Hey There, I Would Like To Increase The

that works from within the ssh session

3 years ago
0 Announcing Clearml 0.17.5 Features

We would be super happy to have the possibility of documenting experiments (new tab in experiments UI) with a markdown editor!

3 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

Does the agent install the nvidia-container toolkit, so that GPUs of the instance can be accessed from inside the docker running jupyterlab?

2 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

I understand, but then why the docker mode is an option of the CLI if we always have to use it so that it works?

2 years ago
0 Hi, In The Aws Autoscaler, Is It Possible To Specify Multiple Regions (Availability_Zone)? I Currently Use Eu-West-1A, And Would Like To Start Using Eu-West-1B And Eu-West-1C. I Tried Specifying A List In Availability_Zone Parameter, But Without Success:

yea I just realized that you would also need to specify different subnets, etc… not sure how easy it is 😞 But it would be very valuable, on-demand GPU instances are so hard to spin up nowadays in aws 😄

3 years ago
0 Hi, If I Am Starting My Training With The Following Command:

The main issue is the task_logger.report_scalar() not reporting the scalars

3 years ago
0 Hello, ~3 Months Ago I Created A Trains-Server In A Machine With 30Gb Of Disk Space. Today I Wasn'T Able To Connect To Trains-Server, So I Checked The Server And Found That The Disk Full. I Ran:

Sure yes! As you can see I just added the block
logging: driver: "json-file" options: max-size: "200k" max-file: "10"To all services. Also in this docker-compose I removed the external binding of the ports for mongo/redis/es

4 years ago
Show more results compactanswers