JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes 9 Answers 2K Views
Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates an experiment and sends it to the queue, the command ...
4 years ago
0 Votes 16 Answers 2K Views
Hello, ~3 months ago I created a trains-server in a machine with 30gb of disk space. Today I wasn't able to connect to trains-server, so I checked the server...
4 years ago
0 Votes 8 Answers 2K Views
Hi guys, does a Task update its status to 'Complete' before it finishes uploading its artifacts/metrics in the background?
5 years ago
0 Votes 1 Answers 2K Views
Hi, there is a small bug with auto-refreshing in the DEBUG SAMPLES Tab of the Web UI: If it is ON, then it will always force the first series to be displayed...
3 years ago
0 Votes 10 Answers 2K Views
Hey guys, I am setting up a new machine with two rtx 3070 GPUs where I created two agents (one for each GPU). On both agents, my experiments fail with error:...
4 years ago
0 Votes 3 Answers 2K Views
Hi, I am getting an error while running task.mark_stopped() , any idea why? (clearml 1.0.2, clearml-agent 1.0.0, python 3.6) File "/home/machine/.clearml/ven...
4 years ago
0 Votes 22 Answers 2K Views
Hi there, I used clearml-task to send a script to be executed remotely. When being executed remotely Task.current_task() returns None, how should I get the c...
3 years ago
0 Votes 0 Answers 2K Views
Hello, Pytorch 1.8 was released, bringing AMD wheels with it > pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html Is ClearML s...
4 years ago
0 Votes 4 Answers 2K Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
4 years ago
0 Votes 4 Answers 2K Views
Hi all, I updated from clearml-server 1.14.1 to 1.15.0 and I am getting the following error while trying to start the server after running docker-compose pul...
one year ago
0 Votes 1 Answers 2K Views
Hi, how can I easily start a shell script from within an experiment and have its logs (stdin/err) logged in clearml?
3 years ago
0 Votes 1 Answers 2K Views
Hi, is there a way to update the setup shell script via the SDK?
2 years ago
0 Votes 27 Answers 2K Views
5 years ago
0 Votes 5 Answers 2K Views
Hi there! I have a question regarding s3 access: I created an s3 user with read/write access but not delete, and trains seems to require delete permissions (...
5 years ago
0 Votes 27 Answers 2K Views
Hi, similar to Task.set_offline(True), is there a way to simulate an execution in an agent? (for testing purposes)
3 years ago
0 Votes 12 Answers 2K Views
3 years ago
0 Votes 12 Answers 2K Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
2 years ago
0 Votes 11 Answers 2K Views
Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...
5 years ago
0 Votes 17 Answers 2K Views
Hello, I am trying to retrieve a simple dict artifact uploaded in a previous task with task.upload_artifact("my_dict", dict(foo="bar")) in a second task. I t...
5 years ago
0 Votes 4 Answers 2K Views
3 years ago
0 Votes 19 Answers 2K Views
Hi again, I am trying to make the aws autoscaler work with ec2 instances, but it fails to set up the agent in the machine: the logs of the user-data script sh...
4 years ago
0 Votes 4 Answers 2K Views
Hi there, I think there is a bug with clearml sdk v0.17.5rc2: when running a task locally, the dashboard does not show the task as finished once the task ...
4 years ago
0 Votes 19 Answers 2K Views
2 years ago
0 Votes 6 Answers 2K Views
Hi, how does agent.enable_git_ask_pass work? I am using the clearml-agent in docker mode and my experiment is stuck at downloading a private dependency: Clo...
2 years ago
0 Votes 30 Answers 2K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but I got an error with the password, here is the full CLI log: $ clearml-session -...
4 years ago
0 Votes 18 Answers 2K Views
Hello there, I would like to run cleanup code in case the user aborts one task from the dashboard (the agent is not using the task in docker). What signal...
4 years ago
0 Votes 26 Answers 2K Views
Hi, I attached an IAM role to an ec2 instance to grant access to an s3 bucket. The ec2 instance is running a clearml-agent (v1.1.0). I didn’t specify any key...
aws
4 years ago
0 Votes 1 Answers 2K Views
Hey there 🙂 In the WebUI, on an experiment CONFIGURATION tab, for a specific parameter, would it be possible to not show its value as a single string whe...
3 years ago
0 Votes 2 Answers 1K Views
Hi, in the Metric Snapshot graph, is it possible to scale the Y axis to [y_min * 0.9, y_max * 1.1]? Currently all my values are flat at the same ~y and it is...
3 years ago
0 Votes 13 Answers 3K Views
Hi, I am trying to use the clearml-agent in docker mode to run an experiment, but it seems to fail passing the clearml.conf file to the docker container: Exe...
2 years ago
0 I Guess One Experiment Is Running Backwards In Time

OK, what is the 3.8 release? A server release? How does this number relate to the numbers above?

3 years ago
4 years ago
0 Hi, I Have A Question Regarding The Aws-Autoscaler: Am I Understanding Correctly That:

Why would it solve the issue? max_spin_up_time_min should be the param defining how long to wait after starting an instance, not polling_interval_time_min , right?

4 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But I Got An Error With The Password, Here Is The Full Cli Log:

Alright, I have a follow-up question then: I used the param --user-folder “~/projects/my-project”, but any change I make is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to the folder on the machine. Is it possible to do so?

4 years ago
0 Hi, From Within An Experiment, How Can I Intercept The Signal That The Experiment Was Aborted And Execute A Cleanup Function? I Tried To Intercept Sigint And Sigterm, Unsuccessfully:

Hi SuccessfulKoala55, thanks for the idea! The function isn’t called with atexit.register() though, maybe the way the agent kills the task is not supported by atexit
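
For reference, the generic pattern discussed here can be sketched as below. This is a minimal sketch using only the Python standard library; `make_exit_handler` and `install_cleanup` are hypothetical helper names, not ClearML APIs, and nothing here describes how the agent actually stops a task. atexit hooks only run on a normal interpreter shutdown, so the handler turns a catchable signal into `sys.exit()`. If the process is killed with SIGKILL (an assumption about one possible cause, not confirmed in the thread), neither the handler nor the atexit hook can run.

```python
import atexit
import signal
import sys


def make_exit_handler():
    """Return a signal handler that turns the signal into a normal exit."""

    def handler(signum, frame):
        # sys.exit raises SystemExit, so the interpreter shuts down
        # normally and any functions registered with atexit are called.
        sys.exit(128 + signum)

    return handler


def install_cleanup(cleanup, signals=(signal.SIGTERM, signal.SIGINT)):
    """Register `cleanup` with atexit and route `signals` to a clean exit.

    Must be called from the main thread, because signal.signal()
    can only be used there.
    """
    atexit.register(cleanup)
    handler = make_exit_handler()
    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

Calling `install_cleanup(my_cleanup)` once at the start of the script is enough; `my_cleanup` stands for whatever function should run on abort.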

3 years ago
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

so that any error that could arise from communication with the server could be tested

3 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It's Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

So either I specify in the clearml-agent agent.python_binary: python3.8 as you suggested, or I enforce the task locally to run with python3.8 using task.data.script.binary

2 years ago
0 Hi! I Have A Question Regarding Performances Of The Clearml-Server: Are The Calls From The Agents Made Asynchronously/In A Non Blocking Separate Thread? Is The Connection To The Clearml-Server Expected To Be A Bottleneck If The Clearml-Server Is Far From

Is there one?

No, I rather wanted to understand how it worked behind the scenes 🙂

The latest RC (0.17.5rc6) moved all logs into a separate subprocess to improve speed with pytorch dataloaders

That’s awesome!

4 years ago
0 Hi There, Maybe This Was Already Asked But I Don't Remember: Would It Be Possible To Have The Clearml-Agent Switch Between Docker Mode And Virtualenv Mode At Runtime, Depending On The Experiment

Yea so I assume that training my models using docker will be slightly slower so I'd like to avoid it. For the rest using docker is convenient

2 years ago
0 Hi, If I Am Starting My Training With The Following Command:

AgitatedDove14 yes! I now realise that the ignite events callbacks seem to not be fired (I tried to print a debug message on a custom Events.ITERATION_COMPLETED) and I cannot see it logged

3 years ago
0 Hi, I Have Another Problem

OK, but I didn't specify that anywhere, I just checked my trains.conf file

5 years ago
0 Hi Guys, I Got A Very Unexpected Error Today On One Of My Agents:

Unfortunately this is difficult to reproduce... Nevertheless, it is important to me to be robust against it, because if this error happens in a task in the middle of my pipeline, the whole process fails.

This ties into a wider topic, I think: how to "skip" tasks that have already run (a mechanism similar to what Luigi [ https://luigi.readthedocs.io/en/stable/ ] offers). That would allow restarting the pipeline and skipping tasks up to the point where the task failed
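
To make the idea concrete, here is a minimal sketch of such a skip mechanism, assuming each step writes a completion marker to a local directory. It uses only the Python standard library; `run_step`, the marker file layout, and the `state_dir` argument are hypothetical illustrations, not part of trains/ClearML or Luigi.

```python
import json
from pathlib import Path


def run_step(name, func, state_dir):
    """Run `func` unless a completion marker for `name` already exists.

    Returns (result, skipped). The marker is written only after `func`
    returns, so a step that raised is executed again on the next run.
    Results must be JSON-serializable in this sketch.
    """
    marker = Path(state_dir) / f"{name}.done.json"
    if marker.exists():
        # Step finished in an earlier run: reuse its stored result.
        return json.loads(marker.read_text())["result"], True
    result = func()
    marker.write_text(json.dumps({"result": result}))
    return result, False
```

Restarting the pipeline with the same `state_dir` then skips every step that already completed and resumes at the first step without a marker.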

5 years ago
0 Hi, I Have Another Problem

I just started one and it wrote:
...

5 years ago
0 Hi, I Have Another Problem

I specified a torch @ https://download.pytorch.org/whl/cu100/torch-1.3.1%2Bcu100-cp36-cp36m-linux_x86_64.whl and it didn't detect the link, it tried to install latest version: 1.6.0

5 years ago
0 Hi, I Have Another Problem

I don't know why it didn't detect it in first place

5 years ago
0 Hi, I Have Another Problem

btw shouldn't it be CUDA_VERSION=11.0?

5 years ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

I also discovered https://h2oai.github.io/wave/ last week, would be awesome to be able to deploy it in the same manner

4 years ago
0 Hey Guys, I Am Setting Up A New Machine With Two Rtx 3070 Gpus Where I Created Two Agents (One For Each Gpu). On Both Agents, My Experiments Fail With Error:

Also, from https://lambdalabs.com/blog/install-tensorflow-and-pytorch-on-rtx-30-series/ :

As of 11/6/2020, you can't pip/conda install a TensorFlow or PyTorch version that runs on NVIDIA's RTX 30 series GPUs (Ampere). These GPUs require CUDA 11.1, and the current TensorFlow/PyTorch releases aren't built against CUDA 11.1. Right now, getting these libraries to work with 30XX GPUs requires manual compilation or NVIDIA docker containers.

But which wheel is trains downloading in that case?

4 years ago
0 Hi, I Have Another Problem

ho, that might be it then, thanks!

5 years ago
0 Hi, I Have Another Problem

thanks, I will do that

5 years ago
0 Congrats On The Clearml-Serving 0.9.0 Release! I’ll Try It For Sure!

This is HUGE 🔥 🚀 🎉

3 years ago