JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes 6 Answers 1K Views
2 years ago
0 Votes 11 Answers 1K Views
Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...
3 years ago
0 Votes 1 Answer 1K Views
Hi, I have a question about https://clear.ml/docs/latest/docs/references/sdk/logger#report_scatter3d : Would it be possible to pass a matplotlib figure in 3d...
2 years ago
0 Votes 12 Answers 1K Views
3 years ago
0 Votes 29 Answers 1K Views
Hi, although https://github.com/allegroai/clearml/issues/181 is resolved, clearml-agent (0.17.2) still logs tqdm iterations as different lines, is there some...
3 years ago
0 Votes 6 Answers 1K Views
2 years ago
0 Votes 10 Answers 1K Views
Hi, how can I change the project.default_output_destination? I tried setting it to None but it is not updated
2 years ago
0 Votes 4 Answers 960 Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
3 years ago
0 Votes 11 Answers 976 Views
Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...
4 years ago
0 Votes 5 Answers 955 Views
Hello, I have a small question regarding UI: Currently, in the artifacts section of a task, the FILE PATH displayed for artifacts stored in s3 are displayed ...
4 years ago
0 Votes 30 Answers 997 Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
3 years ago
0 Votes 2 Answers 992 Views
Are the env variables passed to trains-agent available in experiments run by this trains-agent?
4 years ago
0 Votes 6 Answers 1K Views
Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...
3 years ago
0 Votes 17 Answers 968 Views
Hi there, I have a problem with PyJWT: I am using trains==0.16.4 and trains-agent==0.16.3 in my agents. I installed PyJWT==1.7.1 in the agent (through extra_...
3 years ago
0 Votes 20 Answers 1K Views
Hello, I have an error while installing the git dependencies of a local package: So far I used task._update_requirements("[.]") with my local package referencing ...
3 years ago
0 Votes 3 Answers 976 Views
Hi guys, since I am done with implementing the AWS autoscaler, I would like to share some pain points that I encountered in the process with the hope that th...
aws
3 years ago
0 Votes 16 Answers 1K Views
Hello, ~3 months ago I created a trains-server in a machine with 30gb of disk space. Today I wasn't able to connect to trains-server, so I checked the server...
4 years ago
0 Votes 27 Answers 1K Views
Hi there, I found a memory leak in Logger.report_matplotlib_figure . I was constantly running out of memory when training my models so I decided to spend som...
one year ago
0 Votes 30 Answers 1K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but got an error with the password, here is the full CLI log: $ clearml-session -...
3 years ago
0 Votes 5 Answers 969 Views
Hi, is it possible to disable some of the system metrics monitored? and also downsample the rate of logging?
3 years ago
0 Votes 26 Answers 1K Views
Hi, I would like to follow up on this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...
2 years ago
0 Votes 12 Answers 1K Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
one year ago
0 Votes 6 Answers 1K Views
3 years ago
0 Votes 16 Answers 1K Views
Got some errors while running migration script from ES5 to ES7: 2020-08-11 15:21:50,130 Running on: Linux 2020-08-11 15:21:50,227 Docker allocated memory: 16...
4 years ago
0 Votes 7 Answers 999 Views
Hi, one more question: When creating a task with Task.init(), we can specify the task_type . Now when using Task.clone(), we cannot specify the task_type (is...
4 years ago
0 Votes 0 Answers 1K Views
Hello, Pytorch 1.8 was released, bringing AMD wheels with it > pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html Is ClearML s...
3 years ago
0 Votes 1 Answer 994 Views
Hi, I encounter the following bug with clearml 0.17.5rc2: When I start a task locally and that task raises cuda out of memory, the command returns but the pr...
3 years ago
0 Votes 2 Answers 1K Views
Congrats on the clearml-serving 0.9.0 release! I’ll try it for sure!
2 years ago
0 Votes 9 Answers 1K Views
Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates an experiment and sends it to the queue, the command ...
3 years ago
0 Votes 5 Answers 1K Views
aws
3 years ago
0 Hey Again

Very cool! "Run two train-agent daemons, one per GPU on the same machine, with default Nvidia/CUDA Docker": this is close to my use case, I would just like to run these two daemons without docker. Would that be possible? I should just remove the --docker nvidia/cuda param, right?

4 years ago
0 Hey Again

trains-agent daemon --gpus 0 --queue default &
trains-agent daemon --gpus 1 --queue default &

4 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Oh, the object is actually available in previous_task.artifacts

4 years ago
0 Hi, Did Anyone Experiment With Running On The Aws Autoscaler On Spots And Knows Whether There Is Configuration For Retry Policy When Spot Get Evacuated Mid-Job?

Hi there, yes I was able to make it work with some glue code:
1. Save your model, optimizer and scheduler every epoch.
2. Have a separate thread that periodically pulls the instance metadata and checks whether the instance is marked for stop; in that case, add a custom tag, e.g. TO_RESUME.
3. Have a service that periodically pulls failed experiments with the tag TO_RESUME from the queue, force-marks them as stopped instead of failed, and reschedules them with the last checkpoint as an extra parameter.
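
The second step above could look roughly like the sketch below. This is not the original glue code, only an illustration under assumptions: it polls the EC2 spot `instance-action` metadata endpoint over IMDSv1, and the names `fetch_spot_action` and `watch_for_interruption` are hypothetical helpers invented for this example.

```python
import json
import threading
import urllib.error
import urllib.request

# EC2 instance metadata endpoint that reports a pending spot interruption.
# It answers 404 until the instance is marked for stop/terminate.
# Assumption: IMDSv1 is reachable; IMDSv2 would need a session token header.
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def fetch_spot_action(url=SPOT_ACTION_URL, timeout=2.0):
    """Return the pending action as a dict, or None if nothing is scheduled."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # 404s, network errors and unparsable bodies all count as "not marked".
        return None


def watch_for_interruption(on_marked, fetch=fetch_spot_action,
                           interval=5.0, stop_event=None):
    """Poll `fetch` in a daemon thread; call `on_marked(action)` once, then exit."""
    stop_event = stop_event or threading.Event()

    def _loop():
        while not stop_event.is_set():
            action = fetch()
            if action is not None:
                on_marked(action)
                return
            # Sleep between polls, but wake up early if asked to stop.
            stop_event.wait(interval)

    thread = threading.Thread(target=_loop, daemon=True)
    thread.start()
    return thread, stop_event
```

In a training script, `on_marked` would be the place to tag the running task, for example with something like `lambda action: task.add_tags(["TO_RESUME"])`, assuming `task` is the current ClearML task object.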

3 years ago
0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

So there are two possible cases for trains-agent-1:
- It picks up a new experiment -> the "workers" tab randomly shows one of the two experiments
- There is no new experiment in the default queue to start -> it randomly shows either no experiment or the one that is running

4 years ago
0 Hey There, I Moved The Clearml S3 Bucket Where I Stored All My Clearml Data From One S3 Bucket To Another And Now I Realized That All The Models/Experiments Logged In The Clearml-Server Still Refer To The Old S3 Bucket. Is There A Way To Update All The Re

Thanks a lot for the solution SuccessfulKoala55 ! I'll try that if the solution "delete old bucket, wait for its name to be available, recreate it with the other aws account, transfer the data back" fails

3 years ago
0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

When an experiment on trains-agent-1 is finished, I see randomly no experiment/long experiment and when two experiments are running, I see randomly one of the two experiments

4 years ago
0 Hi There,

Hi @AgitatedDove14 @EnthusiasticShrimp49 , the issue above seemed to be the memory leak and it looks like there is no problem from clearml side.
I trained successfully without mem leak with num_workers=0 and I am now testing with num_workers=8.
Sorry for the false positive :man-bowing:

one year ago
0 Hi There,

I think that somehow somewhere a reference to the figure is still living, so plt.close("all") and gc cannot free the figure and it ends up accumulating. I don't know where yet
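
One way to test that kind of hypothesis is to hold only a weak reference to the object and see whether it survives a collection. The sketch below is an illustration only: it uses a stand-in class (`FakeFigure`, hypothetical) rather than a real matplotlib figure, and a plain list (`registry`) to play the role of the unknown place still holding the reference.

```python
import gc
import weakref


class FakeFigure:
    """Stand-in for a matplotlib figure (hypothetical, for illustration only)."""


def is_collected(ref):
    """Run a full collection and report whether the weakly referenced object is gone."""
    gc.collect()
    return ref() is None


registry = []  # plays the role of the unknown place that still holds the figure

fig = FakeFigure()
ref = weakref.ref(fig)
registry.append(fig)

del fig
leaked = not is_collected(ref)  # still alive: the registry holds a reference

# gc.get_referrers lists the containers that keep the object alive,
# which is how the lingering holder can be tracked down.
holders = [r for r in gc.get_referrers(ref()) if r is registry]

registry.clear()
freed = is_collected(ref)  # with the last reference dropped, the object is freed
```

With a real figure, the same `weakref.ref` plus `gc.get_referrers` approach could be applied after `plt.close("all")` to see which object still refers to it.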

one year ago
0 Hi There,

Is it exactly agg or something different?

one year ago
0 Hi There,

Early debugging signals show that auto_connect_frameworks={'matplotlib': False, 'joblib': False} seems to have a positive impact - it is running now, I will confirm in a bit

one year ago
0 Hi, I Would Like To Follow-Up In This

I am happy if I can be of any help to fix that 😄

2 years ago