JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes 7 Answers 1K Views
Hi, I deleted all archived experiments in a project and I just realized all experiments of all projects were deleted (clearml server v1.0.0) 🤔
3 years ago
0 Votes 9 Answers 1K Views
Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates an experiment and sends it to the queue, the command ...
3 years ago
0 Votes 0 Answers 959 Views
(sorry I pinned the message accidentally 😅 )
4 years ago
0 Votes 1 Answers 1K Views
Hey, just wanted to mention: in the docs, Task.get_parameter does not say: Different sections with key prefix "section/", as Task.get_parameters does. Also there ...
4 years ago
0 Votes 8 Answers 1K Views
Hi guys, is a Task updating its status to 'Complete' before it finishes uploading its artifacts/metrics in the background?
4 years ago
0 Votes 6 Answers 1K Views
Hi, I cannot manage to start trains-server 0.16 with the docker-compose file, the trains-elastic container fails with the following error:
4 years ago
0 Votes 1 Answers 1K Views
Hi there, any plan/benefit to support virtualenv >= 20?
4 years ago
0 Votes 0 Answers 1K Views
Hi, I encountered a bug on clearml-server 1.0.1: I tried to add in a project page a custom column in +HYPER PARAMETERS > Args > queue and got an error pop up...
3 years ago
0 Votes 4 Answers 1K Views
Hi guys, I got a very unexpected error today in one of my agents: ... Collecting tqdm Using cached tqdm-4.48.2-py2.py3-none-any.whl (68 kB) Processing /ro...
4 years ago
0 Votes 30 Answers 1K Views
Hi there 🙂 Task.get_parameters() returns an empty dict from within a trains-agent task being executed. When I execute it outside, it works properly. Is it i...
4 years ago
0 Votes 6 Answers 1K Views
Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...
3 years ago
0 Votes 30 Answers 2K Views
Hi, I am giving clearml-session another try and I am blocked by the following error, shown when the CLI tries to establish the tunneling: Starting SSH tunnel W...
2 years ago
0 Votes 3 Answers 1K Views
Hi, in a subproject, would it be possible to hide the parent project if it is empty?
3 years ago
0 Votes 3 Answers 1K Views
Hi guys, since I am done with implementing the AWS autoscaler, I would like to share some pain points that I encountered in the process with the hope that th...
aws
3 years ago
0 Votes 4 Answers 1K Views
2 years ago
0 Votes 1 Answers 1K Views
Hi, in the "Choose compared experiments" view of the WebUI, would it be possible to add a toggle to include archived experiments in the results of the search...
2 years ago
0 Votes 5 Answers 1K Views
2 years ago
0 Votes 10 Answers 1K Views
Hey guys, I am setting up a new machine with two rtx 3070 GPUs where I created two agents (one for each GPU). On both agents, my experiments fail with error:...
4 years ago
0 Votes 9 Answers 1K Views
Hi, I want to upgrade clearml server from 1.1 to 1.2 (self hosted). I have the following setup: /dev/nvme0n1p1 30G 21G 8.9G 70% / <- This is where /opt/clear...
2 years ago
0 Votes 10 Answers 1K Views
Hi guys, any plan to integrate the https://github.com/allegroai/trains-agent/blob/master/examples/dynamic_cloud_cluster.ipynb in trains-server? The code ther...
4 years ago
0 Votes 2 Answers 1K Views
Hi, I recently updated my clearml to 1.1.2 and code that was working before now behaves completely differently: I am using the following to log debug sampl...
3 years ago
0 Votes 26 Answers 1K Views
Hi, I attached an IAM role to an ec2 instance to grant access to an s3 bucket. The ec2 instance is running a clearml-agent (v1.1.0). I didn’t specify any key...
aws
3 years ago
0 Votes 8 Answers 967 Views
3 years ago
0 Votes 1 Answers 997 Views
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
3 years ago
0 Votes 2 Answers 1K Views
Is there an option to make trains-agent create experiment virtualenvs with --system-site-packages parameter?
4 years ago
0 Votes 1 Answers 998 Views
Hi, I have a clearml-agent (1.1.2) in a g4dn.4xlarge AWS instance (with one T4 GPU), that reports agent.cuda_version = 0 agent.cudnn_version = 0 and does not ...
2 years ago
0 Votes 30 Answers 1K Views
Hi again, my clearml api-server has a memory leak. Each time I restart it, its RAM consumption grows until it goes OOM, it is not killed, and makes the ec2 ...
3 years ago
0 Votes 19 Answers 1K Views
one year ago
0 Votes 5 Answers 1K Views
aws
3 years ago
0 Votes 10 Answers 1K Views
Hi, I have a local package that I use to train my models. To start training, I have a script that calls task._update_requirements([".", "torch==1.11.0"]) . I...
2 years ago
0 Hi, I Recently Updated Clearml-Server To 1.7 And I Am Getting A Lot Of The Following Errors Since Today On Any Experiment (I Didn't Have This Error Before):

This is the mapping of the faulty index:
` {
  "events-plot-d1bd92a3b039400cbafc60a7a5b1e52b_new" : {
    "mappings" : {
      "dynamic" : "strict",
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "iter" : {
          "type" : "long"
        },
        "metric" : {
          "type" : "keyword"
        },
        "plot_data" : {
          "type" : "binary"
        },
        "plot_len" : {
          "type" : "long"
        },
        "plot_str" : {
...

2 years ago
0 Hi

Awesome! (Broken link in migration guide, step 3: https://allegro.ai/docs/deploying_trains/trains_server_es7_migration/ )

4 years ago
0 Hi

MagnificentSeaurchin79 You could also just fork the tensorflow repo, make changes in a specific branch and specify your forked repo with your custom branch in the install_requires of your setup.py

3 years ago
0 Hi Guys, Any Plan To Integrate The

Both ^^, I already adapted the code for GCP and I was planning to adapt it to Azure now

4 years ago
0 Hello, I Am Getting `ValueError: Could Not Get Access Credentials For '

And I can verify that ~/trains.conf exists in the su home folder

4 years ago
0 Hi, I Have Another Problem

thanks, I will do that

4 years ago
0 Hello There, I Would Like To Run Cleanup Code In Case The User Aborts One Task From The Dashboard (The Agent Is Not Using The Task In Docker). What Signal Should I Listen For In The Task?

Also maybe we are not on the same page - by clean up, I mean kill a detached subprocess on the machine executing the agent

4 years ago
0 Hi Guys, Is A Task Updating Its Status To 'Complete' Before It Finishes Uploading Its Artifacts/Metrics In The Background?

I want to make sure that an agent did finish uploading its artifacts before marking itself as complete, so that the controller does not try to access these artifacts while they are not available

4 years ago
0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

no, one worker (trains-agent-1) "forgets from time to time" the experiment it is currently running and picks up another experiment on top of it

4 years ago
0 Hello, I Am Getting `ValueError: Could Not Get Access Credentials For '

I will probably just use an absolute path everywhere to be robust against different machine user accounts: /home/user/trains.conf

4 years ago
0 Hi, I Would Like To Follow-Up In This

Hi AppetizingMouse58 , I sent you the files in PM 🙂

2 years ago
0 Hi, I Am Getting The Following Errors In The Experiments I Am Currently Running:

Would adding an ILM (index lifecycle management) policy be an appropriate solution?

3 years ago
0 Hello There, I Would Like To Run Cleanup Code In Case The User Aborts One Task From The Dashboard (The Agent Is Not Using The Task In Docker). What Signal Should I Listen For In The Task?

Ok, but that means this cleanup code should live somewhere else than inside the task itself right? Otherwise it won't be executed since the task will be killed

4 years ago
0 Hi, I Have A clearml-agent (1.1.2) In A g4dn.4xlarge AWS Instance (With One T4 GPU), That Reports

Never mind, the nvidia-smi command fails in that instance, so the problem lies somewhere else

2 years ago
0 Hi Again, I Am Trying To Make The AWS Autoscaler Work With EC2 Instances, But It Fails To Set Up The Agent In The Machine: The Logs Of The User-Data Script Show That It Fails Updating The Machine (See Below)

I think waiting for the apt locks to be released with something like this would work:
startup_bash_script = [
    "#!/bin/bash",
    "while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done",
    "sudo apt-get update",
    ...
Weirdly, this throws an error in the autoscaler:
` Spinning new instance type=v100_spot
Error: Failed to start new instance, unexpected '{' in field...
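
The `unexpected '{' in field` message suggests that the shell brace expansion in the `fuser` line collides with some formatting step applied to the script. That is an assumption and is not confirmed in this thread. A minimal sketch of one possible workaround is to expand the braces by hand, so the script contains no `{` or `}` at all. The helper name `build_startup_script` and its parameters are hypothetical and exist only for illustration; the three lock paths are simply what the brace pattern above expands to.

```python
# Hypothetical helper (not part of clearml): builds the startup script
# with the apt lock paths spelled out, so no '{' or '}' appears in it.
APT_LOCK_FILES = [
    "/var/lib/dpkg/lock",
    "/var/lib/apt/lists/lock",
    "/var/cache/apt/archives/lock",
]


def build_startup_script(lock_files=APT_LOCK_FILES, poll_seconds=5):
    """Return the startup script as a list of lines."""
    # Same wait loop as above, with the brace pattern expanded by hand.
    wait_for_apt = (
        "while sudo fuser " + " ".join(lock_files) + " >/dev/null 2>&1; "
        "do echo 'Waiting for other instances of apt to complete...'; "
        "sleep " + str(poll_seconds) + "; done"
    )
    return ["#!/bin/bash", wait_for_apt, "sudo apt-get update"]


startup_bash_script = build_startup_script()
```

Passing several files to `fuser` gives the shell the same arguments the brace pattern would have produced, so the wait loop should behave the same way.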

3 years ago
0 Hey There, Is It Possible For A ClearML Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

I guess I can have a workaround by passing the pipeline controller task id to the last step, so that the last step can download all the artifacts from the controller task.

2 years ago
0 Hi There, I Used

AgitatedDove14, my “uncommitted changes” ends with
...
if __name__ == "__main__":
    task = clearml.Task.get_task(clearml.config.get_remote_task_id())
    task.connect(config)
    run()
from clearml import Task
Task.init()

2 years ago
0 Hi There

So in my minimal reproducible example, it does work 🤣 very frustrating, I will continue searching for that nasty bug

4 years ago