Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...
4 years ago
0 Votes
23 Answers
2K Views
0 Votes 23 Answers 2K Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
5 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hey there ๐Ÿ™‚ Still my journey to deploy the aws-autoscaler with spot instances, I have another question: I would like to limit the amount of time spent setti...
4 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi again, it seems like the aws autoscaler is not spinning instances with the EBS configuration I configured. Here is the configuration: resource_configurati...
4 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, where can I find the logs of trains-agent by default?
5 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi, is it possible to pass environment variables to agents created by the AWS AutoScaler service?
4 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
โš ๏ธ Hi there, I recently updated clearml server to 1.7.0, and found the following critical regression: When I reset an experiment, it is actually deleted ๐Ÿ˜ต ,...
2 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? or should make sure only the first process calls it first, then others
3 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hey there, since which version, clearml stops connecting to the demo server by default?
4 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hi guys, I got a very unexpected error today on in one of my agents: ... Collecting tqdm Using cached tqdm-4.48.2-py2.py3-none-any.whl (68 kB) Processing /ro...
5 years ago
0 Votes
23 Answers
2K Views
0 Votes 23 Answers 2K Views
Hi, I would like to bring awareness on this issue , this impacts my work as I cannot install the older version of torch (1.11.0)
2 years ago
0 Votes
0 Answers
2K Views
0 Votes 0 Answers 2K Views
Hi, I encountered a bug on clearml-server 1.0.1: I tried to add in a project page a custom column in +HYPER PARAMETERS > Args > queue and got an error pop up...
4 years ago
0 Votes
0 Answers
2K Views
0 Votes 0 Answers 2K Views
Hi all, Would it be possible to make the aws autoscaler log each scale in/out operation in the console to help debugging/understanding the course of events?
4 years ago
0 Votes
25 Answers
2K Views
0 Votes 25 Answers 2K Views
Hi, I have another problem ๐Ÿ˜… in one of my agent, one experiment started without torch using GPU. In the logs of the experiment shared below, we can see that...
5 years ago
0 Votes
16 Answers
2K Views
0 Votes 16 Answers 2K Views
Got some errors while running migration script from ES5 to ES7: 2020-08-11 15:21:50,130 Running on: Linux 2020-08-11 15:21:50,227 Docker allocated memory: 16...
5 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi there, I would like to report a bug with the resizing of the columns in the projects view: it doesn’t work as expected. Please look at the behavior of the...
4 years ago
0 Votes
26 Answers
2K Views
0 Votes 26 Answers 2K Views
Hi, I would like to follow-up in this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...
3 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)
4 years ago
0 Votes
11 Answers
2K Views
0 Votes 11 Answers 2K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
4 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
4 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hi, in the Metric Snapshot section of the Overview tab of a project page, would it be possible to: Show running experiments Have the legend clickable, to hid...
3 years ago
0 Votes
1 Answers
2K Views
0 Votes 1 Answers 2K Views
Small error in doc: https://allegro.ai/docs/references/trains_agent_ref/#daemon The detach parameter is shown in the command as --detached while it is listed...
5 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Could you please explain a bit more how trains adapt the torch version depending on the installed cuda version? Here is my setup: cuda 102 installed and corr...
5 years ago
0 Votes
1 Answers
2K Views
0 Votes 1 Answers 2K Views
4 years ago
0 Votes
12 Answers
2K Views
0 Votes 12 Answers 2K Views
Hey, would it possible to add an option to make task.upload_artifact() blocking? (Not running in background)
5 years ago
0 Votes
18 Answers
2K Views
0 Votes 18 Answers 2K Views
Hi Guys, I had several times now the following errors poping in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
4 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Another one: What is the difference between Task.connect() and Task.set_parameter?
5 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
aws
3 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
3 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
4 years ago
Show more results questions
0 Hi There, I Have A Problem With Pyjwt: I Am Using

so most likely one hard requirement installs the version 2 of pyjwt while setting up the experiment

4 years ago
0 Hi, In A Subproject, Would It Be Possible To Hide The Parent Project If It Is Empty?

I mean, inside a parent, do not show the project [parent] if there is nothing inside

3 years ago
3 years ago
0 Hi, It Seems That The

But according the the example, this syntax should be supported right?

5 years ago
0 I'M Getting A Lot Of Errors When Running Cleanup Service

What is this cleanup service? where is it available?

3 years ago
0 Hi, I Just Updated Clearml Server 1.0 Using

I added the pass_hashed and restarted the server, still get the same problem

4 years ago
0 Hi Again, I Am Trying To Make The Aws Autoscaler Work With Ec2 Instances, But It Fails To Setup The Agent In The Machine: The Logs Of The User-Data Script Show That It Fails Updating The Machine (See Below)

so what worked for me was the following startup userscript:
` #!/bin/bash
sleep 120
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get update
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get install -y python3-dev python3-pip gcc git build-essential...

4 years ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

So the problem comes when I do
my_task.output_uri = " s3://my-bucket , trains in the background checks if it has access to this bucket and it is not able to find/ read the creds

5 years ago
0 Hey There, Since A Bit I Often Find Experiments Being Stuck While Training A Model. It Seems To Happen Randomly And I Could Not Find A Reproducible Scenario So Far, But It Happens Often Enough To Be Annoying (I'D Say 1 Out Of 5 Experiments). The Symptoms

Hi AgitatedDove14 , sorry somehow this message got lost ๐Ÿ˜„
clearml version is the latest at the time, 1.7.1 Yes, I always see the "model uploaded completed" for such stuck tasks I am using python 3.8.10

3 years ago
0 Hi, I Just Updated Clearml Server 1.0 Using

This is what I get, when I am connected and when I am logged out (by clearing cache/cookies)

4 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

AgitatedDove14 According to the dependency order you shared, the original message of this thread isn't solved: the agent mentionned used output from nvcc (2) before checking the nvidia driver version (1)

4 years ago
0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

Yes it would be very valuable to be able to tweak that param, currently it's quite annoying because it's set to 30 mins, so when a worker is killed by the autoscaler, I have to wait 30 mins before the autoscaler spins up a new machine because the autoscaler thinks there is already enough agents available, while in reality the agent is down

2 years ago
0 Hi, I Would Like To Use Pytorch3D==0.5.0 With Torch==1.9.1 On Cuda Version 110, Locally It Works, But The Clearml Agent Fails Setting Up The Environment With The Following Error:

AgitatedDove14 Yes that might work, also the first one (with conda) might work as well, I will give it a try, thanks!

4 years ago
0 Hey Guys, I Am Setting Up A New Machine With Two Rtx 3070 Gpus Where I Created Two Agents (One For Each Gpu). On Both Agents, My Experiments Fail With Error:

Hi AgitatedDove14 , coming by after a few experiments this morning:
Indeed torch 1.3.1 does not support cuda, I tried with 1.7.0 and it worked, BUT trains was not able to pick the right wheel when I updated the torch req from 1.3.1 to 1.7.0: It downloaded wheel for cuda version 101. But in the experiment log, the agent correctly reported the cuda version (111). I then replaced the torch==1.7.0 with the direct https link to the torch wheel for cuda 110, and that worked (I also tried specifyin...

4 years ago
0 Hi, I Would Like To Report Something Else Weird In The Clearml-Agent 1.5.1 Running In Docker Mode: In The Logs, When It Dumps Its Config, It Writes:

Ok yes, I get it, this info is also available at the very beginning of the logs, where the agent logs the full docker run command, this docker_cmd is a shorter version?

2 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

thanks for clarifying! Maybe this could be clarified in the agent logs of the experiments with something like the following?
agent.cuda_driver_version = ... agent.cuda_runtime_version = ...

4 years ago
0 Hi There! Is There An Easy Way To Retrieve The Site-Package Directory That Was Created By An Agent From Inside A Task? Eg.

Now I'm curious, what did you end up doing ?

in my repo I maintain a bash script to setup a separate python env. then in my task I spawn a subprocess and I don't pass the env variables, so that the subprocess properly picks up the separate python env

2 years ago
0 Hi, I Just Updated Clearml Server 1.0 Using

The only thing that changed is the new auth.fixed_users.pass_hashed field, that I donโ€™t have in my config file

4 years ago
0 Hi, How Does

Ping CostlyOstrich36 AgitatedDove14 SuccessfulKoala55 Just making sure this wasn't missed ๐Ÿ™‚

2 years ago
0 Hi, Together With

Exactly

5 years ago
0 Hey, I Have One Question Regarding The Cleanup_Service Task In The Devops Project: Does It Assume That The Agent In Services Mode Is In The Trains-Server Machine?

Thanks TimelyPenguin76 and AgitatedDove14 ! I would like to delete artifacts/models related to the old archived experiments, but they are stored on s3. Would that be possible?

5 years ago
0 Hi Guys, Any Plan To Integrate The

Both ^^, I already adapted the code for GCP and I was planning to adapt to Azure now

5 years ago
0 Hi, I Would Like To Bring Awareness

This is not the case, I downloaded it and I got a cuda error at runtime

2 years ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

Hi AnxiousSeal95 , I hope you had nice holidays! Thanks for the update! I discovered h2o when looking for ways to deploy dashboards with apps like streamlit. Most likely I will use either streamlit deployed through clearml or h2o as standalone if ClearML won't support deploying apps (which is totally fine, no offense there ๐Ÿ™‚ )

4 years ago
0 Hey There, I Would Like To Increase The

now how to adapt to do it from extra_vm_bash_script ?

4 years ago
0 Hi There

btw task._get_task_property('hyperparams') also gives me ValueError: Task has no hyperparams section defined

5 years ago
Show more results compactanswers