JitteryCoyote63

215 Questions, 1023 Answers

Active since 10 January 2023

Last activity 3 months ago

Reputation

Badges 1

981 × Eureka!

Questions 215
Answers 1023

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi, I Am Using The Aws Autoscaler And Getting The Following Error While Trying To Spin Up Spot Instances:

Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...

aws mlops

4 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi There, Is It Possible To Configure The Clearml-Agent To Run Some Commands Before Running Each Experiment It Launches? Eg.

Hi there, is it possible to configure the clearml-agent to run some commands before running each experiment it launches? Eg. echo "test" > "test.txt" && <-- ...

clearml

4 years ago

0 Votes

5 Answers

2K Views

0 Votes 5 Answers 2K Views

Hi, I Have A Long Running Experiment That Was Running On Aws Instance That Got Killed After ~4 Days With The Following Reason:

Hi, I have a long running experiment that was running on AWS instance that got killed after ~4 days with the following reason: STATUS REASON: Forced stop (no...

clearml

3 years ago

0 Votes

10 Answers

2K Views

0 Votes 10 Answers 2K Views

Hey, What Is The Exact Difference Between

Hey, what is the exact difference between agent.package_manager.system_site_packages and trains-agent --install-globally ?

clearml

5 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Congrats On The Clearml-Serving 0.9.0 Release! I’Ll Try It For Sure!

Congrats on the clearml-serving 0.9.0 release! I’ll try it for sure!

clearml

3 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi, I Have A Configuration File That I Read And Connect To My Training Tasks. I Cannot Use

Hi, I have a configuration file that I read and connect to my training tasks. I cannot use config = task.get_parameters_as_dict()["General"]["param"]["nested...

clearml

3 years ago

0 Votes

20 Answers

2K Views

0 Votes 20 Answers 2K Views

Is It Possible To Run An Agent, Listen To The Services Queue Without Using Docker?

Is it possible to run an agent, listen to the services queue without using docker?

clearml

5 years ago

0 Votes

19 Answers

2K Views

0 Votes 19 Answers 2K Views

Hi Again, I Am Trying To Make The Aws Autoscaler Work With Ec2 Instances, But It Fails To Setup The Agent In The Machine: The Logs Of The User-Data Script Show That It Fails Updating The Machine (See Below)

Hi again, I am trying to make the aws autoscaler work with ec2 instances, but it fails to setup the agent in the machine: the logs of the user-data script sh...

aws mlops

4 years ago

0 Votes

2 Answers

1K Views

0 Votes 2 Answers 1K Views

Hi, In The Metric Snapshot Graph, Is It Possible To Scale The Y Axis To

Hi, in the Metric Snapshot graph, is it possible to scale the Y axis to [y_min *0.9, y_max * 1,1] ? currently all my values are flat at the same ~y and it is...

clearml

3 years ago

0 Votes

30 Answers

2K Views

0 Votes 30 Answers 2K Views

Hi, I Am Getting The Following Errors In The Experiments I Am Currently Running:

Hi, I am getting the following errors in the experiments I am currently running: 2021-06-25 17:11:47,911 - clearml.Metrics - ERROR - Action failed <504/0: ev...

clearml

4 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hi, I See That There Is A New Parameter In Aws Autoscaler:

Hi, I see that there is a new parameter in aws autoscaler: max_spin_up_time_min - What is the difference with max_idle_time_min ?

aws

4 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi Guys; Another Idea: Would Be Very Cool To Have A Mattermost Alert (Monitor Task), Just Like The One For Slack. Have A Nice Week-End All

Hi guys; another idea: would be very cool to have a mattermost alert (monitor task), just like the one for Slack. Have a nice week-end all 👋

clearml

4 years ago

0 Votes

4 Answers

2K Views

0 Votes 4 Answers 2K Views

The “Manage Queue” Option In The Right Tab On A Queued Experiment Is Broken In V1.0 (It Does Nothing)

The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)

clearml

4 years ago

0 Votes

30 Answers

2K Views

0 Votes 30 Answers 2K Views

Hi, If I Am Starting My Training With The Following Command:

Hi, if I am starting my training with the following command: python -u -m torch.distributed.launch --nproc_per_node=2 --use_env train.py --config configs/tra...

clearml

3 years ago

0 Votes

11 Answers

2K Views

0 Votes 11 Answers 2K Views

Hi Guys, Following Up On This

Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...

clearml

5 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hey! Would It Be Possible To Tag The Rc Releases In The Different Repos? So That One Knows What Is Inside?

Hey! Would it be possible to tag the RC releases in the different repos? So that one knows what is inside?

clearml

5 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Hi, There Is A Small Bug With Auto-Refreshing In The Debug Samples Tab Of The Web Ui: If It Is On, Then It Will Always Force The First Series To Be Displayed, Regardless Of What Has Been Selected By The User. Eg: If I Have Two Metrics,

Hi, there is a small bug with auto-refreshing in the DEBUG SAMPLES Tab of the Web UI: If it is ON, then it will always force the first series to be displayed...

clearml

3 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi, How Does

Hi, how does agent.enable_git_ask_pass works? I am using the clearml-agent in docker mode and my experiment is stuck at downloading a private dependency: Clo...

mlops

2 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi Again, Is There A Way To Pass Secrets As Parameters Of A Task? I Have An Experiment That Requires Connecting To A Database, And I Need To Be Able To Pass The Creds As Task Params (Or In Another Way, I Don'T Know Yet). But I Don'T Want To Expose My Cred

Hi again, is there a way to pass secrets as parameters of a task? I have an experiment that requires connecting to a database, and I need to be able to pass ...

clearml

4 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi, I Cannot Manage To Start Trains-Server 0.16 With The Docker-Compose File, The Trains-Elastic Container Fails With The Following Error:

Hi, I cannot manage to start trains-server 0.16 with the docker-compose file, the trains-elastic container fails with the following error:

clearml

5 years ago

0 Votes

22 Answers

2K Views

0 Votes 22 Answers 2K Views

Hi, I Would Like To Switch From The Elastic-Search Service In The Docker-Compose Of The Clearml-Server To An Externally Managed, Scalable Elastic-Search Cluster. I Have Two Questions:

Hi, I would like to switch from the elastic-search service in the docker-compose of the clearml-server to an externally managed, scalable elastic-search clus...

clearml

4 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Hi, I Have A Question About

Hi, I have a question about https://clear.ml/docs/latest/docs/references/sdk/logger#report_scatter3d : Would it be possible to pass a matplotlib figure in 3d...

clearml

3 years ago

0 Votes

30 Answers

3K Views

0 Votes 30 Answers 3K Views

Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

Hello, I am getting ValueError: Could not get access credentials for ' s3://my-bucket ' , check configuration file ~/trains.conf but I did specify them in my...

clearml

5 years ago

0 Votes

5 Answers

2K Views

0 Votes 5 Answers 2K Views

Hi, I Have An Error With Clearml-Agent 1.5.1 When Importing Tensorflow 2.10

Hi, I have an error with clearml-agent 1.5.1 when importing tensorflow 2.10 from tensorflow.python.client._pywrap_tf_session import * File "/root/.clearml/ve...

tensorflow

2 years ago

0 Votes

11 Answers

2K Views

0 Votes 11 Answers 2K Views

Hi, Some Properties Of The Task Object Are Not Listed In The Documentation (Such As Task.Parent, Which Is Not Clear Whether It Is The Parent Task Object Itself Or The Id Of The Parent Task).

Hi, some properties of the Task object are not listed in the documentation (such as task.parent, which is not clear whether it is the parent task object itse...

clearml

5 years ago

0 Votes

8 Answers

2K Views

0 Votes 8 Answers 2K Views

Hi! I Have A Question Regarding Performances Of The Clearml-Server: Are The Calls From The Agents Made Asynchronously/In A Non Blocking Separate Thread? Is The Connection To The Clearml-Server Expected To Be A Bottleneck If The Clearml-Server Is Far From

Hi! I have a question regarding performances of the clearml-server: are the calls from the agents made asynchronously/in a non blocking separate thread? is t...

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hello, Pytorch 1.8 Was Released, Bringing Amd Wheels With It > Pip Install Torch -F

Hello, Pytorch 1.8 was released, bringing AMD wheels with it > pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html Is ClearML s...

clearml

4 years ago

0 Votes

13 Answers

2K Views

0 Votes 13 Answers 2K Views

Trains-Elastic | {"Type": "Server", "Timestamp": "2020-12-07T15:19:11,101Z", "Level": "Error", "Component": "O.E.B.Elasticsearchuncaughtexceptionhandler", "Cluster.Name": "Trains", "Node.Name": "Trains", "Message": "Uncaught Exception In Thread [Main]",

trains-elastic | {"type": "server", "timestamp": "2020-12-07T15:19:11,101Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "c...

clearml

4 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi, A Small Bug (Not Really A Bug) In The Autoscaler: I Have P3.2Xlarge Instances That Take A Long Time To Shutdown. With

Hi, a small bug (not really a bug) in the autoscaler: I have p3.2xlarge instances that take a long time to shutdown. With polling_interval_time_min=1 , the a...

mlops

4 years ago

0 Votes

20 Answers

2K Views

0 Votes 20 Answers 2K Views

Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

Hello, I have an error while installing git dependencies of local package: So far I used task. update _requirements(“[.]“) with my local package referencing ...

clearml

4 years ago

Show more results

0 Hello There, I Would Like To Do Run Cleanup Code In Case The User Aborts One Task From The Dashboard (The Agent Is Not Using The Task In Docker). What Signal Should I Listen For In The Task?

I am looking for a way to gracefully stop the task (clean up artifacts, shutdown backend service) on the agent

4 years ago

0 Hey, What Is The Exact Difference Between

I tested by installing flask in the default env -> which was installed in the ~/.local/lib/python3.6/site-packages folder. Then I created a venv with flag --system-site-packages . I activated the venv and flask was indeed available

5 years ago

0 Hi Guys For The Aws Auto-Scaler I Need To Access Aws Ssm Or Create .Env File Locally When Using The Init Script. Has Anyone Done This?

Could you please share the stacktrace?

4 years ago

0 Hey, I Moved My Trains-Server To Another Machine, Zipping The /Opt/Trains/Data Folder As Described In The Docs

I was able to fix by applying for a license and registering it

5 years ago

0 Hi, I Face A Strange Behavior From The Clearml-Agent: It’S Running In Services Mode, Not In Docker Mode, Cpu Only. I Want To Execute Two Tasks On This Service Agent. One Works, The Other Always Fails After Being Enqueued And Picked By The Agent With The E

Oof now I cannot start the second controller in the services queue on the same second machine, it fails with
` Processing /tmp/build/80754af9/cffi_1605538068321/work
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/cffi_1605538068321/work'
clearml_agent: ERROR: Could not install task requirements!
Command '['/home/machine/.clearml/venvs-builds.1.3/3.6/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r'...

4 years ago

0 Looks Like Trains-Agent 0.16

Thanks, I will create an issue. I am fine with both ways :)

5 years ago

0 Hi, Together With

Which commit corresponds to RC version? So far we tested with latest commit on master (9a7850b23d2b0e1f2098ab051de58ce806143fff)

5 years ago

0 Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

I also would like to avoid any copy of these artifacts on s3 (to avoid double costs, since some folders might be big)

3 years ago

0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

should I try to roll back to clearml-server 1.0.2? I am very anxious now…

4 years ago

Alright SuccessfulKoala55 I was able to make it work by downgrading clearml-agent to 0.17.2

4 years ago

0 Hi, How Can I Get The Logs From The Pytorch Ignite Early Stopping Handler To Be Logged In Clearml?

AgitatedDove14 yes but I don't see in the docs how to attach it to the logger of the earlystopping handler

4 years ago

0 Hi There, I Am Running A Clearml-Agent In Services Mode (With Docker) On A Machine With Two Disks: One With The Os (8Go, 91% Space Used) And One For The Data (100Go, 40% Space Used). When Executing The Auto-Scaler Task In This Agent, I Get The Following E

AgitatedDove14 Is it possible to shut down the server while an experiment is running? I would like to resize the volume and then restart it (should take ~10 mins)

4 years ago

0 Hi, I Updated To Clearml-Server 1.4.0 And I Am Uncomfortable With The New Table/Detail View, Is There A Way To Disable It And Use The Previous One (On Click -> Open Details)?

I guess I’ll get used to it 😄

3 years ago

0 Hi, Is It Possible To Disable Some Of The System Metrics Monitored? And Also Downsample The Rate Of Logging?

Nice, thanks!

4 years ago

The file /tmp/.clearml_agent_out.j7wo7ltp.txt does not exist

4 years ago

0 Hi, I Am Trying To Use The Clearml-Agent In Docker Mode To Run An Experiment, But It Seems To Fail Passing The Clearml.Conf File To The Docker Container:

might be worth documenting 😄

2 years ago

Will it freeze/crash/break/stop the ongoing experiments?

4 years ago

0 Hi There, It Seems Like There Is A Bug With The Visualization Of Debug Samples On The Ui (Server V1.2.0, Self-Hosted): When Clicking On A Debug Sample Then On The Download Button, If The Sample Is Stored In S3, The Download Button Opens A Blank Page With

AgitatedDove14 I am actually considering rolling back to 1.1.0, so 1.3.0 is not really an option for now

3 years ago

0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

yes what happens in the case of the installation with pip wheels files?

4 years ago

0 Hi, I Am Trying To Use The Clearml-Agent In Docker Mode To Run An Experiment, But It Seems To Fail Passing The Clearml.Conf File To The Docker Container:

When installed with http://get.docker.com , it works

2 years ago

0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

as it's also based on pytorch-ignite!

I am not sure to understand, what is the link with pytorch-ignite?

We're in the brainstorming phase of what are the best approaches to integrate, we might pick your brain later on

Awesome, I'd be happy to help!

4 years ago

0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

AnxiousSeal95 Any update on this topic? I am very excited to see where this can go 🤩

4 years ago

0 Hi, Coming Back With The Venv Caching: With The Following Setting:

Yes, not sure it is connected either actually - To make it work, I had to disable both venv caching and set use_system_packages to off, so that it reinstalls the full env. I remember that we discussed this problem already but I don't remember what was the outcome, I never was able to make it update the private dependencies based on the version. But this is most likely a problem from pip that is not clever enough to parse the tag as a semantic version and check whether the installed package ma...

4 years ago

0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

torch==1.7.1 git+ .

4 years ago

0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

Is it because I did not specify --gpu 0 that the agent, by default pulls one experiment per available GPU?

5 years ago

0 Hello, Is It Possible For The Clearml-Agent In Docker Mode To Not Pull A Specific Docker Image, But To Build One From The Experiment Repository Using The Dockerfile And .Dockerignore Of The Experiment Repository?

Ho nice, thanks for pointing this out!

3 years ago

0 Hi Again, I Am Trying To Make The Aws Autoscaler Work With Ec2 Instances, But It Fails To Setup The Agent In The Machine: The Logs Of The User-Data Script Show That It Fails Updating The Machine (See Below)

5,6 mins exactly

4 years ago

0 Hi, How Can I Change The Project.Default_Output_Destination? I Tried Setting It To None But It Is Not Updated

Thanks AgitatedDove14 ! I created a project with a default output destination to a s3 bucket but I don't have local access to this bucket (only agents have access to it for security reasons). Because of that, I cannot create a task in this project programmatically locally because it tries to access the bucket and fails. And there is no easy way to change the default output location (not in the web UI, not in the sdk)

3 years ago

0 Hi, I Just Updated Clearml Server 1.0 Using

yes exactly

4 years ago

/data/shared/miniconda3/bin/python /data/shared/miniconda3/bin/clearml-agent daemon --services-mode --detached --queue services --create-queue --docker ubuntu:18.04 --cpu-only

4 years ago

Show more results