Oops, I spoke too fast, the json is actually not saved in s3
So `get_registered_artifacts()` only works for dynamic artifacts, right? I am looking for a `download_artifacts()` which allows me to retrieve static artifacts of a Task
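For context, this is roughly how I expected to fetch it, as a minimal sketch (the task id is a placeholder; assuming the `artifacts` dict and `get_local_copy()` behave the way I think they do for static artifacts):
```python
from trains import Task  # `from clearml import Task` on newer versions

# Placeholder id of the task that uploaded the artifact
previous_task = Task.get_task(task_id="<previous_task_id>")

# Static (uploaded) artifacts are exposed through the `artifacts` dict;
# get_local_copy() downloads the file and returns the local path
local_json = previous_task.artifacts["foo"].get_local_copy()
print(local_json)
```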
nvm, the bug might be on my side. I will open an issue if I find an easily reproducible example
awesome! Unfortunately, calling `artifact["foo"].get()` gave me: Could not retrieve a local copy of artifact foo, failed downloading file:///checkpoints/test_task/test_2.fgjeo3b9f5b44ca193a68011c62841bf/artifacts/foo/foo.json
It tries to get it from the local storage, but the json is stored in s3 (it does exist) and I did create both tasks specifying the correct output_uri (pointing to s3)
Yes, thanks! In my case, I was actually using TrainsSaver from pytorch-ignite with a local path, then I understood looking at the code that under the hood it actually changes the output_uri of the current task, that's why my `previous_task.output_uri = "s3://my_bucket"` had no effect (it was placed BEFORE the training)
So `previous_task` actually ignored the `output_uri`
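A sketch of what I'll try instead (bucket and paths are placeholders, and the exact TrainsSaver signature may vary across pytorch-ignite versions): pass the s3 destination both at Task.init time and to the saver, so nothing switches the output_uri back to a local path.
```python
from trains import Task
from ignite.contrib.handlers.trains_logger import TrainsSaver  # import path may vary per ignite version

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri="s3://my_bucket",  # default upload destination for this task
)

# Give the saver the same s3 destination so it does not silently
# override the task's output_uri with a local path
saver = TrainsSaver(output_uri="s3://my_bucket", dirname="/checkpoints/my_task")
```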
mmmh probably yes, I can't say for sure (because I don't remember precisely when I upgraded to 0.17) but it looks like that
For new projects it works 🙂
I have the same problem, not only with subprojects but with all projects: I get this blank overview tab as shown in the screenshot. It only worked for one project, which I created one or two weeks ago under 0.17
yes, what happens in the case of installation with pip wheel files?
So when I create a task using `task = Task.init(project_name=config.get("project_name"), task_name=config.get("task_name"), task_type=Task.TaskTypes.training, output_uri="s3://my-bucket")` locally, the artifact is correctly logged remotely, but when I create the task remotely (from an agent) the artifact is logged locally (on the agent machine, not on s3)
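For completeness, this is roughly the whole flow (project and bucket names simplified), and the same code behaves differently locally vs. on the agent:
```python
from trains import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    task_type=Task.TaskTypes.training,
    output_uri="s3://my-bucket",  # where I expect artifacts to be uploaded
)

# Uploaded to s3 when run locally, but kept on the agent's disk when executed remotely
task.upload_artifact("foo", artifact_object={"some": "values"})
```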
To help you debug this: in the /dashboard endpoint, all projects were still there, but empty (no experiments inside). No archived experiments either.
Hi PompousParrot44, you could have a Controller task running in the services queue that periodically schedules the task you want to run
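A minimal sketch of what such a controller could look like (the template task id and the interval are placeholders):
```python
from time import sleep
from trains import Task

# Controller task meant to run in the "services" queue
Task.init(project_name="automation", task_name="periodic scheduler")

TEMPLATE_TASK_ID = "<template_task_id>"  # placeholder: the task you want to re-run

while True:
    # Clone the template and push the clone to the execution queue
    cloned = Task.clone(source_task=TEMPLATE_TASK_ID, name="scheduled run")
    Task.enqueue(cloned, queue_name="default")
    sleep(60 * 60)  # once an hour
```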
AgitatedDove14 In my case I'd rather have it under the "Artifacts" tab because it is a big json file
I also tried setting `ebs_device_name = "/dev/sdf"` - didn't work
I think waiting for the apt locks to be released with something like this would work: `startup_bash_script = [ "#!/bin/bash", "while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done", "sudo apt-get update", ...`
Weirdly this throws an error in the autoscaler:
`Spinning new instance type=v100_spot`
`Error: Failed to start new instance, unexpected '{' in field...`
Yes, actually that's what I am doing, because I have a task C depending on tasks A and B. Since a Task cannot have two parents, I use one task id (task A) as the parent id and the other one (the id of task B) as a hyper-parameter, as you described 🙂
That's how I would do it, maybe the guys from allegro-ai can come up with a better approach 🙂
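Roughly what I mean, as a sketch (task ids are placeholders; `connect()` registers the second id as a hyper-parameter):
```python
from trains import Task

# Task C reads its first dependency from the parent field
# and the second one from a hyper-parameter
task_c = Task.init(project_name="my_project", task_name="task_c")

params = {"task_b_id": "<task_b_id>"}  # placeholder, overridden when cloning/enqueuing C
task_c.connect(params)

task_a = Task.get_task(task_id=task_c.data.parent)    # dependency 1: via the parent id
task_b = Task.get_task(task_id=params["task_b_id"])   # dependency 2: via the hyper-parameter
```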
The task requires this service, so the task starts it on the machine. Then I want to make sure the service is closed by the task upon completion/failure/abort
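What I have in mind, as a rough sketch (the service command is a placeholder; this only covers cases where the Python process exits cleanly):
```python
import atexit
import subprocess

from trains import Task

task = Task.init(project_name="my_project", task_name="task_with_service")

# Start the service this task depends on (placeholder command)
service = subprocess.Popen(["my_service", "--serve"])

def _shutdown_service():
    # Called when the task process exits, whether it completed or failed
    service.terminate()
    service.wait(timeout=30)

atexit.register(_shutdown_service)

# ... the actual work happens here ...
```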
as it's also based on pytorch-ignite!
I am not sure I understand, what is the link with pytorch-ignite?
We're in the brainstorming phase of what the best approaches to integrate are; we might pick your brain later on
Awesome, I'd be happy to help!
Hi AgitatedDove14, coming by after a few experiments this morning:
Indeed torch 1.3.1 does not support cuda, I tried with 1.7.0 and it worked, BUT trains was not able to pick the right wheel when I updated the torch requirement from 1.3.1 to 1.7.0: it downloaded the wheel for cuda version 101, although in the experiment log the agent correctly reported the cuda version (111). I then replaced torch==1.7.0 with the direct https link to the torch wheel for cuda 110, and that worked (I also tried specifyin...
Hi NonchalantHedgehong19, thanks for the hint! What should the content of the requirements file be then? Can I specify my local package inside? How?
Very cool! Run two trains-agent daemons, one per GPU on the same machine, with the default Nvidia/CUDA Docker
This is close to my use case, I just would like to run these two daemons without docker, would that be possible? I should just remove the `--docker nvidia/cuda` param, right?
`trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 1 --queue default &`
Although `task.data.last_iteration` is correct when resuming, there is still this doubling effect when logging metrics after resuming
Still investigating, `task.data.last_iteration` is correct (equal to `engine.state["iteration"]`) when I resume the training
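What I'm experimenting with to avoid the overlap, as a sketch (assuming `continue_last_task` and `set_initial_iteration` behave the way I think they do; names are placeholders):
```python
from trains import Task

# Resume the previous run under the same task id
task = Task.init(
    project_name="my_project",
    task_name="resume_test",
    continue_last_task=True,
)

# Shift the reporting offset to the last logged iteration so resumed metrics
# are appended after the existing ones instead of overlapping them
task.set_initial_iteration(task.data.last_iteration)
```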