SuccessfulKoala55 I am looking for ways to free some space and I have the following questions:
Is there a way to break down all the documents to identify the biggest ones?
Is there a way to delete several :monitor:gpu and :monitor:machine time series?
Is there a way to downsample some time series (e.g. loss)?
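For the first question, a minimal sketch of asking the Elasticsearch behind clearml-server to list its indices by size, assuming direct access to it (the host and port below are placeholders):

```python
import requests

# Ask the clearml-server Elasticsearch (host/port are placeholders) to list
# its indices sorted by on-disk size, largest first.
resp = requests.get(
    "http://localhost:9200/_cat/indices",
    params={"format": "json", "s": "store.size:desc"},
)
for idx in resp.json():
    print(idx["store.size"], idx["docs.count"], idx["index"])
```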
mmmh it fails, but if I connect to the instance and execute ulimit -n, I do see 65535
while the tasks I send to this agent fail with:
OSError: [Errno 24] Too many open files: '/root/.commons/images/aserfgh.png'
and from the task itself, I run:
import subprocess
print(subprocess.check_output("ulimit -n", shell=True))
which gives me in the logs of the task:
b'1024'
So nofile is still 1024, the default value, but not when I ssh, damn. Maybe rebooting would work
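For reference, a minimal sketch (not from the thread) of checking and raising the open-files limit from inside the task process itself, using Python's standard resource module:

```python
import resource

# Soft/hard limits for open file descriptors, as seen by this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"nofile soft={soft} hard={hard}")

# Raise the soft limit up to the hard limit; this only helps if the hard
# limit inside the agent/container is already high enough.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```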
Downloading the artifacts is done only when actually calling get()/get_local_copy()
Yes, I rather meant: reproduce this behavior even for getting metadata on the artifacts
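For context, a small sketch of the lazy-download behavior being discussed; the task ID and artifact name are placeholders:

```python
from clearml import Task

# Fetching the task and reading artifact metadata does not download anything
task = Task.get_task(task_id="<task-id>")
for name, artifact in task.artifacts.items():
    print(name, artifact.type, artifact.url)

# The file is only downloaded once get()/get_local_copy() is actually called
local_path = task.artifacts["<artifact-name>"].get_local_copy()
```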
Hi SuccessfulKoala55, AgitatedDove14,
I updated to 1.4.0 (Web UI shows: WebApp: 1.5.0-186 • Server: 1.5.0-186 • API: 2.18)
Unfortunately the bug is still there
I don't see errors in the console anymore though!
I had another look and modified an events.get_task_logs request with a super old timestamp to try to retrieve all logs, but it returned only the few logs already displayed in the console. So I think the problem doesn't come from the WebUI, but from the...
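For reference, a rough sketch of issuing that request through the ClearML APIClient; the endpoint name is the one mentioned above, while the parameter names are assumptions to be checked against your server's API version:

```python
from clearml.backend_api.session.client import APIClient

client = APIClient()  # uses the credentials from clearml.conf

# Endpoint name taken from the message above; "from_timestamp" is an assumed
# parameter name for the "super old timestamp" trick described here.
res = client.events.get_task_logs(task="<task-id>", from_timestamp=0)
print(res)
```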
ssh my-instance
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:O2++ST5lAGVoredT1hqlAyTowgNwlnNRJrwE8cbM...
Hi CumbersomeCormorant74 yes, this is almost the scenario: I have a dozen projects. In one of them, I have ~20 archived experiments in different states (draft, failed, aborted, completed). I went to this archive, selected all of them, and deleted them using the bulk delete operation. I got several failed-delete popups, so I tried again with smaller batches (like 5 experiments at a time) to localize the experiments causing the error. I could delete most of them. At some point, all ...
(docker was installed with sudo snap install docker)
with 1.1.1 I get:
User aborted: stopping task (3)
(by console you mean in the dashboard right? or the terminal?)
agent.package_manager.type = pip
...
Using base prefix '/home/machine1/miniconda3/envs/py36'
New python executable in /home/machine1/.trains/venvs-builds/3.6/bin/python3.6
Also creating executable in /home/machine1/.trains/venvs-builds/3.6/bin/python
Installing setuptools, pip, wheel...
Sorry, I didn't get that
Well actually I do see many errors like that in the browser console:
But we can easily extend, right?
AgitatedDove14 Didn't work
But I see in the agent logs:
Executing: ['docker', 'run', '-t', '--gpus', '"device=0"', ...
For some reason the configuration object gets updated at runtime to:
resource_configurations = null
queues = null
extra_trains_conf = ""
extra_vm_bash_script = ""
mmmh good point actually, I didn't think about it
TimelyPenguin76, no, I've only set the sdk.aws.s3.region = eu-central-1 param
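For reference, this is roughly where that parameter lives in clearml.conf (a sketch; values are placeholders):

```
sdk {
    aws {
        s3 {
            region: "eu-central-1"
            # key / secret could also go here -- the static credentials setup
            # that the thread wants to replace with IAM roles
        }
    }
}
```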
Yea I really need that feature, I need to move away from key/secrets to IAM roles
CostlyOstrich36 I updated both agents to 1.1.2 and still got the same problem unfortunately. Since I can download the full log file from the Web UI, I guess the agents are reporting correctly?
Could it be that elasticsearch does not return all the requested logs when it is queried by the WebUI to display them in the console?
Now that I think about it, I remember that the changelog of clearml-server 1.2.0 lists the following:
Fix UI Workers & Queues and Experiment Table pages ...
Hi AgitatedDove14, coming by after a few experiments this morning:
Indeed torch 1.3.1 does not support cuda, I tried with 1.7.0 and it worked, BUT trains was not able to pick the right wheel when I updated the torch req from 1.3.1 to 1.7.0: it downloaded the wheel for cuda version 101, even though in the experiment log the agent correctly reported the cuda version (111). I then replaced torch==1.7.0 with the direct https link to the torch wheel for cuda 110, and that worked (I also tried specifyin...
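To make that fix concrete, this is roughly how the torch line in the task's Installed Packages changed; the wheel URL below is a placeholder pattern, not an actual link:

```
# before: the agent resolved a cu101 wheel for this
torch==1.7.0
# after: a direct link to a cu110 wheel (placeholder, fill in the real wheel name)
https://download.pytorch.org/whl/cu110/torch-<version>-<python-tag>-linux_x86_64.whl
```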
Thanks for your answer! I am in the process of adding subnet_id/security_group_ids/key_name to the config to be able to ssh into the machine, will keep you informed
If I remove security_group_ids and just leave subnet_id in the configuration, it is not taken into account (the instances are created in the default subnet)
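For completeness, a sketch of how those fields would sit in one autoscaler resource configuration; the field names are the ones discussed in this thread, the values are placeholders, and the exact schema depends on the autoscaler version:

```
resource_configurations {
    my_resource {
        # ... other instance settings ...
        key_name = "<my-key-pair>"
        security_group_ids = ["<sg-id>"]
        subnet_id = "<subnet-id>"
    }
}
```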
BTW, is there any specific reason for not upgrading to clearml?
I just didn't have time so far
I will go for lunch actually, back in ~1h
Seems like it just went unresponsive at some point