AgitatedDove14 Yes exactly, I tried the fix suggested in the github issue (urllib3>=1.25.4) and the ImportError disappeared
Hi AgitatedDove14, that's super exciting news! 🤩
Regarding the two outstanding points:
In my case, I'd maintain a client python package that takes care of the pre/post processing of each request, so that I only send the raw data to the inference service and I post-process the raw output of the model returned by the inference service. But I understand why it might be desirable for the users to have these steps happening on the server. What is challenging in this context? Defining how t...
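To make it more concrete, this is roughly the shape of the client package I have in mind (the endpoint URL and the preprocess/postprocess functions are placeholders, not an existing API):

import requests

def preprocess(sample: dict) -> dict:
    # client-side preparation of the request, e.g. normalization / tokenization
    return sample

def postprocess(raw_output: dict) -> dict:
    # map the raw model output returned by the service back to something usable
    return raw_output

class InferenceClient:
    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def predict(self, sample: dict) -> dict:
        payload = preprocess(sample)
        response = requests.post(self.endpoint, json=payload, timeout=30)
        response.raise_for_status()
        return postprocess(response.json())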
Ok, now I would like to copy from one machine to another via scp, so I copied the whole /opt/trains/data folder, but I got the following errors:
Could be, but not sure -> from 0.16.2 to 0.16.3
now I can do nvcc --version and I get: Cuda compilation tools, release 10.1, V10.1.243
Thanks! I will investigate further, I am thinking that the AWS instance might have been stuck for an unknown reason (becoming unhealthy)
Thanks a lot, I will play with that!
No worries! I asked more to be informed, I don't have a real use case behind it. This means that you guys internally catch the argparser object somehow, right? Because you could also simply use sys.argv to find the parameters, right?
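Just to illustrate what I mean by the sys.argv route (a toy sketch, not how clearml actually does it):

import sys

def capture_cli_parameters(argv=None):
    # collect --key value pairs straight from the command line, no argparse needed
    argv = sys.argv[1:] if argv is None else argv
    params, key = {}, None
    for token in argv:
        if token.startswith("--"):
            key = token.lstrip("-")
            params[key] = True  # treat as a flag until a value follows
        elif key is not None:
            params[key] = token
            key = None
    return params

if __name__ == "__main__":
    print(capture_cli_parameters())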
Hi AgitatedDove14, how should we proceed to fix this bug? Should I open an issue on GitHub? Should I try to make a minimal reproducible example? It's blocking me atm
PS: in the new env, I've set num_replicas: 0, so I'm only talking about primary shards…
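(For the record, one way this can be applied on existing indices is through the Elasticsearch settings API; the index pattern and host below are just examples, assuming ES is reachable on localhost:9200:)

import requests

resp = requests.put(
    "http://localhost:9200/events-*/_settings",  # example index pattern, adjust to your indices
    json={"index": {"number_of_replicas": 0}},
    timeout=10,
)
print(resp.status_code, resp.json())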
Yea, the config is not appearing in the webUI anymore with this method
No I agree, it's probably not worth it
ClearML has a task.set_initial_iteration, I used it as such:
checkpoint = torch.load(checkpoint_fp, map_location="cuda:0")
Checkpoint.load_objects(to_load=self.to_save, checkpoint=checkpoint)
task.set_initial_iteration(engine.state.iteration)
But still the same issue, I am not sure whether I use it correctly and if it's a bug or not, AgitatedDove14? (I am using clearml 1.0.4rc1, clearml-agent 1.0.0)
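For context, the surrounding code is roughly this (the checkpoint path, the to_save dict and the ignite engine are placeholders standing in for my real training setup):

import torch
from clearml import Task
from ignite.engine import Engine
from ignite.handlers import Checkpoint

task = Task.init(project_name="demo", task_name="resume-from-checkpoint")  # placeholder names

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
trainer = Engine(lambda engine, batch: None)  # dummy update function
to_save = {"model": model, "optimizer": optimizer, "trainer": trainer}

# restore model / optimizer / engine state from the checkpoint file
checkpoint_fp = "checkpoint_1000.pt"  # placeholder path
checkpoint = torch.load(checkpoint_fp, map_location="cuda:0")
Checkpoint.load_objects(to_load=to_save, checkpoint=checkpoint)

# tell ClearML which iteration training resumes from, so the scalars line up
task.set_initial_iteration(trainer.state.iteration)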
Yes that's correct - the weird thing is that the error shows the right detected region
and the agent says agent.cudnn_version = 0
And I do that each time I want to create a subtask. This way I am sure to retrieve the task if it already exists
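(Roughly the get-or-create pattern I mean, with illustrative names:)

from clearml import Task

def get_or_create_subtask(project_name: str, task_name: str) -> Task:
    # look for an existing task with the same name first
    existing = Task.get_tasks(project_name=project_name, task_name=task_name)
    if existing:
        return existing[0]
    # otherwise create a fresh one
    return Task.create(project_name=project_name, task_name=task_name)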
AgitatedDove14 Is it fixed with trains-server 0.15.1?
Is there any channel where we can see when new self-hosted server versions are published?
See my answer in the issue - I am not using docker
This is consistent: each time I send a new task to the default queue, if trains-agent-1 has only one task running (the long one), it will pick another one. If I add one more experiment to the queue at that point (trains-agent-1 running two experiments at the same time), that experiment will stay in the queue (trains-agent-2 and trains-agent-3 will not pick it because they are also running experiments)