JitteryCoyote63

214 Questions, 1021 Answers

Active since 10 January 2023

Last activity 7 months ago

Reputation

Badges 1

979 × Eureka!

0 Votes

2 Answers

1K Views

Hi, I Recently Updated My Clearml To 1.1.2 And A Code That Was Working Before Now Behaves Completely Differently: I Am Using The Following To Log Debug Samples:

Hi, I recently updated my clearml to 1.1.2 and a code that was working before now behaves completely differently: I am using the following to log debug sampl...

clearml

3 years ago

0 Votes

19 Answers

1K Views

I Guess One Experiment Is Running Backwards In Time

I guess one experiment is running backwards in time 😄

clearml

2 years ago

0 Votes

8 Answers

979 Views

Hi, Is It Possible To Pass Temporary Iam Role To The Web App Could Access?

Hi, is it possible to pass temporary IAM role to the web app could access?

clearml

3 years ago

0 Votes

15 Answers

1K Views

Hi, How Can I Get The Logs From The Pytorch Ignite Early Stopping Handler To Be Logged In Clearml?

Hi, how can I get the logs from the pytorch ignite early stopping handler to be logged in clearml?

pytorch

3 years ago

0 Votes

28 Answers

1K Views

Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

Hi, I am trying to use omegaconf with task.connect_configuration and I get the following error: >>> OmegaConf.create(task.connect_configuration(config_dict))...

clearml

2 years ago

0 Votes

3 Answers

941 Views

Hey There, I See That In The Autoscaler Configuration, The

Hey there, I see that in the autoscaler configuration, the queues param accepts dictionaries with values of type list of lists (see e.g. below). What does it me...

mlops

3 years ago

0 Votes

5 Answers

1K Views

Hey There, Would It Be Possible To Make Clearml-Agents Support Both Docker Mode And Venv Mode At The Same Time? Ie. Not Requiring To Be Restarted To Switch The Mode. The Mode Should Be Define On The Task Level: I Start An Experiment And Define Whether It

Hey there, Would it be possible to make clearml-agents support both docker mode and venv mode at the same time? Ie. not requiring to be restarted to switch t...

clearml

2 years ago

0 Votes

2 Answers

949 Views

Hi Guys; Another Idea: Would Be Very Cool To Have A Mattermost Alert (Monitor Task), Just Like The One For Slack. Have A Nice Week-End All

Hi guys; another idea: would be very cool to have a mattermost alert (monitor task), just like the one for Slack. Have a nice week-end all 👋

clearml

3 years ago

0 Votes

10 Answers

886 Views

Hi, Just Want To Report A Small Bug In The Clearml Dashboard: After Queuing An Experiment, If I Change The Experiment Queue, Then Go Back To The Experiment Info Tab, The Queue Property Still Shows The Previous Queue

Hi, just want to report a small bug in the clearml dashboard: after queuing an experiment, if I change the experiment queue, then go back to the experiment I...

clearml

3 years ago

0 Votes

4 Answers

1K Views

The “Manage Queue” Option In The Right Tab On A Queued Experiment Is Broken In V1.0 (It Does Nothing)

The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)

clearml

3 years ago

0 Votes

1 Answers

1K Views

Hi There, Any Plan/Benefit To Support Virtualenv= 20 ?

Hi there, any plan/benefit to support virtualenv= 20 ?

clearml

4 years ago

0 Votes

2 Answers

946 Views

Hi, In The Aws Autoscaler, I Am Getting The Following Warning:

Hi, in the AWS AutoScaler, I am getting the following warning: Warning! exception occurred: APIError: code 400/1004: Worker is not registered: worker=aws:A10...

clearml

3 years ago

0 Votes

26 Answers

1K Views

Hi, I Would Like To Follow-Up In This

Hi, I would like to follow-up in this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...

aws mlops

2 years ago

0 Votes

6 Answers

1K Views

Hi There, Maybe This Was Already Asked But I Don'T Remember: Would It Be Possible To Have The Clearml-Agent Switch Between Docker Mode And Virtualenv Mode At Runtime, Depending On The Experiment

Hi there, maybe this was already asked but I don't remember: Would it be possible to have the clearml-agent switch between docker mode and virtualenv mode at...

clearml

one year ago

0 Votes

6 Answers

1K Views

Hi Again, Is There A Way To Pass Secrets As Parameters Of A Task? I Have An Experiment That Requires Connecting To A Database, And I Need To Be Able To Pass The Creds As Task Params (Or In Another Way, I Don'T Know Yet). But I Don'T Want To Expose My Cred

Hi again, is there a way to pass secrets as parameters of a task? I have an experiment that requires connecting to a database, and I need to be able to pass ...

clearml

3 years ago

0 Votes

11 Answers

1K Views

Hey, I Moved My Trains-Server To Another Machine, Zipping The /Opt/Trains/Data Folder As Described In The Docs

Hey, I moved my trains-server to another machine, zipping the /opt/trains/data folder as described in the docs https://allegro.ai/docs/deploying_trains/train...

mlops

4 years ago

0 Votes

9 Answers

1K Views

Another Strange Behavior Of The Python Sdk Cli: After Executing Python My_Task.Py, Where My_Task.Py Creates And Sends To The Queue An Experiment, The Command Returns But After Some Time Some Messages Are Printed In The Console, Such As

Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates and sends to the queue an experiment, the command ...

clearml

3 years ago

0 Votes

4 Answers

978 Views

Hey There, Is There A Way To Access The Trains Configuration Programmatically At Runtime In A Task (The Configuration That Is Dumped By The Agent In The Logs Before Executing A Task)

Hey there, is there a way to access the trains configuration programmatically at runtime in a task (the configuration that is dumped by the agent in the logs...

mlops

4 years ago

0 Votes

5 Answers

919 Views

Hey Again

Hey again 😁 I am migrating my trains-server to AWS and I would now like to have secure accounts (with password). But I don't want to lose the current users...

clearml

4 years ago

0 Votes

4 Answers

1K Views

Hello There! I Have A Question Regarding The Web Ui, On The Project Page: I Have The Following Use Case: I Need To Add Two Custom Columns, Each Reporting One Metric. Currently, This Shows Me The Best (Min/Max) Values Reached By The Model, But Not Necessar

Hello there! I have a question regarding the Web UI, on the project page: I have the following use case: I need to add two custom columns, each reporting one...

clearml

2 years ago

0 Votes

11 Answers

981 Views

Hi Guys, Following Up On This

Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...

clearml

4 years ago

0 Votes

2 Answers

1K Views

Hi, Is It Possible To Start A Clearml-Agent (Not In Docker Mode) On A Machine With A Gpu, But Enforce The Clearml-Agent To Not “See” The Gpu? So That The Experiments Run By This Agent Fail If They Try To Access A Gpu? Like The

Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...

mlops

3 years ago

0 Votes

5 Answers

1K Views

Hi Again, It Seems Like The Aws Autoscaler Is Not Spinning Instances With The Ebs Configuration I Configured. Here Is The Configuration:

Hi again, it seems like the aws autoscaler is not spinning instances with the EBS configuration I configured. Here is the configuration: resource_configurati...

aws mlops

3 years ago

0 Votes

5 Answers

971 Views

Hi, Is It Possible To Disable Some Of The System Metrics Monitored? And Also Downsample The Rate Of Logging?

Hi, is it possible to disable some of the system metrics monitored? and also downsample the rate of logging?

clearml

3 years ago

0 Votes

1 Answers

1K Views

Hi, I Have A Question About

Hi, I have a question about https://clear.ml/docs/latest/docs/references/sdk/logger#report_scatter3d : Would it be possible to pass a matplotlib figure in 3d...

clearml

2 years ago

0 Votes

5 Answers

1K Views

Does Trains 0.16 Support Pip >=20.2?

Does trains 0.16 support pip >=20.2?

clearml

4 years ago

0 Votes

2 Answers

961 Views

Hello, What Is The Default Limit For Global Context ?

Hello, what is the default limit for global context ? https://allegro.ai/docs/storage_manager_storagemanager.html#trains.storage.manager.StorageManager.get_l...

clearml

4 years ago

0 Votes

27 Answers

1K Views

Hi, similar to Task.set_offline(True), is there a way to simulate an execution in an agent? (for testing purposes)

clearml

2 years ago

0 Votes

5 Answers

969 Views

Hi, It Seems That The

Hi, It seems that the package_manager.pip_version has been removed from the https://allegro.ai/docs/references/trains_ref/#agent , although still being shown...

clearml

4 years ago

0 Votes

3 Answers

963 Views

Hi, I Have Several Long Running Experiments Failing With

Hi, I have several long running experiments failing with Process failed, exit code -9 and no other error with clearml 1.0.4 and clearml-agent 1.0.0, what cou...

mlops

3 years ago

0 Hi There, I Am Running A Clearml-Agent In Services Mode (With Docker) On A Machine With Two Disks: One With The Os (8GB, 91% Space Used) And One For The Data (100GB, 40% Space Used). When Executing The Auto-Scaler Task In This Agent, I Get The Following E

I was rather wondering why clearml was taking space while I configured it to use the /data volume. But as you described AgitatedDove14 it looks like an edge case, so I don’t mind 🙂

3 years ago

0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

Alright, I had a look in the /tmp/.trains_agent_daemon_outabcdef.txt logs, not many insights from here. For the moment, I simply started a new trains-agent daemon in services mode and I will wait to see what happens.

4 years ago

AgitatedDove14 Is it possible to shut down the server while an experiment is running? I would like to resize the volume and then restart it (should take ~10 mins)

3 years ago

0 Is There An Option To Make Trains-Agent Create Experiment Virtualenvs With

Just found yea, very cool! Thanks!

4 years ago

0 Hi, I Just Updated Clearml Server 1.0 Using

I just checked if something changed in https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_config.html#web-login-authentication

3 years ago

0 Hey Guys, I Am Trying To Plan What I Need To Do In Order To Efficiently Use Clearml With Spot Instances 1) Detecting When Spot Instance Is Down And Experiment Is Aborted 2) Extracting S3 Address Of The Latest Checkpoint From Clearml Api 3) Starting New E

we use task.models[] 🙂

3 years ago

Hi DilapidatedDucks58 , I did that already, but I am reusing the same experiment instead of merging two experiments. Step 4 can be seen as:
Update the experiment status to stopped (if it is failed, you won’t be able to re-enqueue it).
Set a parameter of that task to point to the latest checkpoint and load it (you can also infer it directly: I simply add a tag resume to the task and check at runtime whether this tag exists; if yes, I fetch the latest checkpoint of the task).
Use https://clea...

3 years ago
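The tag-based resume check described in the reply above could look roughly like the sketch below. It is only an illustration of the decision logic: `checkpoint_to_resume`, `tags` and `checkpoint_urls` are hypothetical names, and the plain lists stand in for whatever the task's tags and stored checkpoints would be in a real script (the URLs are made up).

```python
from typing import Optional, Sequence

RESUME_TAG = "resume"  # tag name taken from the reply above


def checkpoint_to_resume(
    tags: Sequence[str], checkpoint_urls: Sequence[str]
) -> Optional[str]:
    """Return the latest checkpoint URL if the task is tagged for resuming.

    Both arguments are plain stand-ins (assumption): `tags` for the tags
    set on the task, `checkpoint_urls` for its stored checkpoints,
    ordered from oldest to newest.
    """
    if RESUME_TAG not in tags:
        return None  # fresh run: nothing to load
    if not checkpoint_urls:
        return None  # tagged for resume, but no checkpoint stored yet
    return checkpoint_urls[-1]  # the newest checkpoint is last


if __name__ == "__main__":
    # Made-up URLs, for illustration only.
    urls = ["s3://bucket/ckpt_1.pt", "s3://bucket/ckpt_2.pt"]
    print(checkpoint_to_resume(["resume"], urls))
```

Keeping the decision in a small pure function makes it easy to check without a server; the calls that fetch the tags and checkpoints from the task would wrap around it.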

0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

So most likely trains was masking the original error, it might be worth investigating to help other users in the future

4 years ago

0 Hey Again

Very cool! "Run two trains-agent daemons, one per GPU on the same machine, with default Nvidia/CUDA Docker": this is close to my use case. I would just like to run these two daemons without docker, would that be possible? I should just remove the --docker nvidia/cuda param, right?

4 years ago

0 Hey Again

trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 1 --queue default &

4 years ago
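For more than two GPUs, the one-daemon-per-GPU command lines from the reply above could be generated rather than typed by hand. This is a minimal sketch: `daemon_commands` is a hypothetical helper, and it only builds the argument lists using the `--gpus` and `--queue` flags quoted in the thread; it does not start anything.

```python
from typing import List, Sequence


def daemon_commands(gpu_ids: Sequence[int], queue: str = "default") -> List[List[str]]:
    """Build one `trains-agent daemon` argument list per GPU id."""
    return [
        ["trains-agent", "daemon", "--gpus", str(gpu_id), "--queue", queue]
        for gpu_id in gpu_ids
    ]


if __name__ == "__main__":
    for cmd in daemon_commands([0, 1]):
        print(" ".join(cmd))
```

Each list could then be handed to `subprocess.Popen` to launch the daemons in the background, which mirrors the trailing `&` in the shell version.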

0 Hi, I Updated To Clearml-Server 1.4.0 And I Am Uncomfortable With The New Table/Detail View, Is There A Way To Disable It And Use The Previous One (On Click -> Open Details)?

I guess I’ll get used to it 😄

2 years ago

0 Hi, I Would Like To Follow-Up In This

SuccessfulKoala55 , This is not the exact corresponding request (I refreshed the tab since then), but the request is an events.get_task_logs , with the following content:

2 years ago

0 Hi, I Would Like To Follow-Up In This

Well actually I do see many errors like that in the browser console:

2 years ago

0 Hi, I Would Like To Follow-Up In This

As a quick fix, can you test with auto refresh (see top right button with the pause sign you have on the video)

That doesn’t work unfortunately

2 years ago

0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

I will probably just use an absolute path everywhere to be robust against different machine user accounts: /home/user/trains.conf

4 years ago

0 Hi, I Would Like To Follow-Up In This

So the new EventsIterator is responsible for the bug.
Is there a way for me to easily force the WebUI to always use the previous endpoint (v1.7)? I saw in the diff changes v1.1.0 > v1.2.0 that ES version was bumped to 7.16.2. I am using an external ES cluster, and its version is still 7.6.2. Can it be that the incompatibility comes from here? I’ll update the cluster to make sure it’s not the case

2 years ago

0 Hi, I Would Like To Follow-Up In This

Super! I’ll give it a try and keep you updated here, thanks a lot for your efforts 🙏

2 years ago

0 Hi, I Would Like To Follow-Up In This

Hi SuccessfulKoala55 , AgitatedDove14 ,
I updated to 1.4.0 (Web UI shows: WebApp: 1.5.0-186 • Server: 1.5.0-186 • API: 2.18 )
Unfortunately the bug is still there 😞
I don’t see errors in the console anymore though!

I had another look and modified an events.get_task_logs request with a very old timestamp to try to retrieve all logs; this returned only the few logs already displayed in the console. So I think the problem doesn’t come from the WebUI, but from the...

2 years ago

0 Hi, I Would Like To Follow-Up In This

I am happy if I can be of any help to fix that 😄

2 years ago

0 Hi, I Would Like To Follow-Up In This

Another error that just popped up:

2 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

correct, you could also use

Task.create

that creates a Task but does not do any automagic.

Yes, I didn't use it so far because I didn't know what to expect since the doc states:
"Create a new, non-reproducible Task (experiment). This is called a sub-task."

4 years ago

0 Hey, What Is The Exact Difference Between

I tested by installing flask in the default env -> which was installed in the ~/.local/lib/python3.6/site-packages folder. Then I created a venv with flag --system-site-packages . I activated the venv and flask was indeed available

4 years ago
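The `--system-site-packages` test from the reply above can be reproduced from Python as well. The sketch below swaps in the standard-library `venv` module (an assumption: the thread does not say which tool created the environment) and reads back the `pyvenv.cfg` file to confirm the setting; `create_venv_with_system_packages` is a hypothetical helper name.

```python
import tempfile
import venv
from pathlib import Path
from typing import Dict


def create_venv_with_system_packages(env_dir: Path) -> Dict[str, str]:
    """Create a venv that can see system site-packages; return its pyvenv.cfg settings."""
    # with_pip=False keeps creation fast; pip is not needed for this check.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(env_dir)

    settings: Dict[str, str] = {}
    for line in (env_dir / "pyvenv.cfg").read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not "key = value"
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_venv_with_system_packages(Path(tmp) / "env")
        print(cfg["include-system-site-packages"])
```

With this setting enabled, packages installed outside the environment (such as the flask install mentioned above) remain importable once the venv is activated.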

0 Hi There, I Used

So I guess the problem is that the following snippet:
from clearml import Task; Task.init()
should be added before the if __name__ == "__main__": ?

2 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

I still don't see why you would change the type of the cloned Task, I'm assuming the original Task had the correct type, no?

Because it is easier for me to create a training task out of the controller task by cloning it (so that parameters are prefilled and I can set the parent task id)

4 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Thanks for the hack! The use case is the following: I have a controller that creates training/validation/testing tasks by cloning (so that the parent task id is properly set to the controller). Otherwise I could simply create these tasks with Task.init, but then I would need to manually set the parent task for each one of these tasks, probably with a similar hack, right?

4 years ago

0 Hi, I Have A Question Regarding The Aws_Autoscaler: It Usually Takes ~Hours To Get A Gpu Instance Nowadays. I Was Thinking, It Would Be Much More Interesting To Stop The Instances (Clearml-Agents) Instead Of Terminating Them Once They Are Inactive, So Tha

No I agree, it’s probably not worth it

2 years ago

0 Are The Env Variables Passed To Trains-Agent Available In Experiments Run By This Trains-Agent?

Awesome, thanks!

4 years ago

0 Are The Various Task Types Available In 0.15? I Am Getting

AgitatedDove14 Is it fixed with trains-server 0.15.1?

4 years ago

0 Are The Various Task Types Available In 0.15? I Am Getting

awesome, thank you 👍