CostlyOstrich36

0 Questions, 4175 Answers

Active since 10 January 2023

Last activity 2 years ago

Reputation

Answers 4175

0 Hi, I'm Using

Hi @<1523701601770934272:profile|GiganticMole91> , how is the task being stopped in your case? Is it aborted via the web UI or through some other method? Is the task running via the agent?

12 months ago

0 Hello Everyone! I’m Trying To Setup Non Aws S3 To Store Datasets And Get An Error:

Non-AWS S3-like services (e.g. MinIO):

:port/bucket

one year ago

0 Hi Everyone, I'm Setting Up Aws Autoscaler. Is It Possible To Configure

Hi @<1792726992181792768:profile|CloudyWalrus66> , from a quick read of the docs, it seems to be simply a way to spin up many machines with many different configurations in very few actions.

The autoscaler automatically spins regular EC2 instances and spot instances up and down according to predetermined templates, basically making the fleet 'feature' redundant.

Or am I missing something?

7 months ago

0 Hi All, I'm Trying To Set Up Aws Autoscaler To Spin Up Ec2 Instances From Predefined Ami So I Was Able To Set Up The Autoscaler, But I Am Experiencing Some Issues With Spinning Up The Ec2 Instance. Seems Like It Keeps Failing (Spinning Up An Instance, The

Python 2 is no longer supported. I'd suggest finding an AMI that already has Python 3 built in (or installing it via the init script, though I wouldn't recommend that) and that is also CUDA-enabled, so you avoid the extra installation needed to support CUDA images.

7 months ago

0 Hi Everyone. I Am Trying To Migrate To Clearml And Need To Have My Old Training Logs Available In Clearml As Well. Unfortunately It Seems I Can't Simply "Import" My Old Tensorboard Logs Into Clearml To Have It All In One Place. Does Any One Have A Suggest

Hi @<1670964680270548992:profile|SuperiorOctopus47> , you can manually create experiments and log metrics into them via the REST API.

You basically have some older runs on your tensorboard that you want to import to ClearML?

one year ago

0 Hi, I'm Trying To Use Clearml On Pytorch-Lightning With Multiple Gpus, But It Seems As If The Server Does Not Monitor The Experiment. I Can See No Progress In The Console (Steps Counter Stays On 0) Nor Any Tensorboard Loggings. On A Single Gpu Everything

Hi FancyTurkey50 , how did you run the agent command?

3 years ago

0 Hi Everyone, I'm Setting Up Aws Autoscaler. Is It Possible To Configure

I suggest watching the videos on the Agent and the Autoscaler to get a better understanding.

Also, please review the agent docs.

when a task is enqueued when does the autoscaler kicks in?

You're looking for the polling interval parameter as mentioned in the documentation - [None](https://clear.ml/docs/latest/docs/webapp/appl...

7 months ago

0 Hi Everyone. I’m Struggling To Setup Minio Storage. Below Is What I’m Adding In My Credentials And When I Try To Create A New Dataset Using Below Command; I Get Errors: Configs:

Try running the following script

from clearml import Task
import time

# Replace the placeholder with your storage URI
task = Task.init(output_uri="<your storage URI>")

print("start sleep")
time.sleep(20)
print("end sleep")

Then please share the logs.

one year ago

0 Discovered An Issue With Clearml-Session Where We Have The Agents Running Within A Tailscale Network. When The Clearml Session Is Local On The Same Physical Network, Connections Work Fine. But When We Are On The Virtual Network, They Don't Work Fine

Hi @<1535069219354316800:profile|PerplexedRaccoon19> can you please elaborate on the issue?

one year ago

Also, can you provide the configuration of the autoscaler? You can export it through the web UI; just make sure to scrub any credentials.

7 months ago

0 Is There A Gcp Driver Similar To

Hi @<1523701083040387072:profile|UnevenDolphin73> , not in the open source version.

one year ago

0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

I'm not sure, will check 🙂

3 years ago

0 Hey Community, So I Am Facing An Issue Related To Passing Parameters. I Am Getting The Parameters From The Taskscheduler , Taskparameter.

How did you add the parameters to the pipeline? Did you refer to this example?

2 years ago

0 Hi

Hi @<1736194540286513152:profile|DeliciousSeaturtle82> , basically all the data is stored in /opt/clearml/data. As long as you migrate that to the input of the k8s deployment, you should be good.

one year ago

0 Hi Everyone! When I Execute

In general I would suggest running in docker mode 🙂

2 years ago

0 Hello, I Look Up To The Clearml Data Documentation, The Dataset.Add_External_Files Method Seems Like To Perform A Download From The Source Url. What If The Case I Already Have My Dataset In The Output_Uri That I Specify In The Dataset.Create? Btw I Use A

Hi @<1784754456546512896:profile|ConfusedSealion46> , in that case you can simply call add_external_files on the files that are already in your storage. Or am I missing something?
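
A minimal sketch of that idea, assuming `dataset` is a ClearML `Dataset` object created earlier (for example with `Dataset.create(...)`) and that the URLs point at files already sitting in your storage. The helper name `register_existing_files`, the project and dataset names, and the example URL are hypothetical; only `add_external_files(source_url=...)` is the SDK call being illustrated.

```python
from typing import Iterable, List


def register_existing_files(dataset, urls: Iterable[str]) -> List[str]:
    """Register already-stored files on a dataset as external files.

    `dataset` is expected to expose add_external_files(source_url=...),
    as clearml.Dataset does. Finalizing is left to the caller.
    """
    registered = []
    for url in urls:
        # Register the existing file by URL on the dataset.
        dataset.add_external_files(source_url=url)
        registered.append(url)
    return registered


# Hypothetical usage (needs a configured ClearML environment):
# from clearml import Dataset
# ds = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
# register_existing_files(ds, ["s3://my-bucket/data/train.csv"])
# ds.upload()
# ds.finalize()
```

Passing the dataset in as an argument keeps the helper independent of how the dataset was created.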

8 months ago

0 Is There A Way Clearml Can Be Stopped From Updating Dependencies When Cloning?

You can pin specific package versions yourself in code:
https://clear.ml/docs/latest/docs/references/sdk/task#taskadd_requirements
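
A small sketch of how that might look, assuming the pins are registered before `Task.init()` is called. The helper `apply_pins`, its `task_cls` parameter, and the package names and versions are hypothetical and exist only for illustration; the underlying call is `Task.add_requirements(package_name, package_version)`.

```python
from typing import Dict, Optional


def apply_pins(pins: Dict[str, Optional[str]], task_cls=None) -> None:
    """Register each package (and optional version) via add_requirements.

    A version of None asks ClearML to use whatever version is installed.
    """
    if task_cls is None:
        # Imported lazily so the helper can be exercised without ClearML installed.
        from clearml import Task as task_cls
    for name, version in pins.items():
        task_cls.add_requirements(name, version)


# Hypothetical usage (needs a configured ClearML environment):
# from clearml import Task
# apply_pins({"numpy": "1.26.4", "soundfile": None})
# task = Task.init(project_name="my_project", task_name="pinned run")
```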

2 years ago

0 I Am Using Clearml Pro And Pretty Regularly I Will Restart An Experiment And Nothing Will Get Logged To Clearml. It Shows The Experiment Running (For Days) And It's Running Fine On The Pc But No Scalers Or Debug Samples Are Shown. How Do We Troubleshoot T

@<1719524641879363584:profile|ThankfulClams64> , are logs showing up without issue on the 'problematic' machine?

one year ago

0 Hi All Here

Hi @<1523701523954012160:profile|ShallowCormorant89> , I think you can simply spin down all the containers and copy everything in /opt/clearml/
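
The copy step could be sketched roughly as follows, assuming the containers are already stopped and the script has read access to the source tree. The function name `copy_tree` and the destination path are hypothetical; file ownership is not preserved by this approach, so treat it as a starting point rather than a full backup procedure.

```python
import shutil
from pathlib import Path


def copy_tree(src: str, dst: str) -> int:
    """Copy everything under src into dst and return the file count found in dst."""
    src_path, dst_path = Path(src), Path(dst)
    # symlinks=True copies symlinks as links instead of following them;
    # dirs_exist_ok=True (Python 3.8+) allows copying into an existing folder.
    shutil.copytree(src_path, dst_path, symlinks=True, dirs_exist_ok=True)
    return sum(1 for p in dst_path.rglob("*") if p.is_file())


# Hypothetical usage, after spinning down the containers:
# copy_tree("/opt/clearml", "/mnt/backup/clearml")
```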

2 years ago

0 Hello! In My Code I Use A Package That Writes Into Wav Files, Named Soundfile (Import Soundfile As Sf). On 'Conda List' There Are - Soundfile 0.10.3.Post1 Pysoundfile 0.10.3.Post1 Pyhd3Deb0D_0 Conda-Forge Libsndfile

Hi @<1571308003204796416:profile|HollowPeacock58> , do you have a standalone code snippet that reproduces this behavior?

2 years ago

0 I Have The User/Pass Set Up On The

wait.

9 months ago

0 Once I've Created A Dataset And Finalized/Published It, I Can't Figure Out How To Associate It With A Task Such That It Shows Up In The Task Ui. Dataset.Get() Finds A Dataset, But It Doesn't Show Up In The Task Ui. I Can Get Inputmodels And Outputmodels T

Hi @<1836213542399774720:profile|ConvincingDragonfly85> , I believe you're looking for the alias parameter of Dataset.get().
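
A minimal sketch of passing `alias`, assuming a task has already been started with `Task.init()` in the same process so there is a task for the dataset to be recorded against. The wrapper `get_dataset_with_alias`, its `dataset_cls` parameter, and the project, dataset, and alias names are hypothetical.

```python
def get_dataset_with_alias(project: str, name: str, alias: str, dataset_cls=None):
    """Fetch a dataset by project and name, passing an alias to Dataset.get()."""
    if dataset_cls is None:
        # Imported lazily so the wrapper can be exercised without ClearML installed.
        from clearml import Dataset as dataset_cls
    return dataset_cls.get(dataset_project=project, dataset_name=name, alias=alias)


# Hypothetical usage (needs a configured ClearML environment):
# from clearml import Task
# task = Task.init(project_name="my_project", task_name="train")
# ds = get_dataset_with_alias("my_project", "my_dataset", alias="training_data")
# local_path = ds.get_local_copy()
```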

3 months ago

0 Hi There, Maybe This Was Already Asked But I Don't Remember: Would It Be Possible To Have The Clearml-Agent Switch Between Docker Mode And Virtualenv Mode At Runtime, Depending On The Experiment

I guess that's a good point, but it's really only applicable if your training is CPU intensive. If your training is GPU intensive, most of the load goes to the GPU, so running on a VM (EC2 instances, for example) shouldn't make much of a difference, but this is worth testing.

I found this article talking about performance
https://blog.equinix.com/blog/2022/01/04/3-reasons-why-you-should-consider-running-containers-on-bare-metal/

But it doesn't really say what the difference in performance is...

2 years ago

0 Hi, I Run 'Manually' On My Local Machine With No Errors. Then, I Clone The Completed Task And Enqueue It. I Get To Stage When 'Environment Setup Completed Successfully'. But Right After I Get An Error Related To 'Connect' Method - Task.Connect(Config.Mode

Hi @<1571308003204796416:profile|HollowPeacock58> , do you have a self contained code snippet that reproduces this?

2 years ago

0 Hello All, Another Question I Have: In My Pipeline, My Last Step Is Skipped Instead Of Running. Why? How Can I Unskip It? Just To Be Clear, The Parent Steps Succeed.

Hi @<1533619716533260288:profile|SmallPigeon24> , can you provide a snippet that reproduces this? Do you have some more information? What do you mean by 'skipped'?

2 years ago

0 Is It Possible To Move Artifacts From Local Storage To S3? Or Do I Have To Delete The Old One And Create A New One With A Location In S3?

You would also need to somehow edit the links that are connected to the task.

3 years ago

0 Hi All, I'M Running Dl Experiments On Top Of Mmdetection. The Experiments Are Deployed Remotely On A Dedicated Ec2 Instance Through

ResponsiveHedgehong88 , you can try mapping the /tmp/ folder inside the Docker container to a folder on the host for later inspection, so the data isn't lost. This could give us a better idea of what's happening.

3 years ago

0 Hello Everyone, I Am Trying To Deploy A Model In Vllm Model Deployment, I Am Using Tinyllama/Tinyllama-1.1B-Chat-V1.0, It Is Already An Hour It Started Deploying, Still It Is Loading, Will It Take More Time? Or Do I Need To Add Something To The Configurat

Can you provide the full log?

5 months ago

Hi @<1813745484821434368:profile|SuccessfulPigeon84> , what do you see in the log?

5 months ago

0 I Upgraded Clearml To 1.15.0 From 1.0 The Ui Says It’s Running

Hi @<1691258549901987840:profile|PoisedDove36> , did you do all the DB migrations during the upgrade, or did you go straight to 1.5 from 1.0?

one year ago
