PricklyRaven28

19 Questions, 110 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

108 × Eureka!

0 Votes

0 Answers

2K Views

Hey 🙂 I’m also using the AWS autoscaler with pipelines, and I can’t seem to make the intermediate steps output their results to S3 instead of the fileserver. Is this possible?

Hey 🙂 I’m also using the AWS autoscaler with pipelines, and i can’t seem to make the intermediate steps output their results to S3 instead of the fileserver...

mlops

3 years ago

0 Votes

2 Answers

953 Views

Hi 🙂 I have a git folder that i'm running with a local agent in services mode (for slack alerts), but i don't want the agent to clone the repo (for security...

mlops

one year ago

0 Votes

6 Answers

2K Views

Hey

Hey 🙂 Working with ClearML pipelines, is there a way to change default directory artifacts to archive with tar instead of zip? Speaking of the default retu...

clearml

2 years ago

0 Votes

1 Answers

1K Views

Hi 🙂 How can I disable requirements installation when an agent starts a task? We use Docker, so everything should already be installed.

mlops

one year ago

0 Votes

1 Answers

2K Views

Also it’s weird to me that when running pipelines in

Also, it’s weird to me that when running pipelines in debugging_pipeline, they even show up in the UI

clearml

2 years ago

0 Votes

7 Answers

2K Views

I think there is some bug with clearml==1.7.1. I’m working with pipelines and after updating to

i think there is some bug with clearml==1.7.1. I’m working with pipelines and after updating to 1.7.1 all the kwargs are passed as None to the steps, when do...

clearml

3 years ago

0 Votes

18 Answers

2K Views

https://clearml.slack.com/archives/CTK20V944/p1713357955958089

https://clearml.slack.com/archives/CTK20V944/p1713357955958089 Any idea about this?

clearml

one year ago

0 Votes

12 Answers

2K Views

Hey, is there a way to temporarily turn off clearml logging? I’m using pipeline and when developing I don’t want them to be added to the UI and spam it. In wandb there is a way to do

Hey, is there a way to temporarily turn off clearml logging? I’m using pipeline and when developing I don’t want them to be added to the UI and spam it. in w...

clearml

2 years ago

0 Votes

2 Answers

752 Views

Hey, I'm working with the clearml AWS autoscaler (for quite some time) and suddenly we encountered an issue with scaling GPU machines that torch inside the task doesn't recognize the GPU sporadically. If we restart the task it works just fine... I have a

Hey, i'm working with the clearml AWS autoscaler (for quite some time) and suddenly we encountered an issue with scaling GPU machines that torch inside the t...

mlops

6 months ago

0 Votes

0 Answers

2K Views

(using a local clearml server)

(using a local clearml server)

clearml

3 years ago

0 Votes

0 Answers

2K Views

Hi 🙂 we have a clearml pipeline that has a step that runs a multi gpu training (with Hugging Face), we need to invoke it with `accelerate launch` so we use `subprocess.run` inside the step but it hangs when finished. Is this the righ

Hi 🙂 we have a clearml pipeline that has a step that runs a multi gpu training (with Hugging Face), we need to invoke it with accelerate launch so we use subp...

clearml

one year ago

0 Votes

4 Answers

1K Views

Hi 🙂 I have an AWS autoscaler (from script) running with EC2 instances, is there a way to configure a queue for a multi-gpu instance where each task gets 1 ...

mlops

one year ago

0 Votes

30 Answers

2K Views

Hey everyone

Hey everyone 🙂 I’m trying to use a ClearML on prem for experiment visualization only, having some issues with multi GPU. It seems that clearml is creating a...

clearml

3 years ago

0 Votes

30 Answers

2K Views

If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

clearml

3 years ago

0 Votes

8 Answers

2K Views

There is a problem starting from clearml 1.7.0 with python-fire

There is a problem starting from clearml 1.7.0 with python-fire from clearml import Task import fire def check(first): print(first) if __name__ == '__main__'...

clearml

3 years ago

0 Votes

62 Answers

172K Views

Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

Hey, We are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1....

clearml

2 years ago

0 Votes

2 Answers

2K Views

Hey

Hey 🙂 I'm using pipelines with decorators and trying to set a custom docker image to the pipeline itself which doesn't seem to accept a "docker=" argument l...

clearml

2 years ago

0 Votes

4 Answers

875 Views

Hi 🙂 I'm working on slack alerts based on the open source example, we added slack mentions like this def get_username_tag(self, task:Task): res = Task._get_...

clearml

one year ago

0 Votes

2 Answers

2K Views

Hi 🙂 I'm trying to figure out if i have a way to report pipeline-step artifact paths in the main pipeline task. (So i don't need to dig into steps to find t...

clearml

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

Looks like the first issue has been solved 🙂

I think the second one still persists, still checking

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

sounds good 🙂 I’ll soon check if this fixes our issue and update you

2 years ago

0 https://clearml.slack.com/archives/CTK20V944/p1713357955958089

We tried both subprocess.run and popen
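
For reference, launching a command from a step and bounding the wait can be sketched with the standard library alone. This is a minimal sketch and not a fix for the hang itself: `run_and_wait` is a hypothetical helper, and the demo runs a trivial Python child process rather than the real `accelerate launch` command.

```python
import subprocess
import sys


def run_and_wait(cmd, timeout_sec):
    """Run cmd in a child process and wait at most timeout_sec seconds.

    Returns (returncode, combined stdout/stderr text). Raises RuntimeError
    if the child has not exited in time, so a hang surfaces as an error.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        proc.kill()  # kills the direct child only, not its own children
        output, _ = proc.communicate()
        raise RuntimeError(
            f"command did not exit within {timeout_sec}s; output so far:\n{output}"
        )
    return proc.returncode, output


if __name__ == "__main__":
    # Stand-in for something like ["accelerate", "launch", "train.py"],
    # where train.py is a hypothetical script name.
    code, out = run_and_wait([sys.executable, "-c", "print('done')"], timeout_sec=30)
    print(code, out.strip())
```

With a timeout in place, a launcher that never returns shows up as a `RuntimeError` carrying the captured output instead of a step that silently stays in "running".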

one year ago

0 There is a problem starting from clearml 1.7.0 with python-fire

looks like it’s working 🙂 tnx

3 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

I tried to work on a reproducible script but then i get errors that my clearml task is already initialized (also doesn’t happen on 1.7.2)

2 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

Yes it worked 🙂
I loaded my entire clearml.conf in the “extra conf” part of the auto scaler, that worked

3 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

regarding what AgitatedDove14 suggested, i’ll try tomorrow and update

3 years ago

0 Hey everyone

but that means that there is no way to work with clearml+fastai2+multi gpu

3 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

that makes more sense 🙂
would this work now as a workaround until the version is released?

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

i believe this is because of this code
None

Which initializes the task if clearml is installed… but if a task already exists (because of the pipeline), it will replace it
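
The conflict described here (an integration initializing a task when one already exists) is usually avoided with a "reuse if present" guard. Below is a minimal sketch of that guard, assuming ClearML's `Task.current_task()` and `Task.init()` class methods. `get_or_create_task` is a hypothetical helper and `FakeTask` is a stand-in, so the snippet runs without ClearML installed.

```python
def get_or_create_task(task_cls, project_name, task_name):
    """Return the task already running in this process, or start a new one.

    task_cls is expected to behave like clearml.Task: current_task() returns
    the active task or None, and init(...) creates and returns a task.
    """
    existing = task_cls.current_task()
    if existing is not None:
        # A task (e.g. the pipeline step's) already exists: reuse it.
        return existing
    return task_cls.init(project_name=project_name, task_name=task_name)


class FakeTask:
    """Tiny stand-in for clearml.Task so the sketch runs without ClearML."""

    _current = None

    def __init__(self, name):
        self.name = name

    @classmethod
    def current_task(cls):
        return cls._current

    @classmethod
    def init(cls, project_name, task_name):
        cls._current = cls(f"{project_name}/{task_name}")
        return cls._current


if __name__ == "__main__":
    first = get_or_create_task(FakeTask, "demo", "step")
    second = get_or_create_task(FakeTask, "demo", "callback")
    print(first is second)
```

With ClearML installed, the call would presumably be `get_or_create_task(Task, ...)` after `from clearml import Task`. Whether the integration in question can be made to use such a guard is not something this thread confirms.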

2 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

Artifacts, nothing is reaching s3

3 years ago

0 Hi

@<1523701070390366208:profile|CostlyOstrich36>

Sorry for the (very) late response.

We use the open source version which isn't part of the ClearML setup.

Anyway, we are using a standalone script, but we have it source-controlled in git... clearml picks this up and tries to clone the entire repo in the agent. I want to prevent this and just use the script.

11 months ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

I'm getting really weird behavior now, the task seems to report correctly with the patch... but the step doesn't say "uploading" when finished... there is a "return" artifact but it doesn't exist on S3 (our file server configuration)

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

Yes, and the old version only works without the patch.
I see the model on the artifacts tab, but it's not actually uploaded.

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

Nothing that i think is relevant, I'm using latest from master. It might be a new bug on their side, wasn't sure.

2 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

Tried your suggestion, it still went to the file server…

3 years ago

0 Hey

Ok, tnx (:
We just see that tarring and untarring is much faster than zip for big models
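
As an illustration of the tar-versus-zip point, here is a minimal sketch that packs a directory both ways using only the Python standard library. It is not ClearML's own packaging code. The helper names are hypothetical, and the tar variant is left uncompressed on the assumption that skipping compression is where the time is saved.

```python
import tarfile
import tempfile
import zipfile
from pathlib import Path


def pack_dir_tar(src_dir, out_path):
    """Pack src_dir into an uncompressed tar archive; return out_path."""
    with tarfile.open(out_path, "w") as tar:
        # arcname="." stores member paths relative to src_dir
        tar.add(src_dir, arcname=".")
    return out_path


def pack_dir_zip(src_dir, out_path):
    """Pack src_dir into a deflate-compressed zip archive; return out_path."""
    src_dir = Path(src_dir)
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=str(path.relative_to(src_dir)))
    return out_path


def unpack_tar(archive_path, dest_dir):
    """Extract a tar archive created by pack_dir_tar into dest_dir."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(dest_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "model"
        model_dir.mkdir()
        (model_dir / "weights.bin").write_bytes(b"\x00" * 1024)
        archive = pack_dir_tar(model_dir, Path(tmp) / "model.tar")
        unpack_tar(archive, Path(tmp) / "restored")
        print((Path(tmp) / "restored" / "weights.bin").stat().st_size)
```

A step could return or upload the resulting `.tar` file as a single-file artifact instead of a directory. Timing both helpers on a real model directory would be needed to confirm the speed difference reported above.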

2 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

which part?

3 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

I am currently on vacation, I'll ask my team mates. But if not I'll get to it next week

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

tnx! keep me posted

2 years ago

0 https://clearml.slack.com/archives/CTK20V944/p1713357955958089

@<1523701205467926528:profile|AgitatedDove14>
Only got some time to work on it now, i created a small reproducible example.
I also tried to use your suggestion with import accelerate, it also had issues.

overall, when using debug_pipeline it works ok, but both methods don't work without it, i think it has something to do with wrapping accelerate.

Problem with launching through python module (your suggestion), the argparse breaks.
Problem with launching using a new process - rank0 proce...

one year ago

0 https://clearml.slack.com/archives/CTK20V944/p1713357955958089

How does this work in the context of a pipeline? One of the steps is a multi gpu training that requires accelerate.

one year ago

0 Hey everyone

you can get updates on the issue i opened
https://github.com/fastai/fastai/issues/3543

but I think the better solution would probably be to create a custom ClearML callback for fastai with the best practices you think are needed…

Or try to fix the TensorBoardCallback, because for now we can’t use multi gpu because of it 😪

3 years ago

0 There is a problem starting from clearml 1.7.0 with python-fire

i didn’t, prefer not to add temporary workarounds

3 years ago

0 Hey

Hi, yes it's running with autoscaler so it's for sure in docker mode

Are you saying that it should've worked? I got 'docker' attribute doesn't exist error. Maybe it's the version of the clearml server?

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

SmugDolphin23 SuccessfulKoala55 ^

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

@<1523701435869433856:profile|SmugDolphin23>
Hey 🙂
Any update?

We are having more issues with transformers and clearml in their new version.
The step that has transformers 4.25.1 isn’t able to upload artifacts.
If we downgrade transformers==4.21.3 it works

2 years ago

0 If possible, I would like to avoid the fileserver altogether and write everything to S3 (without needing every user to change their config)

when i did this with a normal task it worked wonderfully, with pipeline it didn’t

3 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

This is the next step not being able to find the output of the last step

ValueError: Could not retrieve a local copy of artifact return_object, failed downloading

2 years ago

0 Hey, we are using clearml 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (only works in 1.7.2 all 1.8.x don’t work):

@<1523701118159294464:profile|ExasperatedCrab78> Sorry only saw this now,
Thanks for checking it!
Glad to see you found the issue, hope you find a way to fix the second one. For now we will continue using the previous version.
Would be glad if you can post when everything is fixed so we can advance our version.

2 years ago
