I’m also not exactly an expert here, but it must be Ceph if it’s possible to be so
Probably so, but not sure :( I’ll have to figure it out with our DevOps engineer
Finally solved it. It turned out to be an authentication issue: I had to use ACCESS_KEY/SECRET values different from those I used with the boto3 client
So it’s Ceph (RADOS) Object Gateway in my case
Hi, Eugen!
Thanks for the reference, I'll check it out
More precisely, I'm using LLaMA Factory and I'm running its train scripts as-is, like python train.py ..., without editing them. Therefore I can't create a ClearML Task inside that process to record the experiment to. Of course I can manually add all the parameters, metrics and artifacts afterwards, but ideally I'd like to have real-time logs of my LLaMA-Factory experiment in ClearML. The package has integrations wit...
Has anyone done something similar? How did you manage to stream real-time data about the experiment to ClearML?
Thank you! I'll try it out and let you know the result
suggest overwriting them locally?
Yeah, that might be an option but it doesn't have enough flexibility for all my scenarios. E.g. I might need to have different N-numbers for the local and remote (ClearML) storage.
It’s a self-hosted one. Its address is s3.kontur.host, port 443
clearml 1.3.2
boto3==1.22.7
botocore==1.25.7
I didn’t deploy the server myself but I verified that it works with s3cmd
I assume you have actual values for key and secret in:
That’s right, I use the same values which work for that bucket with s3cmd
Do you mean like an example for minio?
Yeah, but with the output_uri in task initialisation as well. Am I right that in that case it would be like this? output_uri='s3://my-minio-host:9000/bucket_name'
Tried it, but the outcome is still the same: artifacts deleted using the task._delete_artifacts() function reappear on subsequent calls to task.upload_artifact() with new artifacts
the secure flag is false
I played with this setting as well - didn’t make it work
Are you saying we should expose raise_on_errors in the _delete_artifacts() function itself?
That'd be a great solution, thanks! I'll create a PR shortly
I was just wondering if there’s some valid example of a clearml.conf containing the correct on-premises s3 settings so that I could use them as a basis?
BTW, is it correct to set the files_server in the api section? files_server: "s3://s3.kontur.host:443/srs-clearml"
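For anyone who lands here later, this is the shape of clearml.conf I converged on for an on-premises S3-compatible store. The host, bucket and credential values are placeholders for my setup, and the key layout is what I understand the sdk.aws.s3 section to expect — verify it against the docs for your ClearML version:

```
api {
    # route artifacts / debug samples through the S3 bucket instead of the default fileserver
    files_server: "s3://s3.kontur.host:443/srs-clearml"
}
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # for non-AWS endpoints the host must include the port
                    host: "s3.kontur.host:443"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    multipart: false
                    secure: true  # HTTPS, since we are on port 443
                }
            ]
        }
    }
}
```

Note that these are the RGW-side credentials, not necessarily the same ones you use elsewhere (see my earlier message about the boto3 keys not working here).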
Yeah, it holds. I just sent an extract from the config for it to be concise. Here’s the full version
If I set it to False I get another error:
Failed creating storage object Reason: Missing key and secret for S3 storage access ( )
Did that and still have the same error:
Failed creating storage object Reason: Missing key and secret for S3 storage access ( )
Hi, Erez!
Thank you for the example, I checked it out. It really creates two models. But the thing is, these two models have different file names here. In my scenario, however, it's more convenient for me to have the same file name and different directories for the models. In this case, all my models get overwritten by the latest logged one (as in my screenshot above).
Fortunately, if I use upload_artifact() instead (which I eventually go with) I manage to achieve what I want (see the s...
SweetBadger76 Could you please verify that this is what you meant? I'm still not sure whether I'm doing something wrong or everything works as intended and ClearML distinguishes models only by file name.
Unfortunately, the other parameters like tags and comment didn't help to separate the models
Hi, Erez!
Thank you for your answer! I'll see if it solves the problem
Thank you, but although I'm already using the name parameter mentioned in your response in my code, I can see only one model on the task's page
filename = './models/v1/model.ckpt'
torch.save(state_dict, filename)
mv1 = OutputModel(name='model_v1', task=task)
mv1.update_weights(filename, upload_uri=my_uri)
update_model(mynn.multiplier)
state_dict = mynn.state_dict()
filename = './models/v2/model.ckpt'
torch.save(state_dict, filename)
mv2 = OutputModel(name='model_v2', task=task)
mv2.update_weights(filename, upload_uri=my_uri)
Well. what’s for sure is that I have the required permissions to write to the bucket, as I manage to upload files into it through s3cmd
and boto3
Hi, Jake!
Thanks for your response! I just managed to solve the problem by running my train CLI command in a subprocess and creating a thread that captures the stdout of this subprocess and sends it to a ClearML Task. The solution doesn't even seem as ugly as I was afraid it would be 😀
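In case it helps anyone, here is a minimal sketch of that pattern. The names are placeholders: in my real code the report_line callback is something like task.get_logger().report_text on an existing ClearML Task, which I've left out here so the sketch runs standalone.

```python
import subprocess
import sys
import threading


def run_and_capture(cmd, report_line):
    """Run `cmd` as a subprocess, forwarding each stdout line to `report_line`.

    `report_line` is any callable taking a string; with a real ClearML Task
    you would pass something like task.get_logger().report_text instead.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        text=True,
    )

    def pump():
        # iterating over the pipe yields lines as the training script prints them
        for line in proc.stdout:
            report_line(line.rstrip("\n"))

    t = threading.Thread(target=pump, daemon=True)
    t.start()
    proc.wait()
    t.join()
    return proc.returncode
```

Running the training command through run_and_capture([sys.executable, "train.py", ...], report_line) streams the script's output into the experiment console in near real time, without touching the train script itself.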
With this variant of clearml.conf I’m now getting a new error:
ERROR - Exception encountered while uploading Failed uploading object s3.kontur.host:443/srs-clearml/SpeechLab/ASR/data_logging/test1.1be56a53647646208ffd665908056d49/artifacts/data/valset_2021_02_01_sb_manifest_true_micro.json (405): <?xml version="1.0" encoding="UTF-8"?><Error><Code>MethodNotAllowed</Code><RequestId>tx00000000000000000fc69-0062781afb-eba8e9-default</RequestId><HostId>eba8e9-default-default</HostId></Error>