
Oh, I wasn't aware of that. I don't think it'd work for this use case, though. We're trying to test the behavior you can see in this extension (https://share.descript.com/view/g0SLQTN6kAk), so basically the examples I mentioned in that earlier message.
Haha, that was a total gotcha for me. Yeah, a lot just wasn't even getting run due to the #!/bin/bash part.
Anyway, wow! I finally got the precious console logs you were hoping to find. Here they are:
2023-05-06 00:19:21 User aborted: stopping task (3)
2023-05-06 00:19:21 Successfully installed PyYAML-6.0 attrs-22.2.0 certifi-2022.12.7 charset-normalizer-3.1.0 clearml-agent-1.5.2 distlib-0.3.6 filelock-3.12.0 furl-2.1.3 idna-3.4 jsonschema-4.17.3 orderedmultidict-1.0.1 pathlib2-2.3.7....
To do this, I think I need to know:
- Can you trigger a pre-existing Pipeline via the ClearML REST API? I'd want to have a Lambda function trigger the Pipeline for a batch without needing to have all the Pipeline code in the Lambda function (rough sketch after this list). Something like
curl -u '<clearml credentials>' ...
- [probably a big ask] If the pipeline succeeds/fails, can ClearML emit an event that I can react to? Like mayb...
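Rough sketch of what I mean by the first bullet, using the clearml Python SDK inside the Lambda handler instead of raw REST (the pipeline task ID and names are placeholders; credentials would come from the CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY env vars):
```
from clearml import Task

# Placeholder: ID of a pipeline controller task that has already run once.
PIPELINE_TASK_ID = "abc123"

# Clone the controller and enqueue the clone; pipeline controllers
# normally execute on the services queue.
template = Task.get_task(task_id=PIPELINE_TASK_ID)
new_run = Task.clone(source_task=template, name="lambda-triggered run")
Task.enqueue(new_run, queue_name="services")
```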
When you run the docker-compose.yml on an EC2 instance, you can configure user login for the ClearML webserver. But the files API is still open to the world, right? (And same with the backend?)
We could solve this by placing the EC2 instance inside a VPN.
One disadvantage of that approach is that it becomes annoying to reach the model registry from outside the VPN, like if you have a deployment pipeline based in GitHub Actions. Or if you wanted to trigger a ClearML pipeline from a VPC that isn...
So here's a snippet from my aws_autoscaler.yaml file
Oh, right... the Docker image running on the instance takes care of the library versions. You guys are great!
You know, you could probably add some immortal containers to the docker-compose.yml that use images with mongodump and the ES equivalent installed.
The container(s) could have a bash script with a while loop in it that sleeps for 30 minutes and then does a backup. If you installed the AWS CLI inside, it could even take care of uploading to S3.
I like this idea, because docker-compose.yml could make sure that if the backup container ever dies, it would be restarted.
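A sketch of that loop, written in Python for readability (a bash while-loop calling the same commands would work just as well); the mongo host, bucket name, and paths are assumptions:
```
import subprocess
import time
from datetime import datetime, timezone

import boto3

BUCKET = "my-clearml-backups"  # placeholder bucket name

s3 = boto3.client("s3")

while True:
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    archive = f"/backups/mongo-{stamp}.archive"

    # Dump the ClearML mongo DB; "mongo" is the docker-compose service
    # name, assuming this container is on the same compose network.
    subprocess.run(["mongodump", "--host", "mongo", f"--archive={archive}"], check=True)

    # Ship the dump to S3 (Elasticsearch would need its own snapshot step).
    s3.upload_file(archive, BUCKET, f"mongo/{stamp}.archive")

    time.sleep(30 * 60)  # back up every 30 minutes
```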
Yes, it's pretty lame that a clearml-agent can only process one task at a time if it's not listening to a services queue 🤔
My understanding may be off. Say I have a single EC2 instance. Is that instance only able to handle one task at a time? Or can I start multiple instances of the clearml-agent process on it and then have one task per agent?
And if that's the case, can we have multiple agents on the EC2 instance listening to the same queue, e.g. default? Or would this only work if they were listening to different queues?
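Concretely, what I'm picturing is something like this on the instance (queue name is just an example):
```
# Hypothetical: two agent processes on the same EC2 box,
# both pulling work from the same queue.
clearml-agent daemon --queue default --detached
clearml-agent daemon --queue default --detached
```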
So I get output with this one, but the console only shows me the output from my machine. For example, the SSH key is present, and whoami results in ericriddoch.
Let's see. The task log? I think this is it.
Thanks for replying Martin! (as always)
Do you think ClearML is a strong option for running event-based training and batch inference jobs in production? That'd include monitoring and alerting. I'm afraid that Metaflow will look far more compelling to our teams on that front: since it deploys onto Step Functions, the scheduling is managed for you, and I believe alerts for failing jobs can be set up without adding custom code to every pipeline.
If that’s the case, then we’d probably only...
Thanks Vasil! Can you elaborate on what you mean by using boto3? Do you mean writing a script using boto3 that pulls the credentials down and writes them to the user's clearml.conf?
Also, I've been seeing references to "credentials vault" in the docs. I can see this is the problem that it solves.
If this works, we might be able to fully replace Metaflow with ClearML!
(Referring to the feature where Metaflow creates Step Functions state machines for you, and then you can use those to trigger event-driven batch jobs in the same way described here)
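To check my understanding of the boto3 idea, something like this? (the secret name and its JSON shape are made up):
```
import json
import pathlib

import boto3

# Hypothetical: ClearML credentials stored as a JSON secret in AWS Secrets Manager.
SECRET_ID = "clearml/ci-credentials"

response = boto3.client("secretsmanager").get_secret_value(SecretId=SECRET_ID)
creds = json.loads(response["SecretString"])

# Render a minimal clearml.conf (doubled braces are literal braces).
conf = f"""
api {{
    credentials {{
        access_key: "{creds['access_key']}"
        secret_key: "{creds['secret_key']}"
    }}
}}
"""
pathlib.Path.home().joinpath("clearml.conf").write_text(conf)
```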
Trying as a Python subprocess...
Oh wow. If this works, that will be insanely cool. Like, I guess what I'm going for is: if I specify "username: test" and "password: test" in that file, then I can specify "api.access_key: test" and "api.secret_key: test" in the clearml.conf used for CI. I'll give it a try tonight!
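For the record, here's the pairing I mean, restated as config (the fixed-users structure is my best guess from the server docs):
```
# On the server, in apiserver.conf (fixed users / simple login):
auth {
    fixed_users {
        enabled: true
        users: [
            {username: "test", password: "test", name: "CI User"}
        ]
    }
}

# In the clearml.conf used by CI:
api {
    credentials {
        access_key: "test"
        secret_key: "test"
    }
}
```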
Thank you! I think it does. It's just now dawning on me that: because a pipeline is composed of multiple tasks, different tasks in the pipeline could run on different machines. Or more specifically, they could run on different queues, and as you said in your other response, we could have a queue for smaller CPU-based instances and another queue for larger GPU-based instances.
I like the idea of having a queue dedicated to CPU-based instances that has multiple agents running on it simultaneously....
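So, if I've got that right, the routing would look something like this (queue names and the two step functions are made up):
```
from clearml import PipelineController

def preprocess_fn():
    # stand-in for a real CPU-bound preprocessing step
    return "data"

def train_fn():
    # stand-in for a real GPU-bound training step
    return "model"

pipe = PipelineController(name="train-pipeline", project="examples", version="0.0.1")

pipe.add_function_step(
    name="preprocess",
    function=preprocess_fn,
    execution_queue="cpu-queue",  # hypothetical queue served by CPU instances
)
pipe.add_function_step(
    name="train",
    function=train_fn,
    parents=["preprocess"],
    execution_queue="gpu-queue",  # hypothetical queue served by GPU instances
)

pipe.start(queue="services")  # the controller itself runs on the services queue
```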
This thread should be immortalized. Super stoked to try this out!
I can't think of any changes we might have made on our side to cause that 🤔
Totally worked!
Will do!
I have the same behavior whether I put task.execute_remotely(...) before or after the call to run_shell_script().
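For context, here's the shape of what I'm running, simplified (project/queue names are placeholders and run_shell_script is stubbed):
```
from clearml import Task

def run_shell_script():
    # stand-in for my helper that shells out to the bash script
    pass

task = Task.init(project_name="my-project", task_name="shell-test")

# Variant 1: ask for remote execution first, then run the script.
task.execute_remotely(queue_name="default", exit_process=True)
run_shell_script()

# Variant 2 is the same two lines in the opposite order; either way
# I see identical console output.
```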
Yay! Man, I want to do ClearML on "hard mode" (non-enterprise, self-hosted) first, before trying to sell BENlabs (my work) on it. I could see us paying for enterprise to get the Hyper Datasets and Vault features if our scientists/developers fall in love with it, and they probably will if we can get them to adopt it, since right now we have a homemade system that isn't nearly as nice as ClearML.
@SuccessfulKoala55 how exactly do you configure ClearML to use the cr...
I've also tried running a clearml-agent daemon directly on my mac (not in docker) serving the sessions queue for the ClearML server that is running in docker. When I do that, it consistently fails with a different error. Something to do with mounting a volume.