But I actually wish the interface were more like the apiserver.conf file--specifically, the way you can define hard-coded login credentials in that file in advance. Except I wish you could define API keys this way (or some other way):
auth {
    # Fixed users login credentials
    # No other user will be able to login
    fixed_users {
        enabled: true
        pass_hashed: false
        users: [
            {
                username: "test"
                password: "test"
            ...
One idea: is it possible to store usable credentials in advance and place them in a volume that the ClearML containers can access and then use?
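To make it concrete, here's roughly what I'd hope consuming pre-provisioned credentials would look like from the SDK side (untested sketch; the hosts and key/secret here are placeholders, and it assumes the server would actually accept a pre-defined pair):

import os
from clearml import Task

# Hypothetical pre-provisioned credentials, e.g. mounted into the container
# as environment variables or read from a file on a shared volume
access_key = os.environ.get("CLEARML_API_ACCESS_KEY", "preprovisioned-key")
secret_key = os.environ.get("CLEARML_API_SECRET_KEY", "preprovisioned-secret")

# Point the SDK at the server with those credentials instead of relying on
# a clearml.conf generated from the web UI
Task.set_credentials(
    api_host="http://localhost:8008",
    web_host="http://localhost:8080",
    files_host="http://localhost:8081",
    key=access_key,
    secret=secret_key,
)

task = Task.init(project_name="demo", task_name="credentials-smoke-test")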
It seems you have a specific workflow in mind, but I'm not sure I follow it. Can you give a specific example?
Absolutely. So, let's say a DS tags a model in ClearML with "release candidate". It'd be great to have that trigger a number of processes, each with their own retry logic:
- A fairness/bias evaluation, potentially as a task in ClearML itself. This would load the model and run some sample datasets through it. The
- Pipeline to prepare for deployment. Trigger a GitHub Actions ...
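For reference, ClearML's TriggerScheduler looks like it could cover the "react to a tag" piece of that; a rough, untested sketch (the callback is just a placeholder and doesn't handle the per-process retry logic):

from clearml import Model
from clearml.automation import TriggerScheduler

def on_release_candidate(model_id):
    # Placeholder callback: kick off the downstream processes here
    # (fairness/bias evaluation task, GitHub Actions dispatch, etc.)
    model = Model(model_id=model_id)
    print(f"Model {model.name} ({model_id}) tagged as a release candidate")

scheduler = TriggerScheduler(pooling_frequency_minutes=3)
scheduler.add_model_trigger(
    name="release-candidate-hook",
    schedule_function=on_release_candidate,
    trigger_on_tags=["release candidate"],
)
# Run the scheduler itself as a task on the services queue
scheduler.start_remotely(queue="services")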
It's an Amazon Linux AMI with the AWS CLI pre-installed on it. It uses the AWS CLI to fetch the key from AWS SSM Parameter Store. It's granted read access to that SSM Parameter via the instance role.
So, we've been able to run sudo su and then git clone with our private repos a few times now.
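For reference, this is roughly the boto3 equivalent of what the user-data does (assuming the instance role grants ssm:GetParameter on that parameter):

import os
import boto3

# Fetch the deploy key from SSM Parameter Store using the instance role,
# so no static AWS credentials need to live on the box
ssm = boto3.client("ssm", region_name="us-west-2")
param = ssm.get_parameter(
    Name="/clearml/github_ssh_private_key",
    WithDecryption=True,
)

key_path = os.path.expanduser("~/.ssh/id_rsa")
os.makedirs(os.path.dirname(key_path), exist_ok=True)
with open(key_path, "w") as f:
    f.write(param["Parameter"]["Value"])
os.chmod(key_path, 0o600)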
If the load balancer or API Gateway can do the computation and leverage caching, we’re much safer against DDoS attacks. In general, I’d prefer not to have our EC2 instance directly exposed to the public Internet.
@<1557175205510516736:profile|ShallowSwan53> at this point, I think this question deserves its own thread. I'm curious about it too!
So the problem came back even with this new URL. I discovered that clearing your cookies fixes it.
Trying as a Python subprocess...
Hi friends, I'm just seeing these new messages. I read these links and I agree with @<1557175205510516736:profile|ShallowSwan53> . It's nice that the webapp has these pages, but what is the workflow to actually use this registry?
Also, @<1557175205510516736:profile|ShallowSwan53> , do you have a specific workflow in mind that you're hoping to get from ClearML?
At BEN, we're experimenting with
- BentoML for model serving. It's a Python REST framework a lot like FastAPI, but with some nice...
^^^ For my own notes: this is the web request made by the frontend to create a set of credentials
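I think the same call can be made from Python through the SDK's API client; hedging here since I haven't checked the exact response shape, and it assumes you already have one valid key/secret configured to authenticate with:

from clearml.backend_api.session.client import APIClient

# Uses whatever credentials the local clearml.conf / environment provides
client = APIClient()

# auth.create_credentials is the endpoint behind the web UI's "create credentials" button
response = client.auth.create_credentials()
print(response)  # should contain the new access_key / secret_key pair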
Oh, this is thought-provoking. Yeah, the idea of using ClearML for R&D is super appealing (to me, speaking as an MLOps engineer 😆). And having the power of Metaflow's scheduler (on Step Functions with Event Bridge, since we'd do the AWS-native deployment) also makes sense to me.
I'll keep asking questions about how we could do event-based jobs with alerting built in on ClearML in a different thread later on.
I pasted your points (anonymously) onto the Metaflow slack to le...
cc: @<1565509803839590400:profile|MoodyBear54>
I don't know about this, but could you turn your whole project into a pip-installable package using a setup.py and/or pyproject.toml? I've never tried this, but maybe then you could do pip install -e . locally before executing the task. Then execute. And then maybe the pip freeze that ClearML does would contain the symlink to your directory (so that from my_package import ... statements would work).
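Something like this is what I had in mind for the setup.py (totally untested; my_package is just a stand-in for your top-level package name):

# setup.py -- minimal stub so `pip install -e .` works from the repo root
from setuptools import find_packages, setup

setup(
    name="my_package",         # stand-in; match your import path
    version="0.0.1",
    packages=find_packages(),  # picks up my_package/ and its subpackages
    install_requires=[],       # runtime deps, if any
)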
I did a quick local experiment and observed that credentials created from the UI indeed become invalid if you delete the ClearML volumes.
- starting docker-compose locally
- creating a set of credentials from the UI
- hardcoding those credentials into the docker-compose file
- restarting
- the agent-services container started up and successfully became a registered worker
- I killed the docker-compose and deleted the volume folders
- restarted the docker-compose (with the same hard-coded...
Wow, that is seriously impressive.
Oh, that is cool. I captured all this. Maybe I'll make a user-data.sh script and docker-compose.yml file that brings all these things together. Probably won't have time for a few weeks.
Earlier in the thread they mentioned that the agents are all resilient, so no ongoing tasks should be lost. I imagine even in a large organization, you could afford 5-10 minutes of downtime at 2AM or something.
That said, you'd only have 1 backup per day, which could be a big deal depending on the experiments you're running. You might want more than that.
Ah, but it's probably worth noting that the docker-compose.yml does register the EC2 instance that the server is running on as an agent listening on the services queue, so ongoing tasks in that queue that happen to be placed on the server would get terminated when docker-compose down is run.
You have no idea what is committed to disk vs what is still contained in memory.
If you ran docker-compose down and allowed ES to gracefully shut down, would ES finish writing everything to disk, therefore guaranteeing that the backups wouldn't get corrupted?
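For what it's worth, the stop-archive-restart cycle I'm picturing looks roughly like this (untested sketch; assumes the default /opt/clearml layout from the install docs):

import datetime
import os
import shutil
import subprocess

COMPOSE_FILE = "/opt/clearml/docker-compose.yml"

# Stop the server so Elasticsearch / MongoDB can flush everything to disk
subprocess.run(["docker-compose", "-f", COMPOSE_FILE, "down"], check=True)

# Archive the data folders (default location is /opt/clearml/data)
os.makedirs("/opt/clearml/backups", exist_ok=True)
stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
shutil.make_archive(f"/opt/clearml/backups/clearml-data-{stamp}", "gztar", "/opt/clearml/data")

# Bring the server back up
subprocess.run(["docker-compose", "-f", COMPOSE_FILE, "up", "-d"], check=True)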
@<1523701070390366208:profile|CostlyOstrich36> Oh that’s smart. Is that to make sure no transactions happen during the backup? Would there be a risk of ongoing or pending tasks somehow getting corrupted if you shut the server down?
As opposed to using CRON or something 🤣
I'm imagining:
- The EC2 instance would be in a private subnet, accessible only on the VPN (read: VPC)
- The API Gateway and Load Balancer would also be on the VPC and therefore have access to the private subnet, BUT only the API Gateway or Load Balancer themselves would be exposed to the public internet.
That way, to do the JWT authentication, the load balancer or API Gateway could reach out to the EC2 instance on the private network to authenticate any incoming ClearML SDK requests.
I have the same behavior whether I put task.execute_remotely(...) before or after the call to run_shell_script().
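For context, the two orderings I tried look roughly like this (run_shell_script is our own helper, stubbed out here):

from clearml import Task

def run_shell_script():
    # stub for our helper that shells out and does the actual work
    ...

task = Task.init(project_name="demo", task_name="shell-script-task")

# Variant A: enqueue first; everything after this line runs only on the worker
task.execute_remotely(queue_name="default", exit_process=True)
run_shell_script()

# Variant B (also tried): call the helper first, then enqueue
# run_shell_script()
# task.execute_remotely(queue_name="default", exit_process=True)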
Sorry, clarifying:
The agent-services entry in the docker-compose file seems to add a single worker to the services queue.
So, we've already got a model registry: MLFlow
And we've got a serving framework we really like: BentoML and FastAPI
The debate is between ClearML and Metaflow for
- training models during the research phase
- re-training models on a schedule or event-based trigger in production
- running batch inference jobs on a schedule or event-based trigger in production
And for the session
clearml-session --queue sessions --docker python:3.9
configurations:
  extra_clearml_conf: ""
  extra_trains_conf: ""
  extra_vm_bash_script: |
    aws ssm get-parameter --region us-west-2 --name /clearml/github_ssh_private_key --with-decryption --query Parameter.Value --output text > ~/.ssh/id_rsa && chmod 600 ~/.ssh/id_rsa
    source /clearml_agent_venv/bin/activate
hyper_params:
  iam_arn: arn:aws:iam::<my account id>:instance-profile/clearml-2-AutoscaledInstanceProfileAutoScaledEC2InstanceProfile56A5348F-90fmf6H5OUBx
For these functions, Metaflow offers:
- Triggering: integration with AWS Event Bridge. It's really easy to use Boto3 and AWS access keys to emit events for Metaflow DAGs. It's nice not to have to worry about networking for this.
- Scheduling: the fact that Metaflow uses Step Functions is reassuring.
- Observability: this lovely flame graph where you can view the logs and duration of each step in the DAG. It's easy to view all the DAG runs, including the ones that have failed. Ideally, we w...