Hey! Sorry, I don't think I ever solved this for elasticsearch.
I've also used Airflow and Dagster in prod, but not integrated them with an exp tracker.
Hey @<1523701482157772800:profile|AnxiousSeal95> ! I think ClearML's orchestrator is a great fit for ad-hoc experimentation, but not for (event-triggered) batch inference jobs that need to be relied on in production.
I'd only feel comfortable supporting pipelines that serve end users on a tool that is known for that, e.g. Metaflow, Dagster, or Airflow--mainly because those tools emphasize good monitoring and integration with the wider data ecosystem.
Dang! @<1590514584836378624:profile|AmiableSeaturtle81> awesome answer, thank you! You seem like an awesome person to know. Definitely connect if you'd like to talk ops stuff sometime.
Hi friends, I'm just seeing these new messages. I read these links and I agree with @<1557175205510516736:profile|ShallowSwan53> . It's nice that the webapp has these pages, but what is the workflow to actually use this registry?
Also, @<1557175205510516736:profile|ShallowSwan53> , do you have a specific workflow in mind that you're hoping to get from ClearML?
At BEN, we're experimenting with:
- BentoML for model serving. It's a Python REST framework a lot like FastAPI, but with some nice...
Oh duh, thanks. What about non-standard entrypoints (as opposed to arguments), like accelerate launch train.py?
Caching can be a reason. Say you do some heavy data loading / processing in step 1. Now you're developing step 2.
It'd be nice not to have to re-run Step 1 every time you want to test a change to step 2.
You could find a way to simply write the output of step 1 to disk and do everything in one step, or you could let ClearML handle that caching for you--with the added benefit that others collaborating remotely can also use the outputs of steps you've cached with ClearML.
Oh, there's parallelization as well. You could have step 1 gather the data, and then fan out to N parallel steps that all do different things with the data, for example hyperparameter tuning.
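Roughly, a sketch of what I mean using ClearML's pipeline decorators (the project/step names are just illustrative, and the exact decorator arguments may vary with your clearml version):
```python
from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["dataset"], cache=True)
def load_data(n_rows: int = 100_000):
    # Pretend this is the heavy step 1; with cache=True, re-running the pipeline
    # with the same inputs reuses the stored output instead of recomputing it.
    import numpy as np
    import pandas as pd
    return pd.DataFrame({"x": np.random.rand(n_rows)})

@PipelineDecorator.component(return_values=["score"])
def evaluate(dataset, threshold: float):
    # Placeholder "step 2"; each call can be scheduled as its own task.
    return float((dataset["x"] > threshold).mean())

@PipelineDecorator.pipeline(name="caching example", project="examples", version="0.1")
def run_pipeline():
    dataset = load_data()
    # Fan out: these calls don't depend on each other, so the controller can run them in parallel.
    scores = [evaluate(dataset, t) for t in (0.25, 0.5, 0.75)]
    print(scores)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # drop this line to enqueue steps on remote agents
    run_pipeline()
```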
Actually that's wrong: really this is the current volume mount
'-v', '/tmp/clearml_agent.ssh.cbvchse1:/.ssh',
Could changing these values to /root/.ssh work? Do you know which user ClearML uses within the docker image?
I do agree with your earlier observation that the target of that mount seems wrong. I would think that the volume mount should be -v /root/.ssh:/root/.ssh, but instead it's -v /root.ssh:/.ssh
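As a rough workaround sketch while debugging (assuming Task.set_base_docker accepts docker_image/docker_arguments in the installed clearml version; the image and queue names are placeholders), you could try forcing the mount you'd expect from the task side:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="ssh-mount-check")
task.set_base_docker(
    docker_image="python:3.10",
    # Explicitly mount the host's keys where git inside the container expects them
    docker_arguments="-v /root/.ssh:/root/.ssh:ro",
)
task.execute_remotely(queue_name="default")
```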
It's an Amazon Linux AMI with the AWS CLI pre-installed on it. It uses the AWS CLI to fetch the key from AWS SSM Parameter Store. It's granted read access to that SSM Parameter via the instance role.
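For reference, the same fetch could be done from Python with boto3 instead of the AWS CLI -- a rough sketch (the parameter name and destination path here are made up for illustration):
```python
import os
import boto3

def fetch_deploy_key(parameter_name="/clearml/git_deploy_key", dest="/root/.ssh/id_rsa"):
    # The instance role must allow ssm:GetParameter (and kms:Decrypt for SecureString) on this parameter
    ssm = boto3.client("ssm")
    value = ssm.get_parameter(Name=parameter_name, WithDecryption=True)["Parameter"]["Value"]
    with open(dest, "w") as f:
        f.write(value)
    os.chmod(dest, 0o600)  # ssh refuses private keys with loose permissions
```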
I don't see it as an argument in Task.init or Task.execute_remotely
Hey, thanks for responding!
Does there happen to be ClearML auto-logging... for MLflow? That would make it super easy for us to migrate our existing training/batch inference jobs to ClearML.
I can't think of any changes we might have made on our side to cause that.
Let's see. The task log? I think this is it.
Hi. Yes, that totally makes sense. It's just that we don't want the logic that does the Jenkins trigger to be in a ClearML handler or task, but rather a handler that acts as a subscriber in a pub-sub system.
This is because we already have a pub-sub architecture in use; it can handle retries, etc. Also, we will likely want multiple systems to react to notifications in the pub-sub system. We already have a lot of setup for this.
I guess the conclusion is: I realize it's possible...
That's fabulous. This is definitely how my team prefers to structure projects. I hadn't gotten around to trying that out in our POC of ClearML yet, but I'm certain this is how our group will solve this problem.
@<1523701205467926528:profile|AgitatedDove14> you beautiful person, this is terrific! I do believe SageMaker has some nice monitoring/data drift capabilities that seem interesting, but these points you have here will be a fantastic starting point for my team's analysis of the products. I think this will help balance some of the over-enthusiasm towards using the native AWS solution.
This thread should be immortalized. Super stoked to try this out!
My understanding may be bad. Say I have a single EC2 instance. Is that instance only able to handle one task at a time?
Or can I start multiple instances of the clearml-agent process on it and then have one task per agent?
And if that's the case, can we have multiple agents on the EC2 instance listening to the same queue, e.g. default? Or would this only work if they were listening to different queues?
Yes, it's pretty lame that a clearml-agent can only process one task at a time if it's not listening to a services queue.
Thanks for the response @<1523701205467926528:profile|AgitatedDove14> !
What would you consider an event?
I was thinking of the TriggerScheduler's definition of an event. Pretty much, anything the TriggerScheduler allows you to react to, it'd be great to be able to publish those events to a queue external to ClearML, e.g. a tag added to a model (or removed), a state in a task changing, etc. We'd want as much metadata about that event as possible. So if the event is due to a task...
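As a rough sketch of that idea (assuming ClearML's TriggerScheduler API and, say, an SQS queue as the external pub-sub target; the queue URL, project name, and trigger arguments are placeholders that may need adjusting):
```python
import json
import boto3
from clearml.automation import TriggerScheduler

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ml-events"  # placeholder

def publish_model_event(model_id):
    # Called by the scheduler when the trigger fires (with the id of the model
    # that fired it, if I read the trigger examples right). Forward the event and
    # let downstream subscribers (Jenkins, etc.) decide what to do with it.
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"event": "model_tag_added", "model_id": model_id}),
    )

scheduler = TriggerScheduler(pooling_frequency_minutes=3)
scheduler.add_model_trigger(
    schedule_function=publish_model_event,
    trigger_project="examples",
    trigger_on_tags=["deploy"],
)
scheduler.start()
```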
I see. Is it possible for two agents to be utilizing the same GPU? (like if the machine has a terrific GPU, but only the one?)
@<1557175205510516736:profile|ShallowSwan53> at this point, I think this question deserves its own thread. I'm curious about it too!
Actually, dumb question: how do I set the setup script for a task?
Or the log of the init script?
Here's a screenshot of a session where I first try to clone as ssm-user, but it fails; then I change to root and it succeeds.
That's with the key at /root/.ssh/id_rsa