
If this works, we might be able to fully replace Metaflow with ClearML!
(Referring to the feature where Metaflow creates Step Functions state machines for you, which you can then use to trigger event-driven batch jobs in the same way described here)
I could potentially write a Selenium script to make a set of keys, but I'd prefer to avoid that.
Does this mean that none of the credentials in this file can be used with the ClearML SDK when the docker-compose.yaml starts up with a fresh state?
Is there any way to achieve such a behavior? Or are manual steps simply required to get a working set of keys? I'm trying to prepare a docker-compose file that I can use for automated tests of our VS Code extension.
But I actually wish the interface were more like the apiserver.conf file, where you can define hard-coded login credentials in advance. I wish you could define API keys that way too (or some other way):
```
auth {
    # Fixed users login credentials
    # No other user will be able to login
    fixed_users {
        enabled: true
        pass_hashed: false
        users: [
            {
                username: "test"
                password: "test"
                ...
```
I don't know that you'd have to pre-build credentials into the Docker image. If you could specify a set of credentials as environment variables to the docker run ... command or something, that would work just fine.
The goal is to be able to run docker-compose up in CI, which starts a clearml-server, and then make several API calls to the started ClearML server to prove that the VS Code extension code is working (a rough sketch follows the examples below).
Examples:
- Assert that the extension can auth with ClearML
- Assert that the ext...
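Roughly, the auth assertion could look something like this pytest-style sketch. It assumes working credentials are already configured for the SDK (clearml.conf or the CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY / CLEARML_API_HOST environment variables), and that APIClient is an acceptable stand-in for the extension's own HTTP calls:
```
# Hedged sketch of the "extension can auth with ClearML" assertion for CI.
# Assumes API credentials are already configured (clearml.conf or the
# CLEARML_API_* environment variables) and point at the docker-compose server.
from clearml.backend_api.session.client import APIClient


def test_can_auth_and_query_clearml():
    client = APIClient()
    # Any authenticated call will do; this fails if the server rejects the keys.
    projects = client.projects.get_all()
    assert projects is not None
```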
I'd really prefer it were modular enough to use serving with any model registry.
Oh that's interesting. To serve a model from MLflow, would you have to copy it over to ClearML first?
This is a low-key open-source project, if anyone wants to contribute. Since the project is early, there are lots of high-impact things, e.g. UI polish, that would be relatively low effort.
Disclaimer: I'm not familiar enough with the ClearML codebase to vouch for the quality of this PR, although it is short, which is usually a good sign. The feature we're interested in is the ability to specify the subnet_id.
Interesting. It's actually just running locally on my laptop. It seemed only to be an issue when pointing the ClearML session CLI at my local version of ClearML. Still thinking about this one.
Thanks Vasil! Can you elaborate on what you mean by using boto3? Do you mean writing a script using boto that pulls the credentials down and writes to the user's clearml.conf?
Also, I've been seeing references to "credentials vault" in the docs. I can see this is the problem that it solves.
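If by boto3 you mean something like this, then I follow. A rough sketch (the secret name and its JSON layout are made up for illustration):
```
# Rough sketch: pull pre-provisioned ClearML API keys out of AWS Secrets Manager
# with boto3 and write the user's clearml.conf. The secret name and its JSON
# layout are made up for illustration.
import json
from pathlib import Path

import boto3


def write_clearml_conf(secret_name="clearml/api-credentials"):
    secrets = boto3.client("secretsmanager")
    creds = json.loads(secrets.get_secret_value(SecretId=secret_name)["SecretString"])

    conf = f"""
api {{
    api_server: {creds['api_server']}
    web_server: {creds['web_server']}
    files_server: {creds['files_server']}
    credentials {{
        "access_key" = "{creds['access_key']}"
        "secret_key" = "{creds['secret_key']}"
    }}
}}
"""
    Path.home().joinpath("clearml.conf").write_text(conf.strip() + "\n")
```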
I may be able to prepare a PR that only allows specifying the subnet ID. Can you help me brainstorm scenarios you'd want to see tested? Also, do these need to be automated tests?
I'm trying to add a docker-compose.yaml to the repo to:
- make it more convenient for contributors to develop locally
- spin up a local ClearML instance in CI to run automated tests
Here's the docker-compose file (mostly the standard file, except I altered the volume mounts and added MinIO)
Here's [the clearml.conf file](https://github.com/mlops-club/vscode-clearml-sessi...
@<1557175205510516736:profile|ShallowSwan53> at this point, I think this question deserves its own thread. I'm curious about it too!
possibly cheaper on the cloud (Lambda vs EC2 instance)
Whoa, are you saying there's an autoscaler that doesn't use EC2 instances? I may be misunderstanding, but that would be very cool.
Maybe I should have said: my plan is to use AWS Step Functions, where a single task in the DAG is an entire ClearML pipeline. The non-ClearML steps would orchestrate putting messages into a queue, doing retry logic, and triggering said pipeline.
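Concretely, the "trigger said pipeline" step could be something like this Lambda-style handler (the project/task names and the queue are placeholders, and I'm assuming the pipeline controller lives on the server as a task we can clone):
```
# Sketch of the Step Functions task that kicks off the ClearML pipeline,
# written as a Lambda-style handler. Project/task names and the queue name
# are placeholders.
from clearml import Task


def handler(event, context):
    # The pipeline controller task we treat as a template.
    template = Task.get_task(project_name="my-project", task_name="my-pipeline")

    # Clone the controller and enqueue the clone; an agent serving the queue runs it.
    pipeline_run = Task.clone(source_task=template, name="my-pipeline (triggered run)")
    Task.enqueue(pipeline_run, queue_name="services")

    return {"clearml_task_id": pipeline_run.id}
```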
I think at some point, there has to be some amount of...
@<1594863216222015488:profile|ConvincingGrasshopper20> throwing this out there... would you want to make this with me at the Hackathon??
Here's a docker-compose I've been playing with. It doesn't have the same restart problem you're describing, but I did change the volume mounts.
Is there some way we could programmatically list all current ClearML sessions?
We need a way to do that, maybe with the clearml-session CLI, in order to populate the VS Code extension menu.
Hmm... these people are recommending restarting docker completely. I may have tried that already, but I'll do it again when I get some time to be sure.
If the load balancer or API Gateway can do the computation and leverage caching, we're much safer against DDoS attacks. In general, I'd prefer not to have our EC2 instance directly exposed to the public Internet.
Thanks for the response @<1523701205467926528:profile|AgitatedDove14> !
What would you consider an event?
I was thinking of the TriggerScheduler's definition of an event. Pretty much anything the TriggerScheduler allows you to react to, it'd be great to be able to publish those events to a queue external to ClearML, e.g. a tag added to a model (or removed), a state in a task changing, etc. We'd want as much metadata about that event as possible. So if the event is due to a task...
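Until then, I'm imagining a stopgap where the TriggerScheduler callback itself forwards the event to an external queue. A sketch (the SQS queue URL is made up, and I'm assuming the model-trigger callback receives the model ID):
```
# Stopgap sketch: have the TriggerScheduler callback forward ClearML events to
# an external queue (SQS here). The queue URL is made up, and I'm assuming the
# model-trigger callback is invoked with the triggering model's ID.
import json

import boto3
from clearml.automation import TriggerScheduler

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/clearml-events"
sqs = boto3.client("sqs")


def on_model_tagged(model_id):
    # Forward a minimal payload; ideally this would carry much more metadata.
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"event": "model_tag_added", "model_id": model_id}),
    )


trigger = TriggerScheduler(pooling_frequency_minutes=2)
trigger.add_model_trigger(
    name="forward-model-tag-events",
    schedule_function=on_model_tagged,
    trigger_project="examples",
    trigger_on_tags=["released"],
)
trigger.start()
```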
That's fabulous. This is definitely how my team prefers to structure projects. I hadn't gotten around to trying that out in our POC of ClearML yet, but I'm certain this is how our group will solve this problem.
I've also tried running a clearml-agent daemon directly on my mac (not in docker), serving the sessions queue for the ClearML server that is running in docker. When I do that, it consistently fails with a different error, something to do with mounting a volume.
Oh wow. If this works, that will be insanely cool. Like, I guess what I'm going for is: if I specify "username: test" and "password: test" in that file, then I can specify "api.access_key: test" and "api.secret_key: test" in the clearml.conf used for CI. I'll give it a try tonight!
Oh! System tags! That would definitely have been a better way to do it. We ended up querying for tasks in the "DevOps" project with the name "Interactive Session".
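For reference, the query we landed on looks roughly like this (treating only in-progress tasks as "current" sessions is my assumption):
```
# Roughly how we list "current" ClearML sessions for the extension menu:
# query tasks in the "DevOps" project named "Interactive Session". Filtering
# to in-progress tasks is my assumption about what counts as current.
from clearml import Task

sessions = Task.get_tasks(
    project_name="DevOps",
    task_name="Interactive Session",
    task_filter={"status": ["in_progress"]},
)
for session in sessions:
    print(session.id, session.name)
```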
My understanding may be bad. Say I have a single EC2 instance. Is that instance only able to handle one task at a time?
Or can I start multiple instances of the clearml-agent process on it and then have one task per agent?
And if that's the case, can we have multiple agents on the EC2 instance listening to the same queue, e.g. default? Or would this only work if they were listening to different queues?
And for the session:
clearml-session --queue sessions --docker python:3.9