BattyCrocodile47
Moderator
34 Questions, 145 Answers
  Active since 02 March 2023
  Last activity one month ago

Reputation

0

Badges 1

127 × Eureka!
0 Can You Help Me Make The Case For Clearml Pipelines/Tasks Vs Metaflow? Context Within...

Oh this is thought provoking. Yeah, the idea of using ClearML for R&D is super appealing (to me speaking as an MLOps engineer 😆). And having the power of Metaflow's scheduler (on Step Functions with Event Bridge since we'd do the AWS-native deployment) also makes sense to me.

I'll keep asking questions about how we could do event-based jobs with alerting built in on ClearML in a different thread later on.


I pasted your points (anonymously) onto the Metaflow slack to le...

one year ago
0 Working On The Vs Code Extension. Pretty Stumped On This One...

I'm trying to add a docker-compose.yaml to the repo to

one year ago
0 I'M

The question I'm exploring remains: is it possible to acquire that initial set of ClearML API keys programmatically so that the manual steps of 1-4 above can be avoided for an initial deployment?

one year ago
0 Hey Friends, How Do You Configure Clearml To Use An S3 Bucket? Specifically: Does

Thanks Vasil! Can you elaborate on what you mean by using boto3? Do you mean writing a script using boto that pulls the credentials down and writes to the user's clearml.conf?

Also, I've been seeing references to "credentials vault" in the docs. I can see this is the problem that it solves.
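
For reference, here is a minimal sketch of what such a script might look like. It is an assumption about the approach being suggested, not something confirmed in the thread. The helper names (`render_s3_section`, `fetch_aws_credentials`) and the placeholder values are hypothetical; the `sdk.aws.s3` keys (`key`, `secret`, `region`) are the ones found in the default clearml.conf. Temporary credentials from an instance role also carry a session token, which this sketch ignores.

```python
def render_s3_section(key: str, secret: str, region: str = "") -> str:
    """Render the sdk.aws.s3 block of clearml.conf as text.

    Values are inserted verbatim; quotes inside them are not escaped.
    """
    return (
        "sdk {\n"
        "  aws {\n"
        "    s3 {\n"
        f'      key: "{key}"\n'
        f'      secret: "{secret}"\n'
        f'      region: "{region}"\n'
        "    }\n"
        "  }\n"
        "}\n"
    )


def fetch_aws_credentials() -> tuple[str, str]:
    """Resolve credentials through boto3's default chain (env vars, profile, role)."""
    import boto3  # imported lazily so render_s3_section works without boto3

    creds = boto3.Session().get_credentials()
    if creds is None:
        raise RuntimeError("boto3 could not find any AWS credentials")
    frozen = creds.get_frozen_credentials()
    return frozen.access_key, frozen.secret_key


# Placeholder values for illustration only; a real script would call
# fetch_aws_credentials() and merge the result into the user's clearml.conf.
snippet = render_s3_section("AKIAEXAMPLE", "example-secret", "us-west-2")
print(snippet)
```

The rendering step is kept separate from the credential lookup so the generated text can be inspected before anything is written to disk.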

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

The key seems to be placed in the expected location

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

cc: @<1565509803839590400:profile|MoodyBear54>

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

So I get output with this one, but the console only shows me the output from my machine. For example, the SSH key is present, and whoami results in ericriddoch

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

I have the same behavior whether or not I put task.execute_remotely(...) before or after the call to run_shell_script()

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

It's an Amazon Linux AMI with the AWS CLI pre-installed on it. It uses the AWS CLI to fetch the key from AWS SSM Parameter Store. It's granted read access to that SSM Parameter via the instance role.

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

So here's a snippet from my aws_autoscaler.yaml file

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

Well wow, I figured it out. You equipped me with a solid debugging tool, AKA running bash commands within the docker container.

I had to pre-add GitHub and Bitbucket to known hosts by adding keyscan commands

configurations:
  extra_clearml_conf: ""
  extra_trains_conf: ""
  extra_vm_bash_script: |
    echo "fetching github key" && (aws ssm get-parameter --region us-west-2 --name /clearml/github_ssh_private_key --with-decryption --query Parameter.Value --output text > ~/.ssh/id_rsa &...
one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I
configurations:
  extra_clearml_conf: ""
  extra_trains_conf: ""
  extra_vm_bash_script: |
    aws ssm get-parameter --region us-west-2 --name /clearml/github_ssh_private_key --with-decryption --query Parameter.Value --output text > ~/.ssh/id_rsa && chmod 600 ~/.ssh/id_rsa
    source /clearml_agent_venv/bin/activate

hyper_params:
  iam_arn: arn:aws:iam::<my account id>:instance-profile/clearml-2-AutoscaledInstanceProfileAutoScaledEC2InstanceProfile56A5348F-90fmf6H5OUBx
one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

Let's see. The screenshots above are me running on the host, not attaching to a running container. So I believe I do want the keys to be mounted into the running containers.

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

So, we've been able to run sudo su and then git clone with our private repos a few times now

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

Wow, it really does not want to show the output of those print statements in stdout. Here's the output of the task from the console after cloning it. Confirmed that the setup script and all code changes are present:

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

I can't think of any changes we might have made on our side to cause that 🤔

one year ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

Actually, dumb question: how do I set the setup script for a task?

one year ago
0 Security Question: In My Journey Of Running Clearml The "Hard Way" (Self-Hosted), One Problem I Haven'T Solved Is Security. Some Discussion Here...

I'm imagining:

  • The EC2 instance would be in a private subnet, accessible only on the VPN (read: VPC)
  • The API Gateway and Load Balancer would also be on the VPC and therefore have access to the private subnet BUT the API Gateway or Load Balancer themselves would be exposed to the public internet.
    That way, to do the JWT authentication, the load balancer or API Gateway could reach out to the EC2 instance on the private network to authenticate any incoming ClearML SDK requests.
one year ago
0 Hey! Starting An Mlops Director Position In 2 Weeks. I'M Thinking About Architecture. Has Anyone Ever Tried To Use Clearml As An Experiment Tracker, But Used A Different Orchestrator Like Metaflow, Airflow, Prefect, Etc.? I'M Struggling To Find Guides Or

Dang! @<1590514584836378624:profile|AmiableSeaturtle81> awesome answer, thank you! You seem like an awesome person to know. Definitely connect if you'd like to talk ops stuff sometime.

one month ago
0 Sorry For Always Posting Such Cryptic Problems. I Managed To Create A Docker-Compose File That Runs Clearml

The agent commands are nothing special.

clearml-agent daemon --queue sessions --cpu-only --create-queue true --docker
one year ago
0 Hi Team! Is There A Way To Make Clearml'S Aws Autoscaler And Queues Resource-Aware Please? I.E. If We Can Say, As We Enqueue Our Job, How Much Ram Or Gpu-Ram Or Even Gpus It Needs, Have The Scheduler/Autoscaler Dispatch The Job To Instances That Are Of Th

Thank you! I think it does. It's just now dawning on me that, because a pipeline is composed of multiple tasks, different tasks in the pipeline could run on different machines. Or more specifically, they could run on different queues, and as you said in your other response, we could have one queue for smaller CPU-based instances and another queue for larger GPU-based instances.

I like the idea of having a queue dedicated to CPU-based instances that has multiple agents running on it simultaneously....

one year ago
0 Hi Team! Is There A Way To Make Clearml'S Aws Autoscaler And Queues Resource-Aware Please? I.E. If We Can Say, As We Enqueue Our Job, How Much Ram Or Gpu-Ram Or Even Gpus It Needs, Have The Scheduler/Autoscaler Dispatch The Job To Instances That Are Of Th

As an infrastructure engineer, I feel that this is a fairly significant shortcoming of ClearML.

Having the ability to pack jobs/tasks onto the same "resource" (underlying server/EC2 instance) would

  • simplify the experience for data scientists
  • open up a streaming use case, wherein batch (offline) inference could be done directly inside of a ClearML pipeline in reaction to an event/trigger (like new data landing in your data lake). As it is, you can make this work, but if you start to get ...
one year ago
0 Hi Team! Is There A Way To Make Clearml'S Aws Autoscaler And Queues Resource-Aware Please? I.E. If We Can Say, As We Enqueue Our Job, How Much Ram Or Gpu-Ram Or Even Gpus It Needs, Have The Scheduler/Autoscaler Dispatch The Job To Instances That Are Of Th

But from your other answer, I think I'm understanding that you can have multiple agents on a single instance listening to the same queue.

So we could maybe initialize 4 instances of the agent on a single EC2 instance which would allow us to handle a higher volume of small batches concurrently without tying up the entire instance.
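
A rough sketch of that idea, assuming the `clearml-agent daemon` flags `--queue`, `--cpu-only`, and `--detached` behave as I understand them. The queue name `small-cpu` and the helper names are hypothetical, and I haven't verified how four agents share one instance's resources in practice.

```python
import subprocess


def agent_command(queue: str, cpu_only: bool = True) -> list[str]:
    """Build the command line for one background agent listening on `queue`."""
    cmd = ["clearml-agent", "daemon", "--queue", queue, "--detached"]
    if cpu_only:
        cmd.append("--cpu-only")
    return cmd


def launch_agents(queue: str, count: int) -> None:
    """Start `count` agents on this machine, all pulling from the same queue."""
    for _ in range(count):
        subprocess.run(agent_command(queue), check=True)


# Print the four commands rather than running them; calling
# launch_agents("small-cpu", 4) on the EC2 instance would start the agents.
commands = [agent_command("small-cpu") for _ in range(4)]
for cmd in commands:
    print(" ".join(cmd))
```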

one year ago