BoredHedgehog47

27 Questions, 213 Answers

Active since 10 January 2023

Last activity 8 months ago

Reputation

Badges 1

212 × Eureka!

0 Votes

1 Answers

1K Views

In A Nutshell, What Do I Need For The Clearml Agent To Scale Ec2 Nodes In The K8 Cluster, In Terms Of Helm Configuration? I Assume Aws Credentials, Is There Anything Else?

In a nutshell, what do I need for the clearML agent to scale EC2 nodes in the k8 cluster, in terms of helm configuration? I assume AWS credentials, is there ...

mlops

2 years ago

0 Votes

8 Answers

1K Views

When I Run

When I run clearml-data close on an 84 MB file, I get the following response: 413 Request Entity Too Large (nginx). Yet the file is st...

dataset

2 years ago

0 Votes

7 Answers

1K Views

Were There Any Changes To The Clearml Python Sdk In The Past 24 Hours?

Were there any changes to the ClearML Python SDK in the past 24 hours?

clearml

2 years ago

0 Votes

2 Answers

1K Views

If I Have An Aws Key/Secret For An Iam User, What Is The Best Way To Pass In These Credentials So The Task Docker Container Has Credentials Generated For Usage With Boto3?

If I have an AWS key/secret for an IAM user, what is the best way to pass in these credentials so the task docker container has credentials generated for usa...

clearml

2 years ago

0 Votes

30 Answers

1K Views

Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

Why am I getting a 403 from file server when the k8 glue agent is initializing ?

mlops

2 years ago

0 Votes

4 Answers

956 Views

Yesterday I Executed An Experiment In Our Hosted Clearml Cluster. After The Experiment Was Finished, We Got An Aws Guard Duty Notification About Suspicious Outbound Traffic From The Ec2 That Executed The Job. It Looks Like The Tag Being Used Is Hardcoded

Yesterday I executed an experiment in our hosted clearML cluster. After the experiment was finished, we got an AWS guard duty notification about suspicious o...

clearml

2 years ago

0 Votes

2 Answers

992 Views

If I Leave

If I leave WORKING DIRECTORY empty in the experiment configuration (in the UI), will that use the git project root by default?

clearml

2 years ago

0 Votes

1 Answers

1K Views

What Does This Log Message Mean

What does this log message mean ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start ?

clearml

2 years ago

0 Votes

7 Answers

996 Views

Is There Any Additional Configuration Needed For

Is there any additional configuration needed for PYTHONPATH to be set up properly in the clearml agent? I'm getting python import errors from the root directo...

clearml

2 years ago

0 Votes

2 Answers

1K Views

Where Can I Change This Host Name Using The Helm Charts? I Got This Error When My Task Is Fetching A Dataset.

Where can I change this host name using the helm charts? I got this error when my task is fetching a dataset. 2022-09-23 15:09:45,318 - clearml.storage - ERR...

clearml

2 years ago

0 Votes

30 Answers

1K Views

When I Run An Experiment (Self Hosted), I Only See Scalars For Gpu And System Performance. How Do I See Additional Scalars? I Have

When I run an experiment (self hosted), I only see scalars for GPU and System performance. How do I see additional scalars? I have "tensorboard": { "enabled"...

tensorboard

2 years ago

0 Votes

16 Answers

1K Views

When I Try To Create Experiment In The Ui All I See Is This Dialogue

When I try to create experiment in the UI all I see is this dialogue

clearml

2 years ago

0 Votes

14 Answers

1K Views

Does Clearml Have The Ability To Run A Single Experiment Across Multiple Nodes/Gpus In A K8 Cluster?

Does ClearML have the ability to run a single experiment across multiple nodes/GPUs in a k8 cluster?

clearml

2 years ago

0 Votes

1 Answers

1K Views

Also Is This The Image That Is Used For Experiments?

Also is this the image that is used for experiments? https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/values.yaml#L38-L39

clearml

2 years ago

0 Votes

0 Answers

1K Views

Hey Everyone, I'm Trying To Add A Test User To The Api Server Config. Here Is A Snippet From My Values.Yaml File. Do I Have This Formatted Correctly? I'm Not Seeing Api Config Map In The K8 Cluster

Hey everyone, I'm trying to add a test user to the api server config. Here is a snippet from my values.yaml file. Do I have this formatted correctly? I'm not...

clearml

2 years ago

0 Votes

6 Answers

1K Views

How Do I Create An Experiment Where I Can Set The Github Repository/Branch Name/Script Path Like This Example Shows?

How do I create an experiment where I can set the github repository/branch name/script path like this example shows?

clearml

2 years ago

0 Votes

18 Answers

1K Views

I'm New To Using Datasets, If My Git Project Root Is

I'm new to using datasets, if my git project root is myProject and I expect file.json to be at the root level, how do I accomplish this?

clearml

2 years ago

0 Votes

10 Answers

1K Views

Are There Any Examples Of Mounting An Aws Efs Mount To A Self Hosted K8 Agent Deploy?

Are there any examples of mounting an AWS EFS mount to a self hosted k8 agent deploy? https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearm...

mlops

2 years ago

0 Votes

17 Answers

1K Views

Or Is It Just The Ubuntu Official Image

or is it just the ubuntu official image https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/values.yaml#L59

clearml

2 years ago

0 Votes

14 Answers

1K Views

Hey All, Is There Any Reason The Python Sdk

Hey all, is there any reason the python sdk clearml would cause subprocess issues? I'm calling returncode = Popen(cmd).wait() and getting File "/usr/lib64/py...

clearml

2 years ago

0 Votes

15 Answers

1K Views

Hey All, I'm Testing The Usage Of

Hey all, I'm testing the usage of SETUP SHELL SCRIPT in the experiment window. I added a simple command but did not see it in the console. The task did execu...

clearml

2 years ago

0 Votes

13 Answers

1K Views

Apiserver: Service: Type: Clusterip Configuration: Additionalconfigs: Apiserver.Conf: | Auth { Fixed_Users { Enabled: True Pass_Hashed: False Users: [ {

apiserver: service: type: ClusterIP configuration: additionalConfigs: apiserver.conf: | auth { fixed_users { enabled: true pass_hashed: false users: [ { user...

clearml

2 years ago

0 Votes

30 Answers

1K Views

When I Do

When I do Dataset.get why does the SDK use clearml.storage - ERROR - Could not download http://files.clearml.myhost.com vs using what is defined on the agent...

mlops

2 years ago

0 Votes

3 Answers

1K Views

In My Git Repo, I Have A

In my git repo, I have a setup.py. How would I run pip install -e . rather than using --packages or --requirements?

clearml

2 years ago

0 Votes

30 Answers

1K Views

In Order For A New Worker To Come Online In My K8 Cluster, Do I Need To Have An Ec2 Startup Script Init The Agent/Config, And Then Start The Daemon? Do I Have To Do This Manually, Or Is There A Better Way?

In order for a new worker to come online in my k8 cluster, do I need to have an EC2 startup script init the agent/config, and then start the daemon? Do I hav...

clearml

2 years ago

0 Votes

30 Answers

1K Views

I'm Trying To Configure The Glue Agent To Use Aws Ecr Via Helm Charts. Below Is My Configuration. It Is Not Pulling The Image Though, It Is Failing With

I'm trying to configure the glue agent to use AWS ECR via helm charts. Below is my configuration. It is not pulling the image though, it is failing with K8S ...

mlops

2 years ago

0 Votes

31 Answers

20K Views

When My Remote Task Is Installing The Python Dependencies

When my remote task is installing the python dependencies --packages requests for example, is there any caching "magic" that is done by the k8 agent? Or is i...

clearml

2 years ago

0 Is There Any Additional Configuration Needed For

for example, if my github repo is project.git and my structure is project/utils/tool.py

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

Okay, this worked --packages '-e .'
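A minimal sketch of what such an invocation might look like, written as a small Python helper that assembles a clearml-task command line. The project name, task name, repository URL, script path, and queue are hypothetical placeholders, and the helper itself is only illustrative; the one piece taken from the answer above is passing '-e .' to --packages.

```python
import shlex


def build_clearml_task_cmd(project, name, repo, script, queue):
    """Assemble a clearml-task command that installs the repo in editable mode."""
    return [
        "clearml-task",
        "--project", project,
        "--name", name,
        "--repo", repo,
        "--script", script,
        # Passed as a single argument, equivalent to --packages '-e .' in a shell.
        "--packages", "-e .",
        "--queue", queue,
    ]


# All values below are hypothetical placeholders, not taken from the thread.
cmd = build_clearml_task_cmd(
    project="demo-project",
    name="editable-install-test",
    repo="https://github.com/example/project.git",
    script="train.py",
    queue="default",
)

# shlex.join (Python 3.8+) quotes the command so it can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the arguments in a list means the same command can be handed to subprocess.run(cmd) without any shell quoting concerns.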

2 years ago

0 I'm New To Using Datasets, If My Git Project Root Is

so it caches to ~/.clearml/ any files that are under the same project name?

2 years ago

0 Does Clearml Have The Ability To Run A Single Experiment Across Multiple Nodes/Gpus In A K8 Cluster?

AgitatedDove14 How do I set up a master task to do all the reporting?

2 years ago

0 When I Do

I don't know how to get past this? My k8 pods shouldn't need to reach out to the public file server URL.

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

Traceback (most recent call last): File "sfi/imagery/models/training/ldc_train_end_to_end.py", line 26, in <module> from sfi.imagery.models.chip_classifier.eval import eval_chip_classifier ModuleNotFoundError: No module named 'sfi.imagery.models'
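One way to narrow down an error like this is to check which part of the dotted module path cannot be found. The sketch below uses only the Python standard library; the helper names is_importable and first_missing are hypothetical, and it does not assume any particular cause for the failure in this thread.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        # find_spec imports parent packages as a side effect, so a missing
        # parent surfaces here as ModuleNotFoundError.
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        return False


def first_missing(module_name):
    """Return the shortest dotted prefix that cannot be located, or None."""
    parts = module_name.split(".")
    for i in range(1, len(parts) + 1):
        prefix = ".".join(parts[:i])
        if not is_importable(prefix):
            return prefix
    return None


if __name__ == "__main__":
    # Print the search path, then the first prefix of the failing import
    # from the traceback that cannot be resolved in this environment.
    print("\n".join(sys.path))
    print(first_missing("sfi.imagery.models.chip_classifier.eval"))
```

Run inside the task environment, this reports whether the lookup already fails at the top-level package or only at a deeper subpackage, which helps separate a search-path problem from a packaging problem.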

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

` SysPath: ['/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py/sfi/imagery/models/training', '/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py/sfi', '/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py', '/usr/lib64/python37.zip', '/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/home/npuser/.clearml/venvs-builds/3.7/lib64/python3.7/site-packages', '/home/npuser/.clearml/venvs-builds/3.7/l...

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

I think if I use the local service URL this problem is fixed

2 years ago

0 Are There Any Examples Of Mounting An Aws Efs Mount To A Self Hosted K8 Agent Deploy?

It seems like https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/values.yaml#L72-L80 doesn't actually do anything as the values set here aren't applied in the agent template

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

I don't see any requests

2 years ago

0 When I Do

So this is an additional config file with enterprise? Is this new config file deployable via helm charts?

2 years ago

0 Apiserver: Service: Type: Clusterip Configuration: Additionalconfigs: Apiserver.Conf: | Auth { Fixed_Users { Enabled: True Pass_Hashed: False Users: [ {

2 years ago

0 Does Clearml Have The Ability To Run A Single Experiment Across Multiple Nodes/Gpus In A K8 Cluster?

SuccessfulKoala55 Darn, so I can only scale vertically?

2 years ago

0 I'm New To Using Datasets, If My Git Project Root Is

After proving we can run our training, I would then advise we update our code base

2 years ago

0 Are There Any Examples Of Mounting An Aws Efs Mount To A Self Hosted K8 Agent Deploy?

I got the EFS volume mounted. Curious what advantage it would be to use the StorageManager

2 years ago

0 I'm New To Using Datasets, If My Git Project Root Is

I assumed I would need to upload it and then reference it somehow?

2 years ago

0 Apiserver: Service: Type: Clusterip Configuration: Additionalconfigs: Apiserver.Conf: | Auth { Fixed_Users { Enabled: True Pass_Hashed: False Users: [ {

Based on my yaml

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

` * Serving Flask app 'fileserver' (lazy loading)

Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
Debug mode: off
[2022-09-08 13:24:25,822] [8] [WARNING] [werkzeug] * Running on all addresses.
WARNING: This is a development server. Do not use it in a production deployment. `

2 years ago

0 Apiserver: Service: Type: Clusterip Configuration: Additionalconfigs: Apiserver.Conf: | Auth { Fixed_Users { Enabled: True Pass_Hashed: False Users: [ {

Thanks for looking into this!

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

I think this is VPN related now

2 years ago

0 Where Can I Change This Host Name Using The Helm Charts? I Got This Error When My Task Is Fetching A Dataset.

Err, maybe not. I don't know where it's being fetched.

2 years ago

0 When I Do

I don't recognize that file?

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

These are the logs from the fileserver pod

2 years ago

0 Is There Any Additional Configuration Needed For

They are pathing errors in my repo.

2 years ago

0 How Do I Create An Experiment Where I Can Set The Github Repository/Branch Name/Script Path Like This Example Shows?

gotcha, I see how that is populated now. So then if my workers have git credentials, a user can clone that experiment and run on a worker?

2 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

I used the values from the dashboard/configuration/api keys

2 years ago

0 Or Is It Just The Ubuntu Official Image

The task pod (experiment) started reaching out to an IP associated with malicious activity. The IP was associated with 1000+ domain names. The activity was identified in AWS guard duty with a high severity level.

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

If you look lower, it is there '/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py'

2 years ago

0 When My Remote Task Is Installing The Python Dependencies

` PYTHONPATH: /home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py/sfi:/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py:/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py/sfi/imagery/models/training::/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py/sfi:/usr/lib64/python37.zip:/usr/lib64/python3.7:/usr/lib64/python3.7/lib-dynload:/home/npuser/.clearml/venvs-builds/3.7/lib6...

2 years ago
