```
RuntimeError: stack expects each tensor to be equal size, but got [15, 640, 372, 2] at entry 0 and [15, 322, 640, 2] at entry 1
Detected an exited process, so exiting main
terminating child processes
exiting
```
CostlyOstrich36 Thank you for your response, is there something like a public project roadmap?
When using `clearml-agent daemon --queue default --docker` it runs. In this case I always had some issues when adding the `--gpus` flag.
So I don't need docker_internal_mounts at all?
It is working now; it seems I was pointing to the wrong entrypoint.sh in the docker-compose file. Still strange...
probably found the issue
I'm running the following agent:
```
clearml-agent --config-file /clearml-cache/config/clearml-cpu.conf daemon --queue cpu default services --docker ubuntu:20.04 --cpu-only --services-mode 4 --detached
```
The goal is to have an agent that can run multiple CPU-only tasks at the same time. I noticed that when enqueueing multiple tasks, all except one stay pending until the first one has finished downloading all packages and started executing code. Only then do the tasks switch one by one to "run...
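For context, the enqueueing side could look roughly like the sketch below using the clearml-task CLI; the project, script and task names are placeholders, and the `cpu` queue is assumed to be the one served by the agent above.

```bash
# Hypothetical repro: push several CPU-only tasks into the same queue served
# by the services-mode agent (project, script and task names are placeholders).
for i in 1 2 3 4; do
  clearml-task \
    --project cpu-tests \
    --name "cpu-task-$i" \
    --script train.py \
    --queue cpu
done
```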
Yes, this one is running in a venv and not docker, because I had some issues with CUDA and docker. There is virtualenv==20.4.6 in the requirements.txt. I think it broke after installing clearml-serving in the same env.
We run a lot of pipelines that are CPU-only with some parallel steps. It's just about improving the execution time.
Can you send a more comprehensive log? Perhaps there are other related messages.
Which logs do you need?
The output seen above indicates that the index is corrupt and probably lost, but that is not necessarily the case.
Yes, this happened when the disk got filled up to 100%
I already increased the memory to 8GB after reading about similar issues here on the Slack.
Just making sure, how exactly did you do that?
```
docker-compose down
```
with the elasticsearch service in the docker-compose file changed to:
```yaml
elasticsearch:
  networks:
    - backend
  container_name: clearml-elastic
  environment:
    ES_JAVA_OPTS: -Xms8g -Xmx8g
```
and then:
```
docker-compose up -d
```
Did you wait for all the other indices to reach yellow status?
yes I waited until everything was yellow
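For reference, the index health can be checked directly against Elasticsearch (assuming the default port 9200 on the host running the container):

```bash
# Overall cluster status (green / yellow / red)
curl -s 'localhost:9200/_cluster/health?pretty'
# Per-index health overview
curl -s 'localhost:9200/_cat/indices?v&h=health,status,index'
```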
I will try to recover it, but anyway the learning is to fully separate the fileserver and any output location from mongo, redis and elastic. Also, maybe it makes sense to improve the ES setup to have replicas.
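If a second Elasticsearch node is ever added, replicas could be switched back on with the standard settings API; a minimal sketch (same endpoint as the replica command further down):

```bash
# Hypothetical: once a second ES node exists, keep one replica per index so a
# lost primary can be rebuilt from its copy.
curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_settings' \
  -d '{"index": {"number_of_replicas": 1}}'
```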
SuccessfulKoala55 do you have any example? I guess a lot of people face this issue
```
W: chown to _apt:root of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (1: Operation not permitted)
W: chmod 0700 of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (1: Operation not permitted)
Collecting pip==20.1.1
```
AgitatedDove14 I created a new clean venv and freshly installed the clearml-agent under python / pip 3.8 and now it is working again. Still don't know what caused this issue. Thank you very much for helping!
Process failed, exit code 1
```
task ab1a90dacb9042eea8e4a6a16640d7f4 pulled from 8f06b6b160c14a3591d791c1885b309e by worker test:gpu1
Running task 'ab1a90dacb9042eea8e4a6a16640d7f4'
Storing stdout and stderr log to '/tmp/.clearml_agent_out.kbkz1n40.txt', '/tmp/.clearml_agent_out.kbkz1n40.txt'
Current configuration (clearml_agent v1.0.0, location: /tmp/.clearml_agent.3e6l7juj.cfg):
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes ...
```
Solving the replica issue now allowed me to get better insights into why the one index is red.
```json
{
  "index" : "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2021-11-09T22:30:47.018Z",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a...
```
I think Anna means that if artifacts and models are stored on the ClearML fileserver, their path will contain the IP or domain of the fileserver. If you then move the fileserver to a different host, all the URLs are broken since the host changed.
```
curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_settings' -d '{"index" : {"number_of_replicas" : 0}}'
```
This command made all my indices, except the broken one which is still red, turn green again. It comes from https://stackoverflow.com/questions/63403972/elasticsearch-index-in-red-health/63405623#63405623 .
Exactly, all agents should share the cache that is mounted via nfs. I think it is working now 🙂
SuccessfulKoala55 so you say deleting other old indices that I don't need could help?
```
docker run --gpus device=0 --rm -it nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 bash
```
worked; I could run nvidia-smi inside it and see GPU 0.
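For comparison, the corresponding agent invocation pinned to that GPU would look roughly like this (the queue name and image are assumptions carried over from earlier messages):

```bash
# Hypothetical: start the agent limited to GPU 0, mirroring the docker test above.
clearml-agent daemon --queue default --docker nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 --gpus 0
```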
Using top inside the elasticsearch container shows
```
elastic+ 20 0 17.0g 8.7g 187584 S 2.3 27.2 1:09.18 java
```
i.e. the 8g are reserved. So setting ES_JAVA_OPTS: -Xms8g -Xmx8g should work.
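The heap size can also be confirmed from Elasticsearch itself instead of top, e.g.:

```bash
# Show the configured and currently used JVM heap of the ES node
curl -s 'localhost:9200/_nodes/stats/jvm?pretty' | grep -E 'heap_(used|max)_in_bytes'
```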
```yaml
elasticsearch:
  networks:
    - backend
  container_name: clearml-elastic
  environment:
    ES_JAVA_OPTS: -Xms8g -Xmx8g
    bootstrap.memory_lock: "true"
    cluster.name: clearml
    cluster.routing.allocation.node_initial_primaries_recoveries: "500"
    cluster.routing.allocation.disk.watermark.low: 500mb
    cluster.routing.allocation.disk.watermark.high: 500mb
    cluster.routing.allocation.disk.watermark.flood_stage: 500mb
    discovery.zen.minimum_master_no...
```
Since it is a single node, I guess it will not be possible to recover or partially recover the index, right?
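If no valid shard copy exists anywhere, the usual last resort is to force-allocate an empty primary via the reroute API, explicitly accepting that the data in that index is lost; a sketch, with the node name as a placeholder:

```bash
# Last resort: allocate an empty primary for the red index (data loss for this
# index is accepted). NODE_NAME is a placeholder for the actual ES node name.
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/_cluster/reroute' -d '{
  "commands": [{
    "allocate_empty_primary": {
      "index": "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
      "shard": 0,
      "node": "NODE_NAME",
      "accept_data_loss": true
    }
  }]
}'
```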