ClumsyElephant70
Moderator
13 Questions, 89 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

70 × Eureka!
0 Votes
2 Answers
516 Views
Hi, are there other ways to add package_manager.extra_index_urls to my agents besides configuring them through the clearml.conf file?
2 years ago
0 Votes
7 Answers
642 Views
Hi, I want to pass environment variables from the host to the docker containers running my task. I managed to use extra_docker_shell_script: ["export SECRET=...
2 years ago
0 Votes
30 Answers
505 Views
Hi, I would like to understand how I can set the pip cache location for my agent, I thought that I already had the right setting with docker_internal_mounts....
2 years ago
0 Votes
20 Answers
576 Views
Hey I’m running this script and initialise the ClearML task also in this file https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/scripts...
2 years ago
0 Votes
11 Answers
547 Views
Any idea why I get this error in all my agents clearml_agent: ERROR: APIError: code 400/707: No queue is tagged as the default queue for this company
2 years ago
0 Votes
4 Answers
641 Views
Hi, are there any plans or already ways to deploy a pipeline with clearml-serving to triton? I would also be interested in the support of deploying pure pyth...
2 years ago
0 Votes
11 Answers
551 Views
Hey, is there a way to limit the number of tasks run at the same time by an agent in service mode?
2 years ago
0 Votes
7 Answers
559 Views
Hi, how can I use package_manager.force_repo_requirements_txt=true in a mono repository structure? like repo/project-a/requirements.txt , repo/project-b/requ...
2 years ago
0 Votes
30 Answers
517 Views
Hi all, I have an Elasticsearch problem on my ClearML server. The error message I get on the ClearML webapp is General data error (TransportError(503, 'searc...
2 years ago
0 Votes
2 Answers
601 Views
2 years ago
0 Votes
2 Answers
505 Views
2 years ago
0 Votes
9 Answers
562 Views
Hey, I’m getting the following error when loading a model using model.get_local_copy() … raise ValueError("Could not retrieve a local copy of model weights {...
2 years ago
0 Votes
2 Answers
630 Views
Hey, I'm trying to get the Google Cloud Platform Credentials as a .json file inside my dockerized clearML agents. I was able to copy those credentials from t...
2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

using this code in https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/scripts/pretrain.py
` from clearml import Task

# args and spawn_dist come from the surrounding script
if __name__ == "__main__":
    task = Task.init(project_name="dummy",
                     task_name="pretraining",
                     task_type=Task.TaskTypes.training,
                     reuse_last_task_id=False)

    task.connect(args)
    print('Arguments: {}'.format(args))

    # only create the task, we will actually execute it later
    task.execute_remotely()

    spawn_dist.run...
2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

I'm now running the code shown above and will let you know if there is still an issue.

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

` from clearml import Task

# args and spawn_dist come from the surrounding script
if __name__ == "__main__":
    task = Task.init(project_name="dummy",
                     task_name="pretraining",
                     task_type=Task.TaskTypes.training,
                     reuse_last_task_id=False)

    task.connect(args)
    print('Arguments: {}'.format(args))

    # only create the task, we will actually execute it later
    task.execute_remotely()

    spawn_dist.run(args) `
I added it to this script and use it as a starting point: https://github.com/facebookresearch/fastMRI/bl...
2 years ago
0 Hey All. Quick Question About The

It runs when using clearml-agent daemon --queue default --docker. In this setup I always had some issues when adding the --gpus flag.

2 years ago
0 Hey All. Quick Question About The

` 2021-05-06 13:46:34.032391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:

pciBusID: 0000:a1:00.0 name: NVIDIA Quadro RTX 8000 computeCapability: 7.5

coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 47.46GiB deviceMemoryBandwidth: 625.94GiB/s

2021-05-06 13:46:34.032496: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: ...

2 years ago
0 Hey All. Quick Question About The

tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64

2 years ago
0 Hey All. Quick Question About The

Yes, this one is running in a venv and not in docker, because I had some issues with CUDA and docker. virtualenv==20.4.6 is in the requirements.txt. I think it broke after installing clearml-serving in the same env.

2 years ago
0 Hey All. Quick Question About The

Running clearml-agent daemon --gpus 0 --queue default --docker nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 results in the GPUs not being used because of missing libs.

2 years ago
0 Hi, I Want To Pass Environment Variables From The Host To The Docker Containers Running My Task. I Managed To Use

I like this approach more but it still requires resolved environment variables inside the clearml.conf

2 years ago
0 Hi, How Can I Use

I will try it 🙂

2 years ago
0 Hi, Our Server Ip Address Has Changed, And This Breaks All The Paths To Artifacts / Datasets. Is There A Way To Fix The Old Paths So That They Can Be Accessed Again? Thank You!

SuccessfulKoala55 Hey, for us the artifact download URLs, model download URLs, images in plots and debug image URLs are broken. In the linked example I can see a solution for the debug images and potentially the plot images, but I can't find the artifact and model URLs inside ES. Are those URLs maybe stored inside MongoDB? Any idea where to find them?

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

Hey AgitatedDove14, I fixed my code issue and am now able to train on multiple GPUs using https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/fastmri/spawn_dist.py . Since I create the ClearML Task in the main thread, I now can't see any training plots and probably not the output model either. What would be the right approach? I would like to avoid using Task.current_task().upload_artifact() or manual logging. I really enjoy the automatic detection.

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

My code now produces an error inside one of the threads, but that should be an issue on my side. Still, this error inside a child thread was not detected as a failure and the training task ended up as "completed". This error now happens with Task.init inside the if __name__ == "__main__": block, as seen above in the code snippet.

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

This happens inside the agent, since I use task.execute_remotely() I guess. The agent runs on ubuntu 18.04 and not in docker mode

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

RuntimeError: stack expects each tensor to be equal size, but got [15, 640, 372, 2] at entry 0 and [15, 322, 640, 2] at entry 1
Detected an exited process, so exiting main
terminating child processes
exiting

2 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

Actually I saw that the RuntimeError: context has already been set appears when the task is initialised outside if __name__ == "__main__":
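
For reference, "context has already been set" is the message Python's standard multiprocessing module raises when set_start_method is called a second time without force=True. Whether that is exactly what happens in the script above is an assumption; the minimal sketch below only reproduces the message in isolation, and the helper name set_start_method_twice is made up for illustration.

```python
import multiprocessing as mp


def set_start_method_twice(method="spawn"):
    """Call set_start_method twice and return the second call's error message."""
    # First call fixes the global context (force=True so the sketch is re-runnable).
    mp.set_start_method(method, force=True)
    try:
        # Second call without force=True is rejected by multiprocessing.
        mp.set_start_method(method)
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(set_start_method_twice())
```

Keeping one-time setup such as Task.init and any start-method selection under the if __name__ == "__main__": guard means it is not executed again when the module is imported a second time.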

2 years ago
0 Hey All. Quick Question About The

` Process failed, exit code 1
task ab1a90dacb9042eea8e4a6a16640d7f4 pulled from 8f06b6b160c14a3591d791c1885b309e by worker test:gpu1
Running task 'ab1a90dacb9042eea8e4a6a16640d7f4'
Storing stdout and stderr log to '/tmp/.clearml_agent_out.kbkz1n40.txt', '/tmp/.clearml_agent_out.kbkz1n40.txt'
Current configuration (clearml_agent v1.0.0, location: /tmp/.clearml_agent.3e6l7juj.cfg):

sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes ...

2 years ago
0 Hey All. Quick Question About The

AgitatedDove14 I created a new clean venv and freshly installed the clearml-agent under python / pip 3.8 and now it is working again. Still don't know what caused this issue. Thank you very much for helping!

2 years ago
0 Hey All. Quick Question About The

docker run --gpus device=0 --rm -it nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 bash worked; I could run nvidia-smi in it and see GPU 0.

2 years ago
0 Hey All. Quick Question About The

The error you are citing happens when running clearml-agent daemon --gpus 0 --queue default --docker nvidia/cuda

2 years ago
0 Hey All. Quick Question About The

Hi AgitatedDove14 , I get an error when running a task on my worker. I have looked into /home/user/.clearml/venvs-builds but it is empty. Any idea why this happens? I actually don’t know what I changed to cause this issue… I’m running clearml-agent v1.0.0

clearml_agent: ERROR: Command '['python3.6', '-m', 'virtualenv', '/home/user/.clearml/venvs-builds/3.6']' returned non-zero exit status 1.

2 years ago
0 Hey All. Quick Question About The

One more thing: The dockerized version is still not working as I want it to. If I use any specific docker image like docker: nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 on a host machine with NVIDIA-SMI 465.19.01  Driver Version: 465.19.01  CUDA Version: 11.3 I always get a similar error as above where a lib is missing. If I use the example from http://clear.ml clearml-agent daemon --gpus 0 --queue default --docker nvidia/cuda I always get this error ` docker: Error...

2 years ago
0 Hi, Our Server Ip Address Has Changed, And This Breaks All The Paths To Artifacts / Datasets. Is There A Way To Fix The Old Paths So That They Can Be Accessed Again? Thank You!

I think Anna means that if artifacts and models are stored on the clearml fileserver their path will contain the IP or domain of the fileserver. If you then move the fileserver to a different host, all the urls are broken since the host changed.
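
To make the problem concrete: the host is part of every stored URL, so a fix amounts to swapping that part while leaving the path untouched. The sketch below shows only that string operation with the standard library. The addresses are made-up placeholders, replace_host is a hypothetical helper, and where ClearML actually stores these URLs is not covered here.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url, old_netloc, new_netloc):
    """Return url with old_netloc swapped for new_netloc; other URLs stay unchanged."""
    parts = urlsplit(url)
    if parts.netloc != old_netloc:
        return url
    # SplitResult is a namedtuple, so _replace builds a copy with the new host.
    return urlunsplit(parts._replace(netloc=new_netloc))


# Placeholder addresses, not taken from the thread.
old_url = "http://192.0.2.10:8081/project/task/artifacts/data.pkl"
new_url = replace_host(old_url, "192.0.2.10:8081", "files.example.com:8081")
print(new_url)
```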

2 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

I already increased the memory to 8GB after reading similar issues here on the slack

Just making sure, how exactly did you do that?

docker-compose down
` elasticsearch:
  networks:
    - backend
  container_name: clearml-elastic
  environment:
    ES_JAVA_OPTS: -Xms8g -Xmx8g `
docker-compose up -d

2 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Solving the replica issue now allowed me to get better insights into why the one index is red.
` {
"index" : "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2021-11-09T22:30:47.018Z",
"last_allocation_status" : "no_valid_shard_copy"
},
"can_allocate" : "no_valid_shard_copy",
"allocate_explanation" : "cannot allocate because a...

2 years ago
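
Output like the one quoted above can be condensed into a single line when several shards need checking. This is a minimal sketch that uses only the fields visible in the truncated post; summarize_allocation is a hypothetical helper, and the sample is a copy of those visible fields, not a full response.

```python
import json

# Fields copied from the truncated output in the post; the remainder is unknown.
SAMPLE = """
{
  "index": "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
  "shard": 0,
  "primary": true,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "CLUSTER_RECOVERED",
    "at": "2021-11-09T22:30:47.018Z",
    "last_allocation_status": "no_valid_shard_copy"
  },
  "can_allocate": "no_valid_shard_copy"
}
"""


def summarize_allocation(explain):
    """Build a one-line summary from an allocation-explain style dict."""
    kind = "primary" if explain.get("primary") else "replica"
    info = explain.get("unassigned_info", {})
    return "{} shard {} of {} is {} (reason: {}, can_allocate: {})".format(
        kind,
        explain.get("shard"),
        explain.get("index"),
        explain.get("current_state"),
        info.get("reason", "unknown"),
        explain.get("can_allocate", "unknown"),
    )


summary = summarize_allocation(json.loads(SAMPLE))
print(summary)
```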
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Did you wait for all the other indices to reach yellow status?

yes I waited until everything was yellow

2 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Using top inside the elasticsearch container shows that the 8g are reserved: ` elastic+ 20  0  17.0g  8.7g 187584 S  2.3 27.2  1:09.18 java ` So setting ES_JAVA_OPTS: -Xms8g -Xmx8g should work.
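
The check described above can be written out as a small sketch: read the resident memory (RES) from the top line and compare it with the configured heap. It assumes the pasted line has the usual top column order with the PID column missing (USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and that plain numbers are KiB; to_gib and resident_gib are hypothetical helpers.

```python
UNITS_IN_GIB = {"m": 1 / 1024, "g": 1.0, "t": 1024.0}


def to_gib(field):
    """Convert a top memory field such as '8.7g' or '187584' (KiB) to GiB."""
    suffix = field[-1].lower()
    if suffix in UNITS_IN_GIB:
        return float(field[:-1]) * UNITS_IN_GIB[suffix]
    return float(field) / (1024 * 1024)  # plain numbers are treated as KiB


def resident_gib(top_line):
    """Return the RES column of one top line, assuming the PID column is absent."""
    user, pr, ni, virt, res, shr, state, cpu, mem, time_plus, command = top_line.split()
    return to_gib(res)


line = "elastic+ 20  0  17.0g  8.7g 187584 S  2.3 27.2  1:09.18 java"
heap_gib = 8  # from ES_JAVA_OPTS: -Xms8g -Xmx8g
print(resident_gib(line) >= heap_gib)
```

A resident size at or above the configured heap is consistent with the JVM having picked up the -Xms8g setting, though it does not prove it.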

2 years ago