ClumsyElephant70
Moderator
13 Questions, 89 Answers
  Active since 10 January 2023
  Last activity 2 years ago

Reputation

0

Badges 1

70 × Eureka!
0 Votes 7 Answers 2K Views
Hi, how can I use package_manager.force_repo_requirements_txt=true in a mono repository structure? like repo/project-a/requirements.txt , repo/project-b/requ...
4 years ago
0 Votes 30 Answers 2K Views
Hi all, I have an Elasticsearch problem on my ClearML server. The error message I get on the ClearML webapp is General data error (TransportError(503, 'searc...
3 years ago
0 Votes 30 Answers 2K Views
Hi, I would like to understand how I can set the pip cache location for my agent, I thought that I already had the right setting with docker_internal_mounts....
3 years ago
0 Votes 11 Answers 2K Views
Hey, is there a way to limit the number of tasks run at the same time by an agent in service mode?
3 years ago
0 Votes 11 Answers 2K Views
Any idea why I get this error in all my agents clearml_agent: ERROR: APIError: code 400/707: No queue is tagged as the default queue for this company
4 years ago
0 Votes 7 Answers 2K Views
Hi, I want to pass environment variables from the host to the docker containers running my task. I managed to use extra_docker_shell_script: ["export SECRET=...
4 years ago
0 Votes 4 Answers 2K Views
Hi, are there any plans or already ways to deploy a pipeline with clearml-serving to triton? I would also be interested in the support of deploying pure pyth...
4 years ago
0 Votes 20 Answers 2K Views
Hey I’m running this script and initialise the ClearML task also in this file https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/scripts...
4 years ago
0 Votes 2 Answers 2K Views
Hi, are there other ways to add package_manager.extra_index_urls to my agents besides configuring them through the clearml.conf file?
3 years ago
0 Votes 2 Answers 2K Views
4 years ago
0 Votes 2 Answers 2K Views
Hey, I'm trying to get the Google Cloud Platform Credentials as a .json file inside my dockerized clearML agents. I was able to copy those credentials from t...
4 years ago
0 Votes 9 Answers 2K Views
Hey, I’m getting the following error when loading a model using model.get_local_copy() … raise ValueError("Could not retrieve a local copy of model weights {...
4 years ago
0 Votes 2 Answers 1K Views
4 years ago
3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

 so you say deleting other old indices that I don't need could help?

This did not help, I still have the same issue

3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

I will try to recover it, but anyway the lesson is to fully separate the fileserver and any output location from mongo, redis and elastic. It may also make sense to improve the ES setup so that it has replicas.

3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Since it is a single node, I guess it will not be possible to recover or partially recover the index, right?

3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Can you send some more comprehensive log - perhaps there are other messages that are related

which logs do you wish?

3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Try to restart ES and see if it helps

docker-compose down / up does not help

3 years ago
0 Hey, I’M Getting The Following Error When Loading A Model Using Model.Get_Local_Copy()

SuccessfulKoala55 I'm currently inside the docker container to recover the ckpt files. But /root/.clearml/venvs-builds seems to be empty. Any idea where I could then find the ckpt files?

4 years ago
0 Hi, Our Server Ip Address Has Changed, And This Breaks All The Paths To Artifacts / Datasets. Is There A Way To Fix The Old Paths So That They Can Be Accessed Again? Thank You!

SuccessfulKoala55 Hey, for us the artifact download URLs, model download URLs, images in plots and debug image URLs are broken. In the linked example I can see a solution for the debug images and potentially the plot images, but I can't find the artifact and model URLs inside ES. Are those URLs maybe stored inside the mongodb? Any idea where to find them?
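
Wherever the stored URLs turn out to live, the rewrite itself is a host substitution. A minimal sketch using only the standard library, assuming the old and new addresses are known; `replace_host`, the example addresses and the example path are hypothetical, and the code does not touch MongoDB or Elasticsearch:

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url: str, old_host: str, new_host: str) -> str:
    """Return url with old_host swapped for new_host, keeping port and path.

    Credentials embedded in the URL (user:pass@host) are not handled here.
    """
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url  # leave URLs that point elsewhere untouched
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    # Hypothetical addresses and path, for illustration only.
    print(replace_host("http://10.0.0.5:8081/proj/model.pkl", "10.0.0.5", "10.0.0.9"))
```

Matching on the parsed hostname rather than doing a plain string replace avoids rewriting URLs where the old address only appears inside the path.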

4 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

SuccessfulKoala55 so you say deleting other old indices that I don't need could help?

3 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

Actually I saw that the RuntimeError: context has already been set appears when the task is initialised outside if __name__ == "__main__":
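
For context, `context has already been set` is the message Python's standard `multiprocessing.set_start_method` raises when it is called a second time without `force=True`. A minimal sketch of that behaviour using only the standard library; whether this is the exact code path hit by the fastMRI script is an assumption, and `second_call_error` is a hypothetical helper name:

```python
import multiprocessing as mp


def second_call_error() -> str:
    """Set the start method twice and return the resulting error message."""
    # First call: force=True so the demo works even if a method was already set.
    mp.set_start_method("spawn", force=True)
    try:
        # Second call without force raises RuntimeError.
        mp.set_start_method("spawn")
    except RuntimeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    # One-time setup kept under this guard is not repeated when spawned
    # child processes re-import the module.
    print(second_call_error())
```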

4 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

using this code in https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/scripts/pretrain.py
` if __name__ == "__main__":

    task = Task.init(project_name="dummy",
                     task_name="pretraining",
                     task_type=Task.TaskTypes.training,
                     reuse_last_task_id=False)

    task.connect(args)
    print('Arguments: {}'.format(args))

    # only create the task, we will actually execute it later
    task.execute_remotely()

    spawn_dist.run...
4 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

My code now produces an error inside one of the threads, but that should be an issue on my side. Still, this error inside a child thread was not detected as a failure and the training task ended up as "completed". This happens with the Task.init inside the if __name__ == "__main__": block, as seen in the code snippet above.
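
One general way to make a failure in a worker thread visible to the main thread is to collect each worker's result through a future, since `Future.result()` re-raises the worker's exception. The sketch below uses only the standard library; `train_worker` and `run_workers` are hypothetical names, and it does not reflect how `spawn_dist` or ClearML handle child failures:

```python
from concurrent.futures import ThreadPoolExecutor


def train_worker(rank: int) -> int:
    """Hypothetical worker; rank 1 fails to simulate a crash."""
    if rank == 1:
        raise RuntimeError(f"worker {rank} failed")
    return rank


def run_workers(num_workers: int) -> list:
    """Run workers in threads and re-raise the first failure in the caller."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(train_worker, r) for r in range(num_workers)]
        # result() re-raises any exception raised inside the worker.
        return [f.result() for f in futures]


if __name__ == "__main__":
    try:
        run_workers(2)
    except RuntimeError as exc:
        # Exit non-zero so the surrounding process is reported as failed.
        raise SystemExit(f"training failed: {exc}")
```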

4 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

RuntimeError: stack expects each tensor to be equal size, but got [15, 640, 372, 2] at entry 0 and [15, 322, 640, 2] at entry 1
Detected an exited process, so exiting main
terminating child processes
exiting

4 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

I'm now running the code shown above and will let you know if there is still an issue.

4 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

Solving the replica issue now allowed me to get better insights into why the one index is red.
` {
  "index" : "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2021-11-09T22:30:47.018Z",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a...

3 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

Hey AgitatedDove14, I fixed my code issue and am now able to train on multiple GPUs using https://github.com/facebookresearch/fastMRI/blob/master/banding_removal/fastmri/spawn_dist.py . Since I create the ClearML Task in the main thread, I now can't see any training plots and probably not the output model either. What would be the right approach? I would like to avoid using Task.current_task().upload_artifact() or manual logging. I really enjoy the automatic detection.

4 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

, what version of clearml is your server?

The docker-compose uses clearml:latest

3 years ago
0 Hey I’M Running This Script And Initialise The Clearml Task Also In This File

` if __name__ == "__main__":

    task = Task.init(project_name="dummy",
                     task_name="pretraining",
                     task_type=Task.TaskTypes.training,
                     reuse_last_task_id=False)

    task.connect(args)
    print('Arguments: {}'.format(args))

    # only create the task, we will actually execute it later
    task.execute_remotely()

    spawn_dist.run(args) `
I added it to this script and use it as a starting point: https://github.com/facebookresearch/fastMRI/bl...
4 years ago
0 Any Idea Why I Get This Error In All My Agents

The strange thing was that my agents were running in the morning but then just disappeared from the ClearML server UI under workers-and-queues. So I did docker-compose down / up and then I got this error.

4 years ago
0 Any Idea Why I Get This Error In All My Agents

We do have a queue called office and another queue called default, so the agent is not listening for queues that are not defined. Or do I misunderstand something? The server has all queues defined that the agents are using

4 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

1. ssh into the elasticsearch container
2. identify the id of the index that seems to be broken
3. run /usr/share/elasticsearch/jdk/bin/java -cp lucene-core*.jar -ea:org.apache.lucene… org.apache.lucene.index.CheckIndex /usr/share/elasticsearch/data/nodes/0/indices/your-id/0/index/ -verbose -exorcise
This can be dangerous but is the only option if you assume that the data is lost anyway. Running step 3 either repairs broken segments or shows, as in my case, ` No problems were detected with this i...

3 years ago
0 Hi, I Would Like To Understand How I Can Set The Pip Cache Location For My Agent, I Thought That I Already Had The Right Setting With

The agents also share the clearml.conf file, which causes an issue with the worker_id/worker_name: they all want to be ubuntu:gpu0. Any idea how I can randomize it? Setting the CLEARML_WORKER_ID env var somehow does not work.

3 years ago
0 Hi

SuccessfulKoala55 can you describe what the failure behaviour will look like?

3 years ago
0 Hi All, I Have An Elasticsearch Problem On My Clearml Server. The Error Message I Get On The Clearml Webapp Is

` elasticsearch:
    networks:
      - backend
    container_name: clearml-elastic
    environment:
      ES_JAVA_OPTS: -Xms8g -Xmx8g
      bootstrap.memory_lock: "true"
      cluster.name: clearml
      cluster.routing.allocation.node_initial_primaries_recoveries: "500"
      cluster.routing.allocation.disk.watermark.low: 500mb
      cluster.routing.allocation.disk.watermark.high: 500mb
      cluster.routing.allocation.disk.watermark.flood_stage: 500mb
      discovery.zen.minimum_master_no...

3 years ago
0 Any Idea Why I Get This Error In All My Agents

docker-compose with entrypoint.sh with python3 -m clearml_agent daemon --docker "${CLEARML_AGENT_DEFAULT_BASE_DOCKER:-$TRAINS_AGENT_DEFAULT_BASE_DOCKER}" --force-current-version ${CLEARML_AGENT_EXTRA_ARGS:-$TRAINS_AGENT_EXTRA_ARGS} --queue office

4 years ago
0 Hi, I Want To Pass Environment Variables From The Host To The Docker Containers Running My Task. I Managed To Use

I like this approach more, but it still requires resolved environment variables inside the clearml.conf

4 years ago
0 Hi, I Would Like To Understand How I Can Set The Pip Cache Location For My Agent, I Thought That I Already Had The Right Setting With

So now there is a user conflict between the host and the agent inside the container

3 years ago