Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
AppetizingMouse58
Moderator
0 Questions, 132 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0
0 Greetings! Could You Help Me? I’Ve Just Tried Delete Old Experiment (Year Ago) But Got The Following Error:

With what memory setting do you run ES? How much memory and cpu is currently occupied by ES container?

one year ago
0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

The data that you sent looks fine. It seems that you actually has these iterations in Elasticsearch. To check whether it is the case please run the following command in the shell on your host. You should get the first 10 task events with the smallest iterations:
curl -XGET -H "Content-Type: application/json" localhost:9200/events-training_stats_scalar*/_search?pretty -d' { "query": { "term": {"task": "d45ecb5ad7084175bd83dd39777b10c5"} }, "sort": {"iter": "asc"} }'

one year ago
0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

MassiveHippopotamus56 The data that you posted from the browser developers tool seems coming from the "Headers" tab. Can you please post the data from the "Payload" and "Response" tabs. This is in case you run in Chrome. In other browsers the tabs may have different names

one year ago
0 Hi All, I Have A

Hi CooperativeFox72 , how much free space do you have on your disk now? If you run du on your /opt/trains/data/elastic_7 folder in let's say 5 mins intervals do you see the folder size is growing?

3 years ago
0 I Keep Getting Errors When Trying To Compare A Lot Of Experiments At The Same Time (>10). What'S Evern Worse Is That Trains Start Working Much Slower In General After These Attempts, The Only Way To Fix It Is To Restart The Whole Thing. Would Getting Bett

Great! What error do you still see in UI when comparing more than 20 experiments? At the time of error do you see any error response from the apiserver (in the browser network tab)? When the call to compare of 20+ task metrics succeed how much time does it usually takes in your environment?

3 years ago
0 I Keep Getting Errors When Trying To Compare A Lot Of Experiments At The Same Time (>10). What'S Evern Worse Is That Trains Start Working Much Slower In General After These Attempts, The Only Way To Fix It Is To Restart The Whole Thing. Would Getting Bett

Hi DilapidatedDucks58 , I am trying to reproduce the "Connection is full warning". Do you override any apiserver environment variables is docker compose? If yes then can you share your version of docker-compose? Do you provide a configuration file for gunicorn? Can you please share it?

3 years ago
0 Hi, We Have A Self Hosted Clearml Server Which I Mainly Use For Experiment Tracking. There Is One Issue I Have Noticed Recently: Whenever I Archive And Delete An Experiment (With The Box " Remove All Related Artifacts And Debug Samples From Clearml File

The 1.10 version handles files deletion differently so there is chance that it fixes the issue. If you use the default apiserver port then I would try upgrading. If you override the apiserver port then please wait for the hotfix version 1.10.1 that should be released soon

one year ago
0 Hi! Our Clearml Server Keeps Crashing Because Of Some Weird Task With The

Hi @<1523707653782507520:profile|MelancholyElk85> , what version of the apiserver are you using?

3 months ago
0 Hi! Our Clearml Server Keeps Crashing Because Of Some Weird Task With The

We found the issue. It will be fixed in the upcoming patch for the open-v1.14 release

3 months ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

SubstantialBaldeagle49 This is fine. When you start docker-compose it takes different time for the services to start. Apiserver waits for the Elasticsearch to start and proceeds once it is ready. Can you reproduce the buckets issue and share the apiserver log that contains it?

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

SubstantialBaldeagle49 The log looks OK. Where do you see the error?

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

The index "events-plot-d1bd92a3b039400cbafc60a7a5b1e52b" is red meaning that it is corrupted and elastic cannot work with it. The most straightforward solution would be to delete this index but it will result in all the plots generated so far will be lost.

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

I am not sure about the reasons. What you can do is to backup your /opt/trains/data folder periodically (preferably stopping the docker compose before it). Another possibility is to configure your elasticsearch to run as a cluster with 2 or more nodes on the same or different machine. This will allow elastic to replicate your indices to other nodes.

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

What can be seen in the logs is that for some reason Elasticsearch had internal failure when trying to perform the plots query. I will send you the instruction on how to check for the health of ES nodes. It may provide us with some clues

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

SubstantialBaldeagle49 Well, I see. Elaticsearch does not support putting that large number into max_buckets. From the error message that I see in the apiserver log I am not sure that the original problem is connected to the buckets number. Can you please revert the max_bucket change, reproduce the original problem and share the elasticsearch log?

3 years ago
0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

SubstantialBaldeagle49 This should collect the logs: 'sudo docker logs trains-apiserver >& apiserver.logs'

3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Yes, it is safe to put number_of_replicas to 0 and refresh_interval to -1 for the target index before the reindex and then put them back after the reindex is finished

2 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Hi JitteryCoyote63 , are you still missing a month of data in the event logs? If you do cat indices do you see the same amount of docs in the original and the new ones?

2 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

We can compare with the table that you sent yesterday. Unless a lot of new events were written since then

2 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

This one is indeed dynamic but can be set as follows: "plot_len":{"type":"long"}

2 years ago
0 Greetings! Could You Help Me? I’Ve Just Tried Delete Old Experiment (Year Ago) But Got The Following Error:

The tasks themselves will stay until you succeed to delete them from the client. Here we tried to see why deleting their data from ES timed out. From what I see no data was actually deleted (most likely because of the previous delete efforts that actually deleted the data though caused time out in the apiserver). What seems problematic is the amount of time that each operation took (19 and 16 seconds). It may be due to insufficient memory/cpu allocation to ES container or due to the 50Gb inde...

one year ago
0 What Is The Current State Of Deleting Debug Samples? I Use S3/Minio As My Fileserver. If I Delete Tasks From The Ui, Are Debug Samples Deleted On S3? If I Run The Cleanup Service Script, Does It Debug Samples On S3?

Hi @<1523701868901961728:profile|ReassuredTiger98> , how exactly do you override the values in storage_credentials file? Do you prepare a new docker image with the changed file or map this file from outside with the volume mapping in the docker compose or through the env variables? What is also important is that you do this override for the async_delete service. It is the service that actually uses the storage credentials. Not the apiserver itself

6 months ago
0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don'T See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

Hi RotundSquirrel78 , can you please check that your docker compose file has the correct volume mapping for elasticsearch service? From the output of the upgrade script I assume it should be from /home/orpat/trains/data/elastic_7 into /usr/share/elasticsearch/data

one year ago
one year ago
Show more results compactanswers