AppetizingMouse58

0 Questions, 132 Answers

Active since 10 January 2023

Last activity 2 years ago

Reputation

Answers 132

0 Hi, We Have A Self Hosted Clearml Server Which I Mainly Use For Experiment Tracking. There Is One Issue I Have Noticed Recently: Whenever I Archive And Delete An Experiment (With The Box " Remove All Related Artifacts And Debug Samples From Clearml File

As long as you delete only from the deleted tasks folders it should be OK

2 years ago

0 We're Running Into Errors Such As This:

Hi UnevenDolphin73, how many artifacts do you have on this task? We store task metadata in Mongo, and there is a limit of 16 MB per single document. While the artifact itself is not stored under the task, some metadata (notably the uri and display_data/preview) is stored for each artifact.

2 years ago

0 Hi There. I Just Deployed Clearml Server To Aws Ec2 Instance. After Deployment Following Instructions, I Got Below Errors From

Hi @<1686547380465307648:profile|StrongSeaturtle89> , please put the following setting in the docker-compose.yaml under elasticsearch->environment:

ingest.geoip.downloader.enabled: false

And then restart the docker compose. Does it help?

one year ago

The 1.10 version handles file deletion differently, so there is a chance that it fixes the issue. If you use the default apiserver port, then I would try upgrading. If you override the apiserver port, then please wait for the hotfix version 1.10.1, which should be released soon.

2 years ago

0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don't See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

Hi RotundSquirrel78 , can you please check that your docker compose file has the correct volume mapping for elasticsearch service? From the output of the upgrade script I assume it should be from /home/orpat/trains/data/elastic_7 into /usr/share/elasticsearch/data

2 years ago

0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don't See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

Ok, I see. And if you run a new experiment in the new version do you see its logs?

2 years ago

0 Hi, I Am Having Problem With Clearml Running On Our Private Server. This Error Occurred On Older Version On Clearml And Server. Now After Update And Purge Of All Old Database With

The index events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b has status red, meaning that the data for this index is corrupted. Since there are no replicas, the only feasible option is to delete this index. All the training scalar events for the old tasks would then be lost, but newly created tasks should start working fine.
curl -XDELETE

2 years ago

0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don't See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

According to the sizes the data is there and ES sees it.

2 years ago

0 Hi All, I Like To Upgrade

🙂

4 years ago

0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

Hi MassiveHippopotamus56
Can you please open the browser developer tools, navigate to the scalars tab for one of the experiments that shows the wrong iteration, and copy here the request payload and response for the events.scalar_metrics_iter_histogram call?

2 years ago

0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

Please run these commands and see if you have any "red" statuses in the output:
curl "http://localhost:9200/_cluster/health?pretty"
curl "http://localhost:9200/_cluster/health?level=indices&pretty"
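If the per-index output is long, it may help to print only the red index names. This is a minimal sketch, assuming Elasticsearch listens on localhost:9200 as in the commands above; the `red_only` helper name is made up for illustration, and it relies on the default `_cat/indices` column order (health, status, index, ...).

```shell
# Hypothetical helper: print the names of indices whose health is "red".
# Expects `_cat/indices` style lines on stdin: health status index ...
red_only() {
  awk '$1 == "red" { print $3 }'
}

# Usage against a live server (assumes ES on localhost:9200):
#   curl -s "http://localhost:9200/_cat/indices" | red_only
```

An empty result would suggest that no index is red and the problem is elsewhere.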

4 years ago

0 Hi, I Am Having Problem With Clearml Running On Our Private Server. This Error Occurred On Older Version On Clearml And Server. Now After Update And Purge Of All Old Database With

Ok, I see. Then you can enter the apiserver container:
sudo docker exec -it clearml-apiserver /bin/bash
And run the following commands inside the container:
curl -XGET curl -XGET

2 years ago

0 Hi! Our Clearml Server Keeps Crashing Because Of Some Weird Task With The

Hi @<1523707653782507520:profile|MelancholyElk85> , what version of the apiserver are you using?

one year ago

0 Hi! Our Clearml Server Keeps Crashing Because Of Some Weird Task With The

We found the issue. It will be fixed in the upcoming patch for the open-v1.14 release

one year ago

0 Hi Guys, I Keep Receiving A Timeout Error:

Hi VexedPeacock35, I suspect that Elasticsearch is overloaded and periodically times out when recording events. How much memory and CPU is it using? Can you increase the memory allocated to it and see whether this helps?

2 years ago

0 Hello! We Are Trying To Upgrade From Trains Server 15.1 To 16.1 Using Docker, But Are Running Into A Permission Error:

If you run the following command 'sudo chown -R 1000:1000 /opt/trains' does it change anything?

4 years ago

0 Hi Everyone, I Am Just Wondering Whether The Bugs Regarding The Deletion Of Tasks Is Fixed In The Current Version? E.G. This Happening When You Want To Delete A Lot Of Tasks.

Hi @<1523701868901961728:profile|ReassuredTiger98> , what version of the apiserver are you using?

one year ago

0 Hi All, I Am Creating Sub Project, For Experiment, But It Seems There Is

Hi QuaintJellyfish58 in the latest data that you sent I see only the responses (some of them are marked as payloads but they are actually responses). What would be very interesting is to see the requests (payloads) that resulted in the following empty responses:
# response
{"meta":{"id":"aaaffe49ace64f1a8b0211925afcfd32","trx":"aaaffe49ace64f1a8b0211925afcfd32","endpoint":{"name":"projects.get_all_ex","requested_version":"2.20","actual_version":"1.0"},"result_code":200,"result_subcode":0,...

2 years ago

0 Hi All, I Am Creating Sub Project, For Experiment, But It Seems There Is

Thanks, I think that I see the problem,

2 years ago

0 Hi All, I Have A

Hi CooperativeFox72, how much free space do you have on your disk now? If you run du on your /opt/trains/data/elastic_7 folder at, say, 5-minute intervals, do you see the folder size growing?
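The periodic du check could be scripted roughly like this. It is only a sketch: the `watch_size` helper name and its arguments are made up for illustration, and reading the folder may require sudo on your host.

```shell
# Hypothetical helper: print a timestamp and the folder size (in KB)
# COUNT times, sleeping INTERVAL seconds between samples.
watch_size() {
  dir="$1"; interval="$2"; count="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s\t%s\n' "$(date +%H:%M:%S)" "$(du -sk "$dir" | cut -f1)"
    i=$((i + 1))
    # Do not sleep after the last sample
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
  done
}

# Example: three samples, 5 minutes apart
#   watch_size /opt/trains/data/elastic_7 300 3
```

If the second column keeps increasing between samples, the folder is still growing.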

4 years ago

0 Hi All, I Am Creating Sub Project, For Experiment, But It Seems There Is

Hi QuaintJellyfish58, thanks for the feedback. I am trying to compare what you send and receive for the team view with what you get in the My work view. Can you please also send the data for the same requests and responses in the My work view, structured the same way as what you sent for the team view?

2 years ago

Hi @<1558986867771183104:profile|ShakyKangaroo32> , can you please share the logs from the async_delete docker container?

2 years ago

0 Hi Everyone, I Am Just Wondering Whether The Bugs Regarding The Deletion Of Tasks Is Fixed In The Current Version? E.G. This Happening When You Want To Delete A Lot Of Tasks.

@<1523701868901961728:profile|ReassuredTiger98> Strange :( In 1.10 we already had the code for clearing ES scrolls created during task deletion. I would recommend upgrading to the latest release, v1.12.1, anyway. In addition, you can instruct ES to allow more open scrolls like below. By default it is limited to 500.
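One way to raise the limit is the Elasticsearch cluster setting `search.max_open_scroll_context`. The following is a sketch only, not necessarily the exact command from the original answer: the value 1000, the localhost:9200 address, and the `scroll_limit_payload` helper name are all assumptions for illustration.

```shell
# Hypothetical helper: build the JSON body that sets the open-scroll limit.
scroll_limit_payload() {
  printf '{"persistent":{"search.max_open_scroll_context":%d}}' "$1"
}

# Apply it to a live cluster (assumes ES on localhost:9200; 1000 is an example value):
#   curl -s -XPUT "http://localhost:9200/_cluster/settings" \
#     -H 'Content-Type: application/json' \
#     -d "$(scroll_limit_payload 1000)"
```

Using "persistent" keeps the setting across cluster restarts; "transient" could be used instead for a temporary change.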

one year ago

0 Hi All, I Like To Upgrade

No, there was a problem with that particular version migration. Creating the temporary index allowed this and all subsequent migrations to run successfully. So your DB is now properly aligned with the latest ClearML, and future upgrades should work fine.

4 years ago

0 Roll Call! Who Else Is Here?

🕶️

4 years ago

0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

The index "events-plot-d1bd92a3b039400cbafc60a7a5b1e52b" is red, meaning that it is corrupted and Elasticsearch cannot work with it. The most straightforward solution would be to delete this index, but then all the plots generated so far will be lost.
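A sketch of what the deletion could look like, assuming Elasticsearch is reachable on localhost:9200 (overridable through a hypothetical `ES_URL` variable). The `delete_index_cmd` helper is made up for illustration and only prints the command, so it can be reviewed before anything is run, since the deletion cannot be undone.

```shell
# Assumed Elasticsearch address; override ES_URL if yours differs.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: print (not run) the delete command for an index.
delete_index_cmd() {
  printf 'curl -XDELETE "%s/%s"\n' "$ES_URL" "$1"
}

delete_index_cmd "events-plot-d1bd92a3b039400cbafc60a7a5b1e52b"
```

Copy the printed line into the shell only after confirming the index name matches the red one.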

4 years ago

0 Trying To Enqueue A Task Through The Ui, Getting This Error - What Could It Be? (Running On Aws, On The Official Trains Ami)

Hi Elior, chances are that you do not have enough space for Elasticsearch on your storage. Please check the ES logs and increase the available disk space.

4 years ago

0 Hi! I Have Problem With Login To Trains. We Have Created Users That Until Yesterday Have No Problem To Access App, But Now It Throws Invalid User/Password Combination For Everyone. I Have Checked Apiserver Configuration And Everything Looks Ok. Do You Kno

Hi ImmenseMole52, did you make any changes in the docker compose file? If yes, can you please send your version of the file?

4 years ago

Can you try deleting the application cookie? While on the trains page, open the browser devtools, navigate to Application->Cookies, and delete any trains cookies listed there. I believe that you will need to log in again after that.

4 years ago

0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don't See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

Yes exactly, can you please verify that you use /home/orpat/trains/data/elastic_7 in the docker compose of 1.5?

2 years ago
