AppetizingMouse58

0 Questions, 132 Answers

Active since 10 January 2023

Last activity one year ago

Answers 132

0 I Keep Getting Errors When Trying To Compare A Lot Of Experiments At The Same Time (>10). What's Even Worse Is That Trains Start Working Much Slower In General After These Attempts, The Only Way To Fix It Is To Restart The Whole Thing. Would Getting Bett

Do you see any error in the browser network tab?

4 years ago

0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Yes, it is safe to set number_of_replicas to 0 and refresh_interval to -1 on the target index before the reindex, and then restore the previous values after the reindex is finished.
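A minimal sketch of how those two settings could be changed around a reindex, using the Elasticsearch update index settings API. The ES_URL value, the TARGET_INDEX placeholder, and the function names are assumptions for illustration and do not come from the thread. The restore values of 1 and 1s are the Elasticsearch 7 defaults, so substitute whatever your index had before.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable at ES_URL (placeholder default below).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for an index settings update.
# $1 = number_of_replicas, $2 = refresh_interval (for example -1 or 1s)
settings_body() {
  printf '{"index":{"number_of_replicas":%s,"refresh_interval":"%s"}}' "$1" "$2"
}

# Send the settings to one index.
# $1 = index name, $2 = number_of_replicas, $3 = refresh_interval
put_settings() {
  curl -s -XPUT "$ES_URL/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(settings_body "$2" "$3")"
}

# Usage (not executed here; TARGET_INDEX is a hypothetical name):
#   put_settings "$TARGET_INDEX" 0 -1    # before the reindex
#   ... run the reindex ...
#   put_settings "$TARGET_INDEX" 1 1s    # afterwards, restore your original values
```

Disabling replicas and refresh only speeds up the bulk copy. The index is not searchable in near real time until refresh_interval is restored.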

3 years ago

0 Hi Everyone, I Am Just Wondering Whether The Bugs Regarding The Deletion Of Tasks Is Fixed In The Current Version? E.G. This Happening When You Want To Delete A Lot Of Tasks.

Hi @ReassuredTiger98, what version of the apiserver are you using?

one year ago

0 Hi! I'm Having Some Problems, Could You Help Me? I Have Been Working With Version 0.15.0 Of Trains-Server For A Month, But Yesterday I Stopped Accessing Logs. When I Tried To Go To The Project /Task/Results/Scalars, I Got The Error: "Error 100 : General D

IdealPanda97 Ok, I see. Can you please run the following command, then restart the docker-compose and see if it makes any difference?
sudo chown -R 1000:1000 /opt/trains

4 years ago

0 Hello! We Are Trying To Upgrade From Trains Server 15.1 To 16.1 Using Docker, But Are Running Into A Permission Error:

Can you run 'ls -al' in the /opt/trains/data folder and also in the /opt/trains/data/elastic_7 folder and send the output?

4 years ago

0 Hi Everyone, I Am Just Wondering Whether The Bugs Regarding The Deletion Of Tasks Is Fixed In The Current Version? E.G. This Happening When You Want To Delete A Lot Of Tasks.

@ReassuredTiger98 Strange :( In 1.10 we already had the code for clearing the ES scrolls created during task deletion. I would recommend upgrading to the latest release, v1.12.1, anyway. In addition, you can instruct ES to allow more open scrolls. By default the limit is 500.
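The snippet for raising the scroll limit is not preserved in this listing. As a hedged sketch, the limit can be raised through the Elasticsearch cluster settings API using the search.max_open_scroll_context setting. The ES_URL default, the function names, and the value 1024 are illustrative assumptions, not the values from the original answer.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable at ES_URL (placeholder default below).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the cluster settings body that raises the open scroll context limit.
# $1 = new limit (1024 in the usage line is only an example value)
scroll_limit_body() {
  printf '{"persistent":{"search.max_open_scroll_context":%s}}' "$1"
}

# Apply the new limit as a persistent cluster setting.
raise_scroll_limit() {
  curl -s -XPUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(scroll_limit_body "$1")"
}

# Usage (not executed here):
#   raise_scroll_limit 1024
```

A persistent setting survives a restart of the Elasticsearch container, so it only needs to be applied once.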

one year ago

0 Hello! We Are Trying To Upgrade From Trains Server 15.1 To 16.1 Using Docker, But Are Running Into A Permission Error:

If you run the following command 'sudo chown -R 1000:1000 /opt/trains' does it change anything?

4 years ago

0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Hi JitteryCoyote63, are you still missing a month of data in the event logs? If you list the indices with _cat/indices, do you see the same number of docs in the original indices and the new ones?
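One possible way to compare the document counts is the _cat/indices API, restricted to the index name and docs.count columns. This is a sketch under assumptions: the ES_URL default, the function names, and the 'events-*' pattern are illustrative, so adjust the pattern to the indices you actually reindexed.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable at ES_URL (placeholder default below).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a _cat/indices URL that shows only the index name and document count.
# $1 = index name or wildcard pattern
docs_count_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count' "$ES_URL" "$1"
}

# Print the document counts for all indices matching the pattern.
show_doc_counts() {
  curl -s "$(docs_count_url "$1")"
}

# Usage (not executed here; the pattern is a hypothetical example):
#   show_doc_counts 'events-*'
```

If an original index and its reindexed copy report the same docs.count, the copy is most likely complete.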

3 years ago

Can you share all the error info that you get in the network tab?

4 years ago

Hi DilapidatedDucks58, I am trying to reproduce the "Connection is full warning". Do you override any apiserver environment variables in docker compose? If so, can you share your version of docker-compose? Do you provide a configuration file for gunicorn? Can you please share it?

4 years ago

0 Hi, I Have A Problem After Updating Clearml-Server To The Most Recent Version. Elasticsearch Has Been Updated From

This explains the issue, I think. The recovery path would be as follows:
1. Bring down the running containers.
2. Restore both the mongo and elastic data from the backup.
3. Run the old version docker containers and make sure that all the data is there.
4. Bring down the containers.
5. Run the upgrade script.
6. Start the new version.

2 years ago

0 Hi, I Have A Problem After Updating Clearml-Server To The Most Recent Version. Elasticsearch Has Been Updated From

At some point we switched from MongoDB v3.6 to v4.4. Upgrading from old versions requires a migration of the mongo data. Did you run the upgrade script as described below? Were there any errors?
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_mongo44_migration/

2 years ago

0 Hi, I'm Getting This Long Error When Running

SubstantialElk6 Both of the red indices are not critical for ClearML to function and can be deleted like this:
curl -XDELETE ' '
curl -XDELETE ' '
To analyze the possible reasons that led to this, can you please collect the full ES logs into a file and send it here?
sudo docker logs clearml-elastic > log.txt 2>&1

3 years ago

0 Hi, I Have A Problem After Updating Clearml-Server To The Most Recent Version. Elasticsearch Has Been Updated From

Hi SoggyBeetle95, from what version of clearml did you upgrade? About the tasks that disappeared: do you not see these tasks at all, or do you see them with no results?

2 years ago

0 Hi, I'm Getting This Long Error When Running

Hi SubstantialElk6, another thing that can be checked is the health of the individual ES indices. Can you please run the command below in the clearml-elastic container and post the results here?
curl -XGET

3 years ago

0 Hi Everyone! I'm Using Minios3 As A File Server And As Default Output Uri. I've Faced The Following Problem. When I Delete Tasks From Web Ui (And Also From Archive) Their Artifacts Didn't Get Deleted From S3. I'm Using Self Hosted Clearml==1.11. What Shou

@FantasticDuck7 The best would be to copy this file to the host, edit it and map this file into the container instead of the original one. The single file mapping in the docker-compose file should look like this:

    volumes:
      - type: bind
        source: <the path to the config file on the host>
        target: /opt/clearml/apiserver/config/default/services/storage_credentials.conf

You should do it for the async_delete service. Not for the apise...

one year ago

🙂

one year ago

@FantasticDuck7 What volume mappings do you have for the async_delete service in the docker-compose.yaml file?

one year ago

Ok, so there is no mapping for the whole config folder or for the specific config file that you changed. That's why async_delete does not get your updated configuration. You can do one of the following: either add a mapping here for the specific file, like you did earlier, or map the whole config folder as the apiserver service does:

/opt/clearml/config:/opt/clearml/config
The second way is probably more flexible.

one year ago

0 Hi Guys, I Keep Receiving A Timeout Error:

Hi VexedPeacock35, I suspect that Elasticsearch is overloaded and periodically times out when recording events. How much memory and CPU is it using? Can you increase the memory that is allocated to it and see whether this helps?
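As a rough sketch, the heap, RAM and CPU usage of the Elasticsearch node can be read from the _cat/nodes API. The ES_URL default and the function names are assumptions for illustration. How the heap size is configured depends on your docker-compose file, so it is not shown here.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable at ES_URL (placeholder default below).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a _cat/nodes URL limited to the resource usage columns.
node_usage_url() {
  printf '%s/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu' "$ES_URL"
}

# Print heap, RAM and CPU usage for every node in the cluster.
show_node_usage() {
  curl -s "$(node_usage_url)"
}

# Usage (not executed here):
#   show_node_usage
```

A heap.percent value that stays close to 100 would be consistent with Elasticsearch needing more memory.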

2 years ago

0 Hi All, I’m Running Experiments Using Clearml. The Training Is Very Slow, And I’m Getting The Following Errors And Warnings:

Actually the task logs will be lost. The tasks themselves and their reported metrics and plots would stay. The command is the following:
curl -XDELETE localhost:9200/events-log-d1bd92a3b039400cbafc60a7a5b1e52b

2 years ago

0 Hi All, I’m Running Experiments Using Clearml. The Training Is Very Slow, And I’m Getting The Following Errors And Warnings:

Are you running them on the computer that hosts the server docker containers? What is the port binding for elasticsearch in your docker compose?

2 years ago

0 Hi! I Have Some Problems With Data Migration Process. My Error Log In The Attached Files.

Hi IdealPanda97, can you share the logs for the 'elastic-upgrade-7' docker container? According to the upgrade log there was some problem with Elasticsearch while copying the indices.

4 years ago

0 Hi! I Have Some Problems With Data Migration Process. My Error Log In The Attached Files.

Yes, the command would be like this: curl -XDELETE "http://localhost:9200/queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2020-08"
If you decide to delete the "red" indices then you can proceed with the command above issuing it for each problematic index. The queue metrics index is not very important but the second one "events-logs" contains all the log messages produced by your tasks in August. You will still have debug images and scalar metrics reported by these tasks but the log messages ...
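If there are several red indices, a sketch like the following could list them with the _cat/indices health filter and print one delete command per index for review before anything is removed. The ES_URL default and the function names are assumptions. The sketch deliberately prints the commands rather than running them, because deleting an index cannot be undone.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable at ES_URL (placeholder default below).
ES_URL="${ES_URL:-http://localhost:9200}"

# List the names of all indices whose health is red, one per line.
list_red_indices() {
  curl -s "$ES_URL/_cat/indices?health=red&h=index"
}

# Read index names from stdin and print a delete command for each one.
# Nothing is deleted here; the output is meant to be reviewed first.
print_delete_cmds() {
  while read -r index; do
    if [ -n "$index" ]; then
      printf 'curl -XDELETE "%s/%s"\n' "$ES_URL" "$index"
    fi
  done
  return 0
}

# Usage (not executed here):
#   list_red_indices | print_delete_cmds
```

Once the printed list matches the indices you expect to lose, the commands can be run one at a time.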

4 years ago

0 Hi! I Have Some Problems With Data Migration Process. My Error Log In The Attached Files.

If it returns an OK result then rerun the upgrade process.

4 years ago

0 Hey There Have The Following Issue After Upgrading Server And Trains To 0.16:

Setting up an elastic cluster requires some devops. You can search for "setup elasticsearch 7 cluster" on the internet and find some tutorials. Stopping your docker-compose periodically and backing up the /opt/trains/data folder is more straightforward, and it also backs up the data that we store in mongodb.

4 years ago

0 Hi! I Have Some Problems With Data Migration Process. My Error Log In The Attached Files.

Here is a thread that solves the same issue: https://allegroai-trains.slack.com/archives/CTK20V944/p1596724607016500

4 years ago

0 Hi All, I’m Running Experiments Using Clearml. The Training Is Very Slow, And I’m Getting The Following Errors And Warnings:

Hi RattyFish27, it seems that there is some issue with the Elasticsearch cluster. Can you please run the following commands on the server and paste their output here?
curl -XGET
curl -XGET

2 years ago

0 Hi! I Have Some Problems With Data Migration Process. My Error Log In The Attached Files.

Sorry, I did not write it properly. You need to run the following curl command from the command line:
curl -XPOST 'http://localhost:9200/_xpack/license/start_basic'

4 years ago

0 Hi All, I’m Running Experiments Using Clearml. The Training Is Very Slow, And I’m Getting The Following Errors And Warnings:

It seems that the index events-log-d1bd92a3b039400cbafc60a7a5b1e52b got corrupted. If there are no backups, the only choice would be to delete this index from elasticsearch.

2 years ago
