AppetizingMouse58

0 Questions, 132 Answers

Active since 10 January 2023

Last activity 2 years ago

0 Hi! I'm Having Some Problems, Could You Help Me? I Have Been Working With Version 0.15.0 Of Trains-Server For A Month, But Yesterday I Stopped Accessing Logs. When I Tried To Go To The Project /Task/Results/Scalars, I Got The Error: "Error 100 : General D

IdealPanda97 Ok, I see. Can you please run the following command, then restart the docker-compose and see if it makes any difference?
sudo chown -R 1000:1000 /opt/trains

4 years ago

0 Quick Question: Why Does Clearml-Server 1.15.0 Api-Server Python Package Require Es 8.12.0 But The Docker-Compose References Es 7.17.18?

@<1523701066867150848:profile|JitteryCoyote63> The requirements list the client library that the apiserver uses to access Elasticsearch. This library is capable of working with both Elasticsearch 7 and 8.

12 months ago

0 Hi! I Have Problem With Login To Trains. We Have Created Users That Until Yesterday Have No Problem To Access App, But Now It Throws Invalid User/Password Combination For Everyone. I Have Checked Apiserver Configuration And Everything Looks Ok. Do You Kno

Oh, I see. Then maybe we can see some more info in the browser dev tools

4 years ago

0 I Keep Getting Errors When Trying To Compare A Lot Of Experiments At The Same Time (>10). What's Even Worse Is That Trains Start Working Much Slower In General After These Attempts, The Only Way To Fix It Is To Restart The Whole Thing. Would Getting Bett

Great! What error do you still see in the UI when comparing more than 20 experiments? At the time of the error, do you see any error response from the apiserver (in the browser network tab)? When the call to compare 20+ task metrics succeeds, how much time does it usually take in your environment?

4 years ago

0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

The data that you sent looks fine. It seems that you actually have these iterations in Elasticsearch. To check whether this is the case, please run the following command in the shell on your host. You should get the first 10 task events with the smallest iterations:
curl -XGET -H "Content-Type: application/json" "localhost:9200/events-training_stats_scalar*/_search?pretty" -d' { "query": { "term": {"task": "d45ecb5ad7084175bd83dd39777b10c5"} }, "sort": {"iter": "asc"} }'
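
If the same check has to be repeated for several tasks, the command above can be wrapped roughly like this. This is only a sketch: the ES_URL variable and the function names events_query and first_events are made up for illustration (they are not part of ClearML or Elasticsearch), and it assumes bash, curl, and Elasticsearch reachable on localhost:9200.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-localhost:9200}"

# Hypothetical helper: build the search body for one task,
# sorted by iteration in ascending order.
events_query() {
  printf '{"query": {"term": {"task": "%s"}}, "sort": {"iter": "asc"}}' "$1"
}

# Hypothetical helper: send the query. Quoting the URL keeps the shell
# from expanding the * and ? characters before curl sees them.
first_events() {
  curl -s -XGET -H "Content-Type: application/json" \
    "${ES_URL}/events-training_stats_scalar*/_search?pretty" \
    -d "$(events_query "$1")"
}
```

Calling first_events d45ecb5ad7084175bd83dd39777b10c5 should then send the same request as the one-liner. Elasticsearch returns 10 hits by default, which is where the "first 10 task events" come from.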

2 years ago

0 Hi Everyone! I'm Using Minios3 As A File Server And As Default Output Uri. I've Faced The Following Problem. When I Delete Tasks From Web Ui (And Also From Archive) Their Artifacts Didn't Get Deleted From S3. I'm Using Self Hosted Clearml==1.11. What Shou

Ok, so there is no mapping for the whole config folder or for the specific config file that you changed. That's why async_delete does not get your updated configuration. You can do one of the following: either add a mapping here for the specific file, as you did earlier, or map the whole config folder as the apiserver service does:

/opt/clearml/config:/opt/clearml/config
The second way is probably more flexible

one year ago

0 What Is The Current State Of Deleting Debug Samples? I Use S3/Minio As My Fileserver. If I Delete Tasks From The Ui, Are Debug Samples Deleted On S3? If I Run The Cleanup Service Script, Does It Debug Samples On S3?

Hi @<1523701868901961728:profile|ReassuredTiger98> , how exactly do you override the values in the storage_credentials file? Do you prepare a new docker image with the changed file, map this file from outside with a volume mapping in the docker compose, or use env variables? It is also important that you do this override for the async_delete service. It is the service that actually uses the storage credentials, not the apiserver itself.

one year ago

0 Greetings! Could You Help Me? I’ve Just Tried Delete Old Experiment (Year Ago) But Got The Following Error:

Hi ResponsiveCamel97 , the shards and indices stats look fine. Can you please try the async delete of the task data? You can run the following line in the shell inside the apiserver container. Just replace <task_id> with your actual task id
curl -XPOST -H "Content-Type: application/json" " " -d'{"query": {"term": {"task": "<task_id>"}}}'
You should get in response something like this:
{"task":"p6350SG7STmQALxH-E3CLg:1426125"}
Then you can periodically ping ES on the status of the r...

2 years ago

@<1585078752969232384:profile|FantasticDuck7> What volume mappings do you have for the async_delete service in the docker-compose.yaml file?

one year ago

0 Hello! We Are Trying To Upgrade From Trains Server 15.1 To 16.1 Using Docker, But Are Running Into A Permission Error:

If you run the following command 'sudo chown -R 1000:1000 /opt/trains' does it change anything?

4 years ago

0 Hi All, I’m Running Experiments Using Clearml. The Training Is Very Slow, And I’m Getting The Following Errors And Warnings:

Are you running them on the computer that hosts the server docker containers? What is the port binding for elasticsearch in your docker compose?

2 years ago

🙂

one year ago

@<1585078752969232384:profile|FantasticDuck7> The best would be to copy this file to the host, edit it and map this file into the container instead of the original one. The single file mapping in the docker-compose file should look like this:

    volumes:
      - type: bind
        source: <the path to the config file on the host>
        target: /opt/clearml/apiserver/config/default/services/storage_credentials.conf

You should do it for the async_delete service. Not for the apise...

one year ago

0 Got Some Errors While Running Migration Script From Es5 To Es7:

Hi H4dr1en, there is a chance that the problem is in the parallel reindexing of data. You can try to replace
parallel=max(docker_resources.cpus // 2, 1)
at line 190 with
parallel=1
I think you will need to remove the /opt/trains/data/elastic_7 folder before restarting the script.

4 years ago

0 Hey All, I'm Running A Self Hosted K8S Cluster With Clearml Server Installed Using Helm Chart Clearml-7.2.0, And Saving My Artifacts In Self Hosted S3 Bucket. I'm Able To Upload My Artifacts Just Fine, But I Want To Be Able To Delete Those Artifacts When

Hi @<1673863788857659392:profile|HomelyRabbit25> , yes, it should include support for the async_delete service. Please provide the storage_credentials configuration to this service instead of the apiserver. To see whether the deletion works or has any issues with the provided configuration, please inspect the logs from the async_delete pod.

one year ago

0 Hi, I Successfully Upgraded Trains To Clearml, But Now I Don't See The Projects That I Had When I Was Using Trains. The Upgrade Log Is Attached. I Am Using My Own Server In My Host.

Can you please run the following in the command line of the hosting server and share the results?
curl -XGET

2 years ago

Hi DilapidatedDucks58 , I am trying to reproduce the "Connection is full warning". Do you override any apiserver environment variables in docker compose? If yes, can you share your version of docker-compose? Do you provide a configuration file for gunicorn? Can you please share it?

4 years ago

0 Roll Call! Who Else Is Here?

🕶️

3 years ago

0 Got Some Errors While Running Migration Script From Es5 To Es7:

Enjoy the new version:) Would still be interesting to see what caused ES7 to stop responding.

4 years ago

0 Hi All, I Am Creating Sub Project, For Experiment, But It Seems There Is

Hi QuaintJellyfish58 in the latest data that you sent I see only the responses (some of them are marked as payloads but they are actually responses). What would be very interesting is to see the requests (payloads) that resulted in the following empty responses:
# response
{"meta":{"id":"aaaffe49ace64f1a8b0211925afcfd32","trx":"aaaffe49ace64f1a8b0211925afcfd32","endpoint":{"name":"projects.get_all_ex","requested_version":"2.20","actual_version":"1.0"},"result_code":200,"result_subcode":0,...

2 years ago

0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

This one is indeed dynamic but can be set as follows: "plot_len":{"type":"long"}

3 years ago

0 Trying To Enqueue A Task Through The Ui, Getting This Error - What Could It Be? (Running On Aws, On The Official Trains Ami)

Hi Elior, chances are that you do not have enough space for Elasticsearch on your storage. Please check the ES logs and increase the available disk space.

4 years ago

0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Hi JitteryCoyote63 , are you still missing a month of data in the event logs? If you run cat indices, do you see the same number of docs in the original indices and the new ones?
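
One possible way to compare the document counts is the Elasticsearch _cat/indices API. The sketch below assumes that Elasticsearch answers on localhost:9200 and that the event indices match the events-* pattern; both are assumptions, so adjust them to your setup. ES_URL and the function names are hypothetical helpers introduced only for this example.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-localhost:9200}"

# Hypothetical helper: build the _cat/indices URL for an index pattern,
# asking only for the index name and its document count (with a header row).
cat_indices_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count' "$ES_URL" "$1"
}

# Hypothetical helper: print one line per matching index.
show_doc_counts() {
  curl -s -XGET "$(cat_indices_url "$1")"
}
```

Running show_doc_counts 'events-*' should list each matching index with its docs.count, so the original and the new indices can be compared side by side.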

3 years ago

0 Hi! Our Clearml Server Keeps Crashing Because Of Some Weird Task With The

Hi @<1523707653782507520:profile|MelancholyElk85> , what version of the apiserver are you using?

one year ago

0 Got Some Errors While Running Migration Script From Es5 To Es7:

Yes, correct.

4 years ago

0 Hi! How Can I Delete Dataset From Ui And From S3 Bucket? I Tried To Delete From Ui And Then Checking But I Still Have It ... Api Client Doesn't Have Methods To Work With Datasets ...

Hi @<1523701457835003904:profile|AbruptHedgehog21> can you please share the logs from the async_delete service? It is responsible for the actual deletion of the data

one year ago

0 Hi Everyone, I Have Two Issues With New Clearml-Server (1.14):

Hi @<1523701260895653888:profile|QuaintJellyfish58> , we are in the final stages of preparing the hotfix version open-v1.14.1. It should be released this week

one year ago

0 Hi, I'm Getting This Long Error When Running

Hi SubstantialElk6 , another thing that can be checked is the health of the particular ES indices. Can you please run the below command in the clearml-elastic container and post the results here?
curl -XGET

3 years ago

0 Hi All, I Like To Upgrade

No, there was a problem with the particular version migration. The temporary index creation allowed this and all subsequent migrations to run successfully. So for now your DB is properly aligned with the latest ClearML, and future upgrades should work fine.

4 years ago

0 Hi Everyone, I Am Just Wondering Whether The Bugs Regarding The Deletion Of Tasks Is Fixed In The Current Version? E.G. This Happening When You Want To Delete A Lot Of Tasks.

curl -X PUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d'{"persistent" : {"search.max_open_scroll_context": 1000}}'
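
To confirm that the setting was applied, the cluster settings can be read back afterwards. A minimal sketch, assuming Elasticsearch on localhost:9200 as in the command above; ES_URL and the function names are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-localhost:9200}"

# Hypothetical helper: build the persistent settings body for a given limit.
scroll_settings_body() {
  printf '{"persistent": {"search.max_open_scroll_context": %s}}' "$1"
}

# Hypothetical helper: apply the limit (same request as the one-liner above).
set_max_scroll_contexts() {
  curl -s -X PUT "${ES_URL}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(scroll_settings_body "$1")"
}

# Hypothetical helper: read the current cluster settings back.
show_cluster_settings() {
  curl -s -X GET "${ES_URL}/_cluster/settings?pretty"
}
```

After set_max_scroll_contexts 1000, the output of show_cluster_settings should contain the new value under the persistent section.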

one year ago
