Answered
Hi All, I Like To Upgrade

Hi all,
I would like to upgrade trains-server:0.16.1 to clearml-server:0.17.
According to https://github.com/allegroai/clearml-server#upgrading- the process looks the same as a regular trains-server upgrade,
but when I compare the paths of trains-server and clearml-server I can see they aren't the same, for example:
/opt/trains/data vs. /opt/clearml/data
Do I need to move the files? If yes, which files?
Do you have docs on moving from trains-server to clearml-server?

Thanks

  
  
Posted 3 years ago

Answers 31


trying again 🤞

  
  
Posted 3 years ago

image

  
  
Posted 3 years ago

🙂

  
  
Posted 3 years ago

OK, cool, let us know 🙂

  
  
Posted 3 years ago

What does it mean? 😁

  
  
Posted 3 years ago

Can you share the apiserver logs? Use docker logs clearml-apiserver

  
  
Posted 3 years ago

OK, yes, that clears it up 🙂

  
  
Posted 3 years ago

Are you sure you previously had 0.16.1? From the log it seems you either had an empty database or that you had a Trains Server <0.14.0

  
  
Posted 3 years ago

Thanks CooperativeFox72 , looking into it

  
  
Posted 3 years ago

Hi SuccessfulKoala55 ,
I brought the server down:
[ec2-user@ip-172-31-26-41 ~]$ sudo docker-compose -f /opt/clearml/docker-compose.yml down
WARNING: The CLEARML_HOST_IP variable is not set. Defaulting to a blank string.
WARNING: The CLEARML_AGENT_GIT_USER variable is not set. Defaulting to a blank string.
WARNING: The CLEARML_AGENT_GIT_PASS variable is not set. Defaulting to a blank string.
Stopping clearml-webserver ... done
Stopping clearml-agent-services ... done
Stopping clearml-apiserver ... done
Stopping clearml-redis ... done
Stopping clearml-fileserver ... done
Stopping clearml-mongo ... done
Stopping clearml-elastic ... done
Removing clearml-webserver ... done
Removing clearml-agent-services ... done
Removing clearml-apiserver ... done
Removing clearml-redis ... done
Removing clearml-fileserver ... done
Removing clearml-mongo ... done
Removing clearml-elastic ... done
Removing network clearml_backend
Removing network clearml_frontend
Then I tried the command:
[ec2-user@ip-172-31-26-41 ~]$ sudo docker exec -it clearml-mongo /bin/bash
Error: No such container: clearml-mongo
What did I do wrong?

  
  
Posted 3 years ago

[2021-01-24 17:02:25,660] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:25,674] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 8ms
[2021-01-24 17:02:26,696] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 36ms
[2021-01-24 17:02:26,742] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 78ms
[2021-01-24 17:02:27,169] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 3ms
[2021-01-24 17:02:27,638] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 100ms
[2021-01-24 17:02:28,923] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 12ms
[2021-01-24 17:02:28,963] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 105ms
[2021-01-24 17:02:29,960] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 138ms
[2021-01-24 17:02:30,684] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 30ms
[2021-01-24 17:02:30,691] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:30,707] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 8ms
[2021-01-24 17:02:31,611] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 2ms
[2021-01-24 17:02:31,738] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id_ex in 26ms
[2021-01-24 17:02:31,821] [8] [ERROR] [trains.service_repo] 'list' object has no attribute 'values'
Traceback (most recent call last):
  File "/opt/trains/apiserver/service_repo/service_repo.py", line 273, in handle_call
    ret = endpoint.func(call, company, call.data_model)
  File "/opt/trains/apiserver/services/tasks.py", line 197, in get_by_id_ex
    unprepare_from_saved(call, tasks)
  File "/opt/trains/apiserver/services/tasks.py", line 349, in unprepare_from_saved
    artifacts_unprepare_from_saved(fields=data)
  File "/opt/trains/apiserver/bll/task/artifacts.py", line 43, in artifacts_unprepare_from_saved
    value=sorted(artifacts.values(), key=itemgetter("key", "mode")),
AttributeError: 'list' object has no attribute 'values'
[2021-01-24 17:02:31,821] [8] [ERROR] [trains.service_repo] Returned 500 for tasks.get_by_id_ex in 121ms, msg='list' object has no attribute 'values'
[2021-01-24 17:02:31,824] [8] [INFO] [trains.service_repo] Returned 200 for events.get_task_log in 119ms
[2021-01-24 17:02:32,167] [8] [INFO] [trains.service_repo] Returned 200 for tasks.ping in 5ms
[2021-01-24 17:02:32,475] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 87ms
[2021-01-24 17:02:32,675] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 77ms
[2021-01-24 17:02:32,697] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 28ms
[2021-01-24 17:02:32,902] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 63ms
[2021-01-24 17:02:34,773] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 98ms
[2021-01-24 17:02:35,721] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:35,739] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 11ms
[2021-01-24 17:02:36,386] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 73ms
[2021-01-24 17:02:36,715] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 27ms
[2021-01-24 17:02:36,750] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 73ms
[2021-01-24 17:02:36,792] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 6ms
[2021-01-24 17:02:36,795] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id_ex in 6ms
[2021-01-24 17:02:36,933] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id_ex in 154ms
[2021-01-24 17:02:37,034] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 88ms
[2021-01-24 17:02:37,096] [8] [INFO] [trains.service_repo] Returned 200 for events.get_task_log in 13ms
[2021-01-24 17:02:38,642] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 3ms
[2021-01-24 17:02:39,320] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 82ms
[2021-01-24 17:02:40,108] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 74ms
[2021-01-24 17:02:40,694] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 24ms
[2021-01-24 17:02:40,758] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 6ms
[2021-01-24 17:02:40,771] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 62ms
[2021-01-24 17:02:40,781] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 6ms
[2021-01-24 17:02:41,263] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_configuration_names in 8ms
[2021-01-24 17:02:41,264] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 2ms
[2021-01-24 17:02:41,419] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 4ms
[2021-01-24 17:02:41,574] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 86ms
[2021-01-24 17:02:43,873] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 156ms
[2021-01-24 17:02:43,897] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 138ms
[2021-01-24 17:02:44,644] [8] [INFO] [trains.non_responsive_tasks_watchdog] Starting cleanup cycle for running tasks last updated before 2021-01-24 15:02:44.644426
[2021-01-24 17:02:44,646] [8] [INFO] [trains.non_responsive_tasks_watchdog] 0 non-responsive tasks found
[2021-01-24 17:02:44,646] [8] [INFO] [trains.non_responsive_tasks_watchdog] 0 non-responsive tasks stopped
[2021-01-24 17:02:44,686] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 23ms
[2021-01-24 17:02:45,795] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:45,812] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 11ms
[2021-01-24 17:02:46,196] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 122ms
[2021-01-24 17:02:47,425] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 73ms
[2021-01-24 17:02:48,166] [8] [INFO] [trains.service_repo] Returned 200 for projects.get_all_ex in 3ms
[2021-01-24 17:02:48,325] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id_ex in 19ms
[2021-01-24 17:02:48,375] [8] [ERROR] [trains.service_repo] 'list' object has no attribute 'values'
Traceback (most recent call last):
  File "/opt/trains/apiserver/service_repo/service_repo.py", line 273, in handle_call
    ret = endpoint.func(call, company, call.data_model)
  File "/opt/trains/apiserver/services/tasks.py", line 197, in get_by_id_ex
    unprepare_from_saved(call, tasks)
  File "/opt/trains/apiserver/services/tasks.py", line 349, in unprepare_from_saved
    artifacts_unprepare_from_saved(fields=data)
  File "/opt/trains/apiserver/bll/task/artifacts.py", line 43, in artifacts_unprepare_from_saved
    value=sorted(artifacts.values(), key=itemgetter("key", "mode")),
AttributeError: 'list' object has no attribute 'values'
[2021-01-24 17:02:48,379] [8] [ERROR] [trains.service_repo] Returned 500 for tasks.get_by_id_ex in 109ms, msg='list' object has no attribute 'values'
[2021-01-24 17:02:48,454] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 77ms
[2021-01-24 17:02:48,687] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 25ms
[2021-01-24 17:02:48,769] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 76ms
[2021-01-24 17:02:50,709] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 81ms
[2021-01-24 17:02:50,824] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:50,842] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 8ms
[2021-01-24 17:02:51,075] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 71ms
[2021-01-24 17:02:52,552] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 176ms
[2021-01-24 17:02:52,699] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 36ms
[2021-01-24 17:02:52,717] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 52ms
[2021-01-24 17:02:53,006] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 120ms
[2021-01-24 17:02:53,007] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 115ms
[2021-01-24 17:02:54,691] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 27ms
[2021-01-24 17:02:54,772] [8] [INFO] [trains.service_repo] Returned 200 for workers.status_report in 12ms
[2021-01-24 17:02:54,797] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 73ms
[2021-01-24 17:02:55,286] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 77ms
[2021-01-24 17:02:55,853] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:02:55,866] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 7ms
[2021-01-24 17:02:56,718] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 50ms
[2021-01-24 17:02:57,531] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 71ms
[2021-01-24 17:02:58,561] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 79ms
[2021-01-24 17:02:58,708] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 24ms
[2021-01-24 17:02:58,810] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 76ms
[2021-01-24 17:02:59,814] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 83ms
[2021-01-24 17:03:00,882] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:03:00,901] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 13ms
[2021-01-24 17:03:02,081] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 70ms
[2021-01-24 17:03:02,216] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 62ms
[2021-01-24 17:03:02,712] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 25ms
[2021-01-24 17:03:02,736] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 73ms
[2021-01-24 17:03:04,280] [8] [INFO] [trains.service_repo] Returned 200 for tasks.ping in 7ms
[2021-01-24 17:03:04,524] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id in 76ms
[2021-01-24 17:03:05,913] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_all in 2ms
[2021-01-24 17:03:05,934] [8] [INFO] [trains.service_repo] Returned 200 for queues.get_next_task in 14ms
[2021-01-24 17:03:06,020] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 98ms
[2021-01-24 17:03:06,787] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 96ms
[2021-01-24 17:03:06,866] [8] [INFO] [trains.service_repo] Returned 200 for events.add_batch in 178ms
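In a log this long the two failing calls are easy to miss. As a rough sketch (not part of ClearML; the function names `split_entries` and `errors_only` are made up for illustration, and the entry format is inferred only from the lines above), the ERROR entries could be picked out like this:

```python
import re

# Start of one entry, as seen in the log above:
# [timestamp] [pid] [LEVEL] [logger] message
ENTRY_START = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] \[\d+\] \[([A-Z]+)\] \[([\w.]+)\] "
)


def split_entries(raw):
    """Split raw log text into (timestamp, level, logger, message) tuples."""
    matches = list(ENTRY_START.finditer(raw))
    entries = []
    for i, match in enumerate(matches):
        # An entry runs until the next entry header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
        message = " ".join(raw[match.end():end].split())
        entries.append((match.group(1), match.group(2), match.group(3), message))
    return entries


def errors_only(raw):
    """Keep only the entries logged at ERROR level."""
    return [entry for entry in split_entries(raw) if entry[1] == "ERROR"]


# Three entries copied from the log above, flattened onto one line.
sample = (
    "[2021-01-24 17:02:31,738] [8] [INFO] [trains.service_repo] Returned 200 for tasks.get_by_id_ex in 26ms "
    "[2021-01-24 17:02:31,821] [8] [ERROR] [trains.service_repo] Returned 500 for tasks.get_by_id_ex in 121ms, "
    "msg='list' object has no attribute 'values' "
    "[2021-01-24 17:02:31,824] [8] [INFO] [trains.service_repo] Returned 200 for events.get_task_log in 119ms"
)

for timestamp, level, logger, message in errors_only(sample):
    print(timestamp, message)
```

Because it splits on the entry header rather than on newlines, the same sketch should also cope with output that was pasted as one long line.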

  
  
Posted 3 years ago

It seems the automatic MongoDB migration failed on startup
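To illustrate why a skipped migration would surface as the `AttributeError` in the log: the traceback shows the server calling `artifacts.values()`, which only works on a dict. The sketch below reproduces that failure in isolation. It assumes (this is not confirmed anywhere in the thread) that the un-migrated documents still hold artifacts as a list; `to_dict_form` is a hypothetical helper, not the server's actual migration code, and the sample artifact values are invented.

```python
from operator import itemgetter


def unprepare_artifacts(artifacts):
    # Same expression as in the traceback: it expects a dict of artifacts.
    return sorted(artifacts.values(), key=itemgetter("key", "mode"))


def to_dict_form(artifacts):
    """Hypothetical conversion: index list-form artifacts by their "key" field."""
    if isinstance(artifacts, dict):
        return artifacts
    return {artifact["key"]: artifact for artifact in artifacts}


# Assumed pre-migration shape: a plain list of artifact documents.
old_style = [
    {"key": "model", "mode": "output"},
    {"key": "data", "mode": "input"},
]

try:
    unprepare_artifacts(old_style)
except AttributeError as exc:
    print("fails on list form:", exc)

# After conversion to a dict the same call succeeds.
print(unprepare_artifacts(to_dict_form(old_style)))
```

This only demonstrates the symptom; the actual fix in this thread was to let the server's own migration run.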

  
  
Posted 3 years ago

Of course, do that while the server is down

  
  
Posted 3 years ago

No, there was a problem with the particular version migration. The temporary index creation allowed this and all subsequent migrations to run successfully. So for now your DB is properly aligned with the latest ClearML and future upgrades should work fine.

  
  
Posted 3 years ago

Hi CooperativeFox72 , there was a typo in the index creation instructions ("comapny" instead of "company"). Please try the following sequence in the mongo shell and then start the apiserver:
use auth
db.user.createIndex({"name": 1, "company": 1})
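Since the original problem was a typo in a hand-typed key, one option is to build the command from a list of field names instead. The following is only a sketch: `create_index_command` is a made-up helper, and it assumes the legacy `mongo` shell inside the `clearml-mongo` container accepts a database name followed by `--eval`, as the 3.6 shell used in this thread does. It prints the command rather than running it.

```python
import json
import shlex


def create_index_command(container="clearml-mongo", database="auth",
                         collection="user", fields=("name", "company")):
    """Build a non-interactive equivalent of the two mongo shell commands."""
    # {"name": 1, "company": 1}: ascending index on each field, in order.
    key_spec = json.dumps({field: 1 for field in fields})
    script = f"db.{collection}.createIndex({key_spec})"
    return ["sudo", "docker", "exec", container, "mongo", database, "--eval", script]


command = create_index_command()
# Print the command for review; run it manually while the mongo container is up.
print(shlex.join(command))
```

`shlex.join` needs Python 3.8 or later. The container has to be running for `docker exec` to find it.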

  
  
Posted 3 years ago

SuccessfulKoala55 and AppetizingMouse58 Thank you very much!!

I have a follow-up question:
Will this fix cause problems in future clearml-server upgrades?
Or what is the best practice for upgrading after doing it?

  
  
Posted 3 years ago

Someone in my company started a training 😥, so I will do it after it finishes and will update.
Thanks, you are the best 🙏

  
  
Posted 3 years ago

I updated to version 0.16.1 a few weeks ago, and it worked using elastic_upgrade.py

  
  
Posted 3 years ago

Obviously you have to have the server up when you do that... 🙂

  
  
Posted 3 years ago

I did it and am still getting the same error 😥

  
  
Posted 3 years ago

Thanks again!! 🙏
You're the best 🙂

  
  
Posted 3 years ago

then start the server again and see if you get the errors in the log

  
  
Posted 3 years ago

Oh, I'm sorry - how stupid of me...

  
  
Posted 3 years ago

First, go into the MongoDB docker instance using:
sudo docker exec -it clearml-mongo /bin/bash
Then, inside the docker, start the MongoDB CLI using:
mongo
Then, enter these two commands:
use auth
db.user.createIndex({"name": 1, "comapny": 1})

  
  
Posted 3 years ago

Anyway, a quick fix could be to create the mongo index that's failing the migration

  
  
Posted 3 years ago

Then take it down, and bring it up again

  
  
Posted 3 years ago

Is it OK that it looks for files in /opt/trains? We moved everything to /opt/clearml, no?
File "/opt/trains/apiserver/mongo/initialize/migration.py"

  
  
Posted 3 years ago

the index creation:
[ec2-user@ip-172-31-26-41 ~]$ sudo docker exec -it clearml-mongo /bin/bash
root@3fc365193ed0:/# mongo
MongoDB shell version v3.6.5
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.6.5
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
Questions? Try the support group
Server has startup warnings:
2021-01-25T05:58:37.309+0000 I CONTROL [initandlisten]
2021-01-25T05:58:37.309+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2021-01-25T05:58:37.309+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2021-01-25T05:58:37.309+0000 I CONTROL [initandlisten]

use auth
switched to db auth
db.user.createIndex({"name": 1, "comapny": 1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}

bye
root@3fc365193ed0:/# exit

  
  
Posted 3 years ago

Is this after you've created the index using the instructions I sent?

  
  
Posted 3 years ago

yep

  
  
Posted 3 years ago