Answered
[Errors when migrating ClearML Server from AWS to GCP]

Hi everyone!
As we’re using ClearML quite a bit, we’d love to take it with us when migrating our cloud from AWS to GCP.

However, we’ve run into a few problems:

  • After starting the new ClearML server from a custom GCP image and migrating the data as described here, the new server no longer starts. docker-compose shows output like the lines below (more of the stack trace is in a comment). Before migrating the data from AWS, the GCP server started up as expected.
clearml-elastic   | ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];
clearml-elastic   | Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
  • If we only migrate the content under /opt/clearml/data/mongo_4/db to the new GCP server, the server starts and the tasks are visible, but console output, scalars and debug samples are still missing
  • Let’s say the new server works and the tasks show up as expected with artifacts and outputs, but the artifacts and outputs are still on S3. Is there any way to transfer those from S3 to GCP storage as well and have the new server reference everything from GCP, i.e. replace the paths? We have already migrated the data itself from S3 to GCP storage; the paths are identical, except that the new ones start with “gs://” instead of “s3://”.
  
  
Posted one year ago

Answers 6


Ok got it 👍

  
  
Posted one year ago

Hi @ScantChimpanzee51, I think this is more difficult. I think you would need to edit the URLs in MongoDB for each task, model and dataset.
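
A minimal sketch of the prefix rewrite described above, assuming each stored document can be handled as a nested Python dict. The helper `rewrite_urls`, the field names (`artifacts`, `uri`) and the bucket name are hypothetical: the thread does not say which MongoDB collections or fields hold the URLs, and reading the documents from MongoDB and writing them back is not shown.

```python
def rewrite_urls(value, old_prefix, new_prefix):
    """Return a copy of value with old_prefix replaced by new_prefix
    at the start of every string found in nested dicts and lists."""
    if isinstance(value, str):
        if value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value
    if isinstance(value, dict):
        return {k: rewrite_urls(v, old_prefix, new_prefix) for k, v in value.items()}
    if isinstance(value, list):
        return [rewrite_urls(v, old_prefix, new_prefix) for v in value]
    # Numbers, None, dates, etc. are returned unchanged.
    return value


# Hypothetical document shape, for illustration only.
example_task = {
    "name": "train-run",
    "artifacts": [{"key": "model", "uri": "s3://my-bucket/models/a.pkl"}],
}
migrated_task = rewrite_urls(example_task, "s3://", "gs://")
```

Because only the scheme prefix changes and the rest of each path stays identical, a plain prefix swap is enough here. Taking a backup of the database before changing anything in place would be prudent.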

  
  
Posted one year ago

@CostlyOstrich36 thank you, now everything works so far!
One last thing: is there any way to change all the links in the new ClearML server so that an artifact that was previously under s3://… is now taken from gs://…? The actual data is already available under the gs:// link, of course.

  
  
Posted one year ago

Hi @ScantChimpanzee51, your steps look OK, but the error pretty much indicates a folder permissions issue. Please go to the /opt/clearml/data folder and run "ls -al" to check the owner and permissions of the "elastic_7" folder, then enter the elastic_7 folder and check the same for its "nodes" subfolder. If the permissions are correct, try restarting the docker containers and check whether that helps.
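
The same check can be scripted for a whole directory tree. This is only a sketch: `find_ownership_mismatches` is a hypothetical helper, and the expected owner 1000:1000 is taken from the chown command used in this thread rather than from any ClearML reference.

```python
import os
from pathlib import Path


def find_ownership_mismatches(root, uid=1000, gid=1000):
    """Return every path under root (root included) not owned by uid:gid."""
    root = Path(root)
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(Path(dirpath) / name for name in dirnames + filenames)

    mismatches = []
    for path in candidates:
        st = os.lstat(path)  # lstat: report on symlinks themselves, not their targets
        if st.st_uid != uid or st.st_gid != gid:
            mismatches.append(str(path))
    return mismatches
```

For example, find_ownership_mismatches("/opt/clearml/data/elastic_7") lists the paths that still have a different owner. Run it as root, since os.walk silently skips directories it cannot read.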

  
  
Posted one year ago

To recap, the server started up on GCP as expected before migrating the data over. The migration was done by:

  • deleting the current data: sudo rm -fR /opt/clearml/data/*
  • unpacking the backup: sudo tar -xzf ~/clearml_backup_data.tgz -C /opt/clearml/data
  • setting permissions: sudo chown -R 1000:1000 /opt/clearml
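
Before unpacking, it can help to confirm that the backup has the layout the target folder expects. This is a sketch only: `top_level_entries` is a hypothetical helper, and it is an assumption that the archive was created relative to /opt/clearml/data, so that folders such as elastic_7 and mongo_4 (both named in this thread) sit at its top level.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the set of top-level names stored in a tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:  # "r:*" auto-detects compression
        names = tar.getnames()

    tops = set()
    for name in names:
        parts = PurePosixPath(name).parts  # a leading "./" is dropped here
        if parts:
            tops.add(parts[0])
    return tops
```

If the returned set contains a single wrapper folder instead of the data folders themselves, unpacking with -C /opt/clearml/data would nest the data one level too deep.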
  
  
Posted one year ago

More stack trace:

clearml-elastic   | ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];
clearml-elastic   | Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
clearml-elastic   |     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
clearml-elastic   |     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
clearml-elastic   |     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
clearml-elastic   |     at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:398)
clearml-elastic   |     at java.base/java.nio.file.Files.createDirectory(Files.java:700)
clearml-elastic   |     at java.base/java.nio.file.Files.createAndCheckIsDirectory(Files.java:807)
clearml-elastic   |     at java.base/java.nio.file.Files.createDirectories(Files.java:793)
clearml-elastic   |     at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:300)
clearml-elastic   |     at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:224)
clearml-elastic   |     at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:298)
clearml-elastic   |     at org.elasticsearch.node.Node.<init>(Node.java:427)
clearml-elastic   |     at org.elasticsearch.node.Node.<init>(Node.java:309)
clearml-elastic   |     at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234)
clearml-elastic   |     at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234)
clearml-elastic   |     at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434)
clearml-elastic   |     at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166)
clearml-elastic   |     at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157)
clearml-elastic   |     at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
clearml-elastic   |     at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
clearml-elastic   |     at org.elasticsearch.cli.Command.main(Command.java:77)
clearml-elastic   |     at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122)
clearml-elastic   |     at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80)
clearml-elastic   | For complete error details, refer to the log at /usr/share/elasticsearch/logs/clearml.log
clearml-elastic exited with code 1
clearml-apiserver | [2023-06-05 07:19:16,651] [10] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 2 of 4. Waiting for 30sec
  0     0    0     0    0     0      0      0 --:--:--  0:00:34 --:--:--     0
curl: (7) Failed to connect to apiserver port 8008: No route to host
  
  
Posted one year ago