Answered

Hi everyone, I am updating the self-hosted server to a public IP. However, all my datasets cannot be downloaded anymore. I followed instructions from here, but it's not working. Maybe I'm not understanding it correctly.
I ran the following code to update the URL.

I tried different replacements strategies (including port numbers and without). I ran this file for all three ports 8008, 8080 and 8081 as well.

# See <link removed from the original post>

import elasticsearch
from elasticsearch import RequestsHttpConnection

# The actual URL was stripped from the post; a placeholder is shown instead.
INDEX_URL = "<elasticsearch server URL>"

ES_CLIENT = elasticsearch.Elasticsearch(hosts=INDEX_URL, verify_certs=False, ca_certs=False,
                                        connection_class=RequestsHttpConnection, timeout=3000, max_retries=10)

index = "events-training_debug_image-d1bd92a3b039400cbafc60a7a5b1e52b"

# Both replacement URLs were stripped from the post; placeholders are shown instead.
q = {
    "script": {
        "source": "ctx._source.url = ctx._source.url.replace('<old address>', '<new address>')",
        "lang": "painless"
    },
    "query": {
        "match_all": {}
    }
}

ES_CLIENT.update_by_query(body=q, index=index)

print('Done')
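Before running an update like this against a live index, the replacement itself can be sanity-checked locally. A minimal sketch mirroring the Painless `replace` call in plain Python (both addresses here are hypothetical stand-ins, since the real URLs were removed from the post):

```python
# Hypothetical stand-ins for the old and new server addresses.
OLD = "http://10.0.0.12:8081"
NEW = "http://<public-ip>:8081"

def rewrite_url(url: str) -> str:
    """Mirror ctx._source.url.replace(OLD, NEW) from the Painless script."""
    return url.replace(OLD, NEW)

# A URL containing the old address is rewritten; any other URL is left untouched.
print(rewrite_url("http://10.0.0.12:8081/Project/debug/img.png"))
print(rewrite_url("s3://some-bucket/img.png"))
```

Running a few real URLs from the index through a helper like this makes it easy to spot a wrong or over-broad replacement before touching the data.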

However, when I try again I get an error:

2024-10-09 10:48:57,544 - clearml.storage - ERROR - Could not download <link removed>, err: Failed getting object 10.0.0.12:8081/Project/.datasets/dataset/dataset.f66a70c6cda440dd8fdaccb52d5e9055/artifacts/state/state.json (401): UNAUTHORIZED
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/michel/sandbox/.venv/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1790, in get
    instance = get_instance(dataset_id)
  File "/home/michel/sandbox/.venv/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1702, in get_instance
    raise ValueError("Could not load Dataset id={} state".format(task.id))
ValueError: Could not load Dataset id=f66a70c6cda440dd8fdaccb52d5e9055 state

The updating of the URLs via ES did do something: before running it, I was getting this message instead. I did also open the port on the ES docker container, of course.

Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f49f0ab0820>, 'Connection to 10.0.0.12 timed out. (connect timeout=30)')': /Project/.datasets/dataset/dataset.f66a70c6cda440dd8fdaccb52d5e9055/artifacts/state/state.json

Any help would be greatly appreciated!

  
  
Posted 2 months ago

Answers 13


Thank you for the fix :) I will update the script for future usage

  
  

Could it be that the address here, "Failed getting object 10.0.0.12:8081/Esti/", is without the 'http' part? Do I also have to replace all those occurrences?
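A quick way to see why scheme-less occurrences would need their own pass: a string replace keyed on the 'http://' form never touches a bare 'host:port' URI. Sketched in Python (the addresses are hypothetical):

```python
# Hypothetical old address, with and without the scheme.
OLD_WITH_SCHEME = "http://10.0.0.12:8081"

def replace_scheme_only(uri: str, new: str = "http://<public-ip>:8081") -> str:
    # Keyed on the scheme-full form: bare "host:port" URIs slip through unchanged.
    return uri.replace(OLD_WITH_SCHEME, new)

print(replace_scheme_only("http://10.0.0.12:8081/Esti/x"))  # rewritten
print(replace_scheme_only("10.0.0.12:8081/Esti/x"))         # untouched
```

So if stored URIs exist in both forms, each form needs its own replacement rule.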

  
  

Hi @<1577468611524562944:profile|MagnificentBear85> , the instructions that you followed should fix the addresses of debug images. For artifacts there are different instructions. Please follow the instructions from the "For artifacts" item:
https://clear.ml/docs/latest/docs/faq/#debug-images-andor-artifacts-are-not-loading-in-the-u[…]clearml-server-to-a-new-address-how-do-i-fix-this----
You will need to change $regex:/^s3/ into $regex:/^http\:\/\/10\.0\.0\.12\:8081/
And e.uri.replace("s3://<old-bucket-name>/", "s3://<new-bucket-name>/") into e.uri.replace("http://<old address and port here>", "http://<your target address and port here>")
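Since the pattern is anchored with `^`, it only selects documents whose uri starts with the old address. A quick check of that anchoring behaviour, sketched with Python's `re` module (the sample URIs are made up):

```python
import re

# The same anchored pattern as in the mongo query above.
pattern = re.compile(r"^http:\/\/10\.0\.0\.12:8081")

print(bool(pattern.search("http://10.0.0.12:8081/Esti/model.pkl")))  # True: old prefix
print(bool(pattern.search("s3://bucket/model.pkl")))                 # False: different scheme
print(bool(pattern.search("x http://10.0.0.12:8081/y")))             # False: not at the start
```

The anchor is what keeps the query from matching, say, an s3 URI that merely contains the old address somewhere in its path.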

  
  

Thanks for the quick and helpful answer @<1722061389024989184:profile|ResponsiveKoala38> ! It works, at least in the sense that I can see my artifacts are updated. However, my datasets are still on the wrong address. How do I update those as well?

  
  

Awesome, thanks very much for this detailed reply! This indeed seems to have updated every URL.
One note - I had to pass the mongo host explicitly as --mongo-host <mongo host>

  
  

Of course, you can see it in the error message that I already shared - but here is another one just in case.

.venv/bin/python -c "from clearml import Dataset; Dataset.get(dataset_project='Esti', dataset_name='bulk_density')"
2024-10-09 18:56:03,137 - clearml.storage - WARNING - Failed getting object size: ValueError('Failed getting object 10.0.0.12:8081/Esti/.datasets/bulk_density/bulk_density.f66a70c6cda440dd8fdaccb52d5e9055/artifacts/state/state.json (401): UNAUTHORIZED')
2024-10-09 18:56:03,245 - clearml.storage - ERROR - Could not download <link removed>, err: Failed getting object 10.0.0.12:8081/Esti/.datasets/bulk_density/bulk_density.f66a70c6cda440dd8fdaccb52d5e9055/artifacts/state/state.json (401): UNAUTHORIZED
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/michel/sandbox/.venv/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1790, in get
    instance = get_instance(dataset_id)
  File "/home/michel/sandbox/.venv/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1702, in get_instance
    raise ValueError("Could not load Dataset id={} state".format(task.id))
ValueError: Could not load Dataset id=f66a70c6cda440dd8fdaccb52d5e9055 state

Although now I see there is also the word 'artifacts' in the URL. Your suggestion didn't update these, though it did update the paths to my models.

  
  

Can you please share an example of the datasets wrong address? Where do you see it?

  
  

I see now. It seems that the instructions we provided updated only model URLs, and there are some more artifacts that need to be handled. Please try running the attached Python script from inside your apiserver docker container. The script should fix all the task artifact links in mongo. Copy it to any place inside the running clearml-apiserver container and then run it as follows:

python3 fix_mongo_urls.py --mongo-host <mongo host> --host-source <old address> --host-target http://<your target address and port here>
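The fix_mongo_urls.py script itself was shared as an attachment and is not reproduced here, but the core operation such a script presumably performs is a host swap on stored URIs. A hypothetical sketch of that rewrite step (the function name and behaviour are my assumptions, not the actual script):

```python
from urllib.parse import urlsplit, urlunsplit

def swap_host(uri: str, old_netloc: str, new_netloc: str) -> str:
    """Rewrite the network location of a URI, but only when it matches old_netloc."""
    parts = urlsplit(uri)
    if parts.netloc != old_netloc:
        return uri  # leave unrelated URIs (e.g. s3:// buckets) alone
    return urlunsplit(parts._replace(netloc=new_netloc))

print(swap_host("http://10.0.0.12:8081/a/state.json", "10.0.0.12:8081", "example.com:8081"))
# -> http://example.com:8081/a/state.json
```

Parsing the URI instead of doing a raw string replace has the advantage that only the host:port part can ever change, never a path segment that happens to contain the same text.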
  
  

Thanks! We want to add the Python script that I sent you to the next open source version and change the instructions to use this script instead of copying the commands from the web page

  
  

Great thanks for the fast and extremely helpful answers!

  
  

Ok, even weirder now - the model paths seem updated to the 172. address, but I also have CSVs as artifacts that are still at the 10. address.
Any clues @<1722061389024989184:profile|ResponsiveKoala38> ?
[two screenshots attached]

  
  

Ah okay, so this Python script is meant to replace all the other snippets? That makes sense then 🙂

  
  

O yeah, one more thing. The initial link you sent me contains a snippet that is written to a file using cat, but for me it only works with a simple echo on a single line. If I copy from the website, it inserts weird end-of-line characters that mess it up (at least, that's my hypothesis). So you might want to consider putting a warning on the website, or updating the instruction to the one below:

echo 'db.model.find({uri:{$regex:/^http:\/\/10\.0\.0\.12:8081/}}).forEach(function(e,i) { e.uri = e.uri.replace("<old address>","newaddress:port"); db.model.save(e);});' > script.js
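If stray end-of-line characters from the copy-paste really are the culprit, another workaround is to strip carriage returns before the file is written; a small shell sketch:

```shell
# Simulate pasted text that picked up Windows-style CRLF endings,
# then strip the carriage returns before the mongo shell ever sees them.
printf 'db.model.find();\r\n' | tr -d '\r' > script.js
od -c script.js   # no \r remains, only the trailing \n
```

`tr -d '\r'` is a blunt but safe filter here, since literal carriage returns have no business being inside the script anyway.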