Unanswered
What is the best way to set S3 as a files server? We have a ClearML deployment without a files server, but after/during a training run clearml.metrics always fails due to a connection error while trying to call <url>:8081 (we don't have 8081 because of


As mentioned above, I've tried both (env and clearml.conf). Here are my configs (I've blacked out URLs and creds).

conf file

api { 
    web_server: 

    api_server: 

    files_server: 


    credentials {
        "access_key" = "xyz"
        "secret_key"  = "xyz"
    }
}

Relevant log (it uploads to S3 and I can see the artefact fine in ClearML's experiment tracker, but the job still hangs):

2023-12-11 16:06:44,008 - clearml.storage - INFO - Uploading: 5325.00MB / 5348.15MB @ 134.86MBs from /tmp/.clearml.upload_model_05djjpwq.tmp
2023-12-11 16:06:44,053 - clearml.storage - INFO - Uploading: 5330.00MB / 5348.15MB @ 113.02MBs from /tmp/.clearml.upload_model_05djjpwq.tmp
2023-12-11 16:06:44,101 - clearml.storage - INFO - Uploading: 5335.00MB / 5348.15MB @ 103.35MBs from /tmp/.clearml.upload_model_05djjpwq.tmp
2023-12-11 16:06:44,148 - clearml.storage - INFO - Uploading: 5340.15MB / 5348.15MB @ 109.98MBs from /tmp/.clearml.upload_model_05djjpwq.tmp
2023-12-11 16:06:44,169 - clearml.storage - INFO - Uploading: 5345.15MB / 5348.15MB @ 240.57MBs from /tmp/.clearml.upload_model_05djjpwq.tmp
2023-12-11 16:06:44,492 - clearml.Task - INFO - Completed model upload to 

Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08674550>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08676560>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863eec0>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08675780>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863d5d0>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863d990>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863ef80>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863f640>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08676d70>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a0863e6e0>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08677b20>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08676680>, 'Connection to xyz.com timed out. (connect timeout=30)')': /
2023-12-11 16:17:58,911 - clearml.metrics - WARNING - Failed uploading to 
 (HTTPSConnectionPool(host='xyz.com', port=8081): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a086771f0>, 'Connection to xyz.com timed out. (connect timeout=30)')))
2023-12-11 16:17:58,913 - clearml.metrics - WARNING - Failed uploading to 
 (HTTPSConnectionPool(host='xyz.com', port=8081): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1a08677d30>, 'Connection to xyz.com timed out. (connect timeout=30)')))
2023-12-11 16:17:58,914 - clearml.metrics - ERROR - Not uploading 2/5 events because the data upload failed
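
To double-check which endpoint the failing uploads are actually aimed at, the host and port can be pulled out of the urllib3 warnings. This is only a minimal sketch: `failed_endpoints` is a hypothetical helper (not part of ClearML), and the sample string is an abbreviated copy of the warning above. In this log the model upload completes, while the clearml.metrics uploads still point at port 8081.

```python
import re

# Matches the host/port inside urllib3's "HTTPSConnectionPool(host=..., port=...)" message.
POOL_RE = re.compile(r"HTTPSConnectionPool\(host='([^']+)', port=(\d+)\)")


def failed_endpoints(log_text):
    """Hypothetical helper: return the sorted, de-duplicated (host, port) pairs
    that the connection failures in log_text point at."""
    return sorted({(host, int(port)) for host, port in POOL_RE.findall(log_text)})


# Abbreviated copy of the clearml.metrics warning from the log above.
sample = (
    " (HTTPSConnectionPool(host='xyz.com', port=8081): Max retries exceeded "
    "with url: / (Caused by ConnectTimeoutError(...)))"
)

print(failed_endpoints(sample))  # [('xyz.com', 8081)]
```

If the only endpoint listed is the 8081 one, the metrics/debug-sample uploads are presumably still using the files server address rather than the S3 destination the model went to.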
  
  
Posted 5 months ago
49 Views
0 Answers