This seems like something different, not connected to ES. Where did you get these logs?
@<1734020208089108480:profile|WickedHare16> Can you please share an example plot URL that does not open in the UI but can be viewed in a separate tab?
I do not see any issues in the log. Do you still get errors in the task due to the failure in events.add_batch?
No, it says that it does not detect any problematic shards. Given that output and the absence of errors in the logs, I would expect that you will not get the error anymore
Probably port 9200 is not mapped from the ES container in the docker compose
The easiest way would be to run "sudo docker exec -it clearml-elastic /bin/bash" and then run the curl command from inside the ES container
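For example, the sequence would look something like this (just a sketch; the cluster health call below is a stand-in for whichever curl command you need to run):
sudo docker exec -it clearml-elastic /bin/bash
# then, inside the container:
curl -XGET "localhost:9200/_cluster/health?pretty"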
And are you still getting exactly this error?
<500/100: events.add_batch/v1.0 (General data error: err=1 document(s) failed to index., extra_info=[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0]] containing [index {[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][f3abecd0f46f4bd289e0ac39662fd850], source[{"timestamp":1747654820464,"type":"log","task":"fd3d00d99d88427bbc57...
About the prefix part, I think it should not matter. Just put your prefix instead of ' None .<ADDRESS>'
Hi @<1526734383564722176:profile|BoredBat47> , did the last update urls command work for you? I want to update our documentation
While on the host, you can run some ES commands to check shard health and allocation. For example this:
curl -XGET "localhost:9200/_cluster/allocation/explain?pretty"
It may give more clues about the problem
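If you want a broader view, listing the shards can also help (again assuming the default localhost:9200 endpoint):
curl -XGET "localhost:9200/_cat/shards?v"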
I see now. It seems that the instructions that we provided updated only the model urls, and there are some more artifacts that need to be handled. Please try running the attached python script from inside your apiserver docker container. The script should fix all the task artifact links in mongo. Copy it to any place inside the running clearml-apiserver container and then run it as follows:
python3 fix_mongo_urls.py --mongo-host --host-source --host-target http:...
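For reference, the full sequence would look something like this (just a sketch; the paths and the <...> placeholders are not from this thread and should be replaced with your actual mongo host and old/new addresses):
sudo docker cp fix_mongo_urls.py clearml-apiserver:/opt/fix_mongo_urls.py
sudo docker exec -it clearml-apiserver python3 /opt/fix_mongo_urls.py --mongo-host <mongo host> --host-source <old address> --host-target <new address>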
Hi @<1578555761724755968:profile|GrievingKoala83> , the DELETED prefix in the model id means that the original model was already deleted. The reference that you see, "__DELETED__63e920aeacb247c890c70e525576474c", does not point to any model but is instead a reminder that there was a reference to the model 63e920aeacb247c890c70e525576474c here, and that model was removed
Hi @<1689446563463565312:profile|SmallTurkey79> , we have identified the problem and are working on a fix
Then it is possibly caused by something else. We need to search for it in the ES logs
Hi @<1523701601770934272:profile|GiganticMole91> , I do not see any change that would prevent Elasticsearch in v1.16 from starting with the data that was stored by v1.15. More information can probably be retrieved from the ES logs right after the upgrade and services restart. If there is something preventing ES from loading the existing data, it should be listed in the logs
@<1523701601770934272:profile|GiganticMole91> Is your ES deployment a single node or a cluster? If you compare the elasticsearch section of the docker compose of your currently working version (1.15.0) and the one that you tried to install (v1.16), do you see any difference?
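One quick way to compare just those sections (a sketch; the file names are hypothetical and the -A 30 line window is arbitrary):
diff <(grep -A 30 'elasticsearch:' docker-compose-1.15.yml) <(grep -A 30 'elasticsearch:' docker-compose-1.16.yml)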
''' is three single quote characters: ' + ' + '
A bit confusing, but this is what the Linux shell requires when you have single quotes inside double quotes inside outer single quotes
Can you please run 'sudo docker ps' and share the results? I want to see the actual version of the apiserver image that is being used
Great! Thanks:)
Yeah, they should :) The problem is that they are inside the outer single quotes of -d'{...}'
Ah, I see. I forgot to escape the single quotes inside the script. Please replace the current script source:
"ctx._source.url = ctx._source.url.replace('http://<MY_OLD_ADDRESS>', ' None .<NEW_ADDRESS>')"
With the escaped one:
"ctx._source.url = ctx._source.url.replace('''http://<MY_OLD_ADDRESS>''', ''' None .<NEW_ADDRESS>''')"
Please share your command