Answered
Hey, could you help me? I tried to update clearml-server in k8s, with the old and new ClearML in different namespaces, but after migrating I got the error "Error 101 : Inconsistent data encountered in document: document=Output, field=model". How can I fix it?

Hey, could you help me?
I tried to update clearml-server in k8s.
The old and new ClearML are in different namespaces, but after migrating I got the error

Error 101 : Inconsistent data encountered in document: document=Output, field=model

How can I fix it?

  
  
Posted 2 years ago

Answers 13


Many thanks
2 indexes didn’t work. I deleted them and new ones were created automatically.
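
For anyone repeating this step, here is a minimal sketch of deleting a single index through the Elasticsearch REST API (`DELETE /<index>`). The thread does not say which two indexes were removed or how, so the address `ES_URL` and the helper names are assumptions, and the index name in the usage note is hypothetical. Deleting an index permanently discards the data stored in it.

```python
import urllib.parse
import urllib.request

# Assumption: Elasticsearch is reachable at this address
# (for example through a port-forward to the Elasticsearch service).
ES_URL = "http://localhost:9200"


def build_delete_request(index_name, base_url=ES_URL):
    """Build (but do not send) a DELETE request for exactly one index."""
    # Refuse wildcards, comma-separated lists and _all to avoid deleting too much.
    if any(ch in index_name for ch in "*,") or index_name in ("", "_all"):
        raise ValueError("refusing wildcard or multi-index deletion")
    url = base_url + "/" + urllib.parse.quote(index_name, safe="")
    return urllib.request.Request(url, method="DELETE")


def delete_index(index_name, base_url=ES_URL):
    """Send the DELETE request and return the HTTP status code."""
    request = build_delete_request(index_name, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.status
```

Calling `delete_index("worker_stats_example_2021-06")` (a hypothetical name) would remove only that index. In this thread the replacement indexes were reported to be created automatically afterwards.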

  
  
Posted 2 years ago

Can you share the modified helm/yaml?

Yep, attached here: clearml and pvc

Did you run any specific migration script after the upgrade?

Nope, I copied the data from the fileserver and Elasticsearch, plus made a mongodump

How many apiserver instances do you have?

1 apiserver container

How did you configure the elastic container? Is it booting?

Standard configuration (clearml.yaml). Elastic works

  
  
Posted 2 years ago

ResponsiveCamel97 it looks like one of the shards in ES is not active. I suggest using the ES API to query the cluster status and the reason for the shard's status.
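
A minimal sketch of that check, using only the Python standard library against the Elasticsearch REST endpoints `_cluster/health` and `_cat/shards`. The address `ES_URL` and the helper names are assumptions for illustration, not part of ClearML.

```python
import json
import urllib.request

# Assumption: Elasticsearch is reachable at this address
# (for example through a port-forward to the Elasticsearch service).
ES_URL = "http://localhost:9200"


def fetch_json(path, base_url=ES_URL):
    """GET an Elasticsearch REST endpoint and decode its JSON body."""
    with urllib.request.urlopen(base_url + path, timeout=10) as resp:
        return json.load(resp)


def inactive_shards(shard_rows):
    """Keep only the _cat/shards rows whose state is not STARTED."""
    return [row for row in shard_rows if row.get("state") != "STARTED"]


def report(base_url=ES_URL):
    """Print the cluster status and every shard that is not active."""
    health = fetch_json("/_cluster/health", base_url)
    print("cluster status:", health["status"])  # green, yellow or red
    rows = fetch_json(
        "/_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason",
        base_url,
    )
    for row in inactive_shards(rows):
        print(row["index"], row["shard"], row["prirep"], row["state"],
              row.get("unassigned.reason"))
```

Running `report()` lists the shards that are not started together with the recorded reason. For more detail on an unassigned shard, `GET /_cluster/allocation/explain` can be queried the same way with `fetch_json`.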

  
  
Posted 2 years ago

Can you share the modified helm/yaml?
Did you run any specific migration script after the upgrade?
How many apiserver instances do you have?
How did you configure the elastic container? Is it booting?

  
  
Posted 2 years ago

[2021-06-11 15:24:36,885] [9] [ERROR] [clearml.service_repo] Returned 500 for queues.get_next_task in 60007ms, msg=General data error: err=('1 document(s) failed to index.', [{'index': {'_index': 'queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'PkGr-3kBBPcUBw4n5Acx', 'status': 503, 'error': {'type':..., extra_info=[queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [index {[queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][_doc][PkGr-3kBBPcUBw4n5Acx], source[_na_]}]]
[2021-06-11 15:24:39,424] [9] [ERROR] [clearml.__init__] Failed processing worker status report
Traceback (most recent call last):
  File "/opt/clearml/apiserver/bll/workers/__init__.py", line 149, in status_report
    machine_stats=report.machine_stats,
  File "/opt/clearml/apiserver/bll/workers/__init__.py", line 416, in _log_stats_to_es
    es_res = elasticsearch.helpers.bulk(self.es_client, actions)
  File "/usr/local/lib/python3.6/site-packages/elasticsearch/helpers/actions.py", line 396, in bulk
    for ok, item in streaming_bulk(client, actions, *args, **kwargs):
  File "/usr/local/lib/python3.6/site-packages/elasticsearch/helpers/actions.py", line 326, in streaming_bulk
    **kwargs
  File "/usr/local/lib/python3.6/site-packages/elasticsearch/helpers/actions.py", line 246, in _process_bulk_chunk
    for item in gen:
  File "/usr/local/lib/python3.6/site-packages/elasticsearch/helpers/actions.py", line 185, in _process_bulk_chunk_success
    raise BulkIndexError("%i document(s) failed to index." % len(errors), errors)
elasticsearch.helpers.errors.BulkIndexError: ('8 document(s) failed to index.', [
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'P0Gr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'cpu', 'metric': 'cpu_temperature', 'variant': '0', 'value': 43.0}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'QEGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'cpu', 'metric': 'cpu_usage', 'variant': '0', 'value': 3.334}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'QUGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'disk', 'metric': 'disk_free_home', 'variant': 'total', 'value': 58.1}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'QkGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'disk', 'metric': 'disk_write', 'variant': 'total', 'value': 0.009}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'Q0Gr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'memory', 'metric': 'memory_free', 'variant': 'total', 'value': 113848.816}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'REGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'memory', 'metric': 'memory_used', 'variant': 'total', 'value': 13401.186}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'RUGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'network', 'metric': 'network_rx', 'variant': 'total', 'value': 0.001}}},
{'index': {'_index': 'worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06', '_type': '_doc', '_id': 'RkGr-3kBBPcUBw4n7gce', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-06][0]] containing [8] requests]'}, 'data': {'timestamp': 1623417920000, 'worker': 'test:bd28:cpu:2', 'company': 'clearml', 'task': None, 'category': 'network', 'metric': 'network_tx', 'variant': 'total', 'value': 0.001}}}])
[2021-06-11 15:24:39,426] [9] [ERROR] [clearml.service_repo] Returned 500 for workers.status_report in 60008ms, msg=General data error (Failed processing worker status report): err=8 document(s) failed to index.

  
  
Posted 2 years ago

AgitatedDove14 I can try but are you sure this will help?

  
  
Posted 2 years ago

ResponsiveCamel97
could you attach the full log?

  
  
Posted 2 years ago

from which component?

  
  
Posted 2 years ago

webserver 127.0.0.1 - - [11/Jun/2021:14:32:02 +0000] “GET /version.json HTTP/1.1” 304 0 “*/projects/cbe22f65c9b74898b5496c48fffda75b/experiments/3fc89b411cf14240bf1017f17c58916b/execution?columns=selected&columns=type&columns=name&columns=tags&columns=status&columns=project.name&columns=users&columns=started&columns=last_update&columns=last_iteration&columns=parent.name&order=last_update” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)

for example webserver

  
  
Posted 2 years ago

Old: 0.17
New: 1.0.2
We partly used the Helm charts: we used the YAML files from Helm, but rewrote the part about the PVCs, and our ClearML runs across several nodes.

  
  
Posted 2 years ago

Error 101 : Inconsistent data encountered in document: document=Output, field=model

Okay, this points to a migration issue from 0.17 to 1.0.
First try to upgrade to 1.0, then to 1.0.2.
(I would also upgrade a single apiserver instance first; once it is done, you can spin up the rest.)
Makes sense?

  
  
Posted 2 years ago

Hi ResponsiveCamel97
What's the clearml-server version? How do you spin up the server on your k8s cluster, Helm?

  
  
Posted 2 years ago

see error in apiserver

  
  
Posted 2 years ago