FYI, I am training the model again, this time in a project which is not nested, just to rule out any problems caused by nested projects.
I checked the apiserver.log file in /opt/clearml/logs and this appears to be the related error when I try to publish an experiment:
` [2021-06-07 13:43:40,239] [9] [ERROR] [clearml.service_repo] ValidationError (Task:8a4a13bad8334d8bb53d7edb61671ba9) (setup_shell_script.StringField only accepts string values: ['container'])
Traceback (most recent call last):
  File "/opt/clearml/apiserver/bll/task/task_operations.py", line 325, in publish_task
    raise ex
  File "/opt/clearml/apiserver/bll/task/task_operations.py", line 301, in publish_task
    task.save()
  File "/usr/local/lib/python3.6/site-packages/mongoengine/document.py", line 392, in save
    self.validate(clean=clean)
  File "/usr/local/lib/python3.6/site-packages/mongoengine/base/document.py", line 450, in validate
    raise ValidationError(message, errors=errors)
mongoengine.errors.ValidationError: ValidationError (Task:8a4a13bad8334d8bb53d7edb61671ba9) (setup_shell_script.StringField only accepts string values: ['container'])

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/clearml/apiserver/service_repo/service_repo.py", line 277, in handle_call
    ret = endpoint.func(call, company, call.data_model)
  File "/opt/clearml/apiserver/services/tasks.py", line 1131, in publish_many
    ids=request.ids,
  File "/opt/clearml/apiserver/bll/util.py", line 122, in run_batch_operation
    results.append((_id, func(_id)))
  File "/opt/clearml/apiserver/bll/task/task_operations.py", line 329, in publish_task
    task.save()
  File "/usr/local/lib/python3.6/site-packages/mongoengine/document.py", line 392, in save
    self.validate(clean=clean)
  File "/usr/local/lib/python3.6/site-packages/mongoengine/base/document.py", line 450, in validate
    raise ValidationError(message, errors=errors)
mongoengine.errors.ValidationError: ValidationError (Task:8a4a13bad8334d8bb53d7edb61671ba9) (setup_shell_script.StringField only accepts string values: ['container']) `
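For anyone wanting to pull entries like this out of the log without scrolling, here is a minimal Python sketch. It assumes every log entry starts with the `[timestamp] [pid] [LEVEL] [logger] message` layout seen in the line above; that layout is inferred from this one excerpt, and `find_errors` is a hypothetical helper name, not part of ClearML.

```python
import re
from pathlib import Path

# Assumed entry layout, inferred from the excerpt above:
# [2021-06-07 13:43:40,239] [9] [ERROR] [clearml.service_repo] message...
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+\[(?P<pid>\d+)\]\s+"
    r"\[(?P<level>[A-Z]+)\]\s+\[(?P<logger>[^\]]+)\]\s+(?P<message>.*)$"
)


def find_errors(lines):
    """Return (timestamp, logger, message) for each ERROR-level entry.

    Lines that do not match the assumed layout (for example traceback
    frames) are skipped.
    """
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == "ERROR":
            hits.append(
                (
                    match.group("timestamp"),
                    match.group("logger"),
                    match.group("message"),
                )
            )
    return hits


if __name__ == "__main__":
    # Log location as mentioned in the thread.
    log_path = Path("/opt/clearml/logs/apiserver.log")
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            for timestamp, logger, message in find_errors(handle):
                print(timestamp, logger, message)
```

This only lists the first line of each error; the traceback that follows it still has to be read in the log itself.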
Hi VivaciousPenguin66 , this looks like an internal error indeed...
Can you check the browser's "Developer Tools/Network" section and see the exact API call that's failing? (including the payload sent in the request)
Are you using a self-hosted server? If so, what's the version? I have a feeling you're running v1.0.1 or v1.0.0 (as the "newer version" message at the top indicates). This error looks exactly like what was fixed in v1.0.2... (see https://clear.ml/docs/latest/docs/release_notes/ver_1_0#clearml-server-102 )
Hi SuccessfulKoala55
Thanks for the input.
I was actually about to grab the new docker_compose.yml and pull the new images.
Weirdly it was working before, so what's changed?
I don't believe I've updated the agents or the clearml sdk on the experiment submission vm either.
I will definitely update the server now, and report back.
Well, I'm not sure, but this error is related to a null value sent as the task's container field (which should be perfectly legal, of course)
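To illustrate this class of failure, here is a small pure-Python sketch of a strict string check rejecting a null. It is not the ClearML server or mongoengine code. It assumes, based on the error text, that the null ended up in `setup_shell_script` under `container`; `validate_string_fields` and its `allow_none` flag are hypothetical names made up for the example.

```python
def validate_string_fields(document, path=(), allow_none=False):
    """Return dotted paths of leaf values that are not strings.

    Nested dicts are walked recursively. With allow_none=True a null
    (None) leaf is accepted, which mirrors the behaviour described in
    the thread as "perfectly legal".
    """
    errors = []
    for key, value in document.items():
        here = path + (key,)
        if isinstance(value, dict):
            errors.extend(validate_string_fields(value, here, allow_none))
        elif value is None and allow_none:
            continue
        elif not isinstance(value, str):
            errors.append(".".join(here))
    return errors


# Hypothetical task payload: the container field carries a null value.
task = {"container": {"setup_shell_script": None}}

strict = validate_string_fields(task)
tolerant = validate_string_fields(task, allow_none=True)
print(strict)    # ['container.setup_shell_script']
print(tolerant)  # []
```

The strict call flags the null much like the validation error in the log, while the tolerant call accepts it.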
SuccessfulKoala55
Good news!
It looks like pulling the new clearml-server version has solved the problem.
I can happily publish models.
Interestingly, I was able to publish models before using this server, so I must have inadvertently updated something that has caused a conflict.